Continuing the discussion from Fallible replace_all for regular expressions:
I'm unsure when using closures in a Rust API is a good choice or not. I don't have a concrete problem right now, but I've repeatedly stumbled upon a few cases like this one regarding the regex
crate but also some cases in my own crates (e.g. that one but also some other proprietary ones) where using closures caused or would cause some limitations, in particular:
- the
?
operator won't work, - an
await
expression isn't allowed,
unless certain measures are taken, e.g. the closure wrapping its return value in a Result
or BoxFuture
.
I also expect to run into these issues again in the future.
To get away from a particular use case and to be able to discuss this issue in a more general fashion, let me consider the following toy example (Playground for this and all following examples):
const KEYS: [&str; 2] = ["one", "two"];
pub struct UsesCallback {
output: Vec<String>,
}
impl UsesCallback {
pub fn new<F: FnMut(&str) -> String>(callback: F) -> Self {
let output = KEYS.iter().copied().map(callback).collect();
Self { output }
}
pub fn into_inner(self) -> Vec<String> {
self.output
}
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let uses_callback = UsesCallback::new(|s| {
let first_char_option: Option<char> = s.chars().next();
// We can't `await` here.
//tokio::task::yield_now().await;
// We can't return an error here:
//let first_char: char = first_char_option.ok_or("empty string")?;
// We must panic on error:
let first_char: char = first_char_option.unwrap();
format!("{s} begins with '{first_char}'")
});
assert_eq!(
uses_callback.into_inner(),
vec!["one begins with 'o'", "two begins with 't'"]
);
}
Here, the UsesClosure::new
function expects a closure to map a &str
to a String
. It's not possible to use await
or return an error there.
We could allow async
by providing a new_async
method and using pinned, boxed futures:
use futures::future::BoxFuture;
impl UsesCallback {
pub async fn new_async<F>(mut callback: F) -> Self
where
F: FnMut(&str) -> BoxFuture<String>,
{
let mut output = Vec::with_capacity(KEYS.len());
for key in KEYS.iter().copied() {
output.push(callback(key).await);
}
Self { output }
}
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let uses_async_callback = UsesCallback::new_async(|s| {
Box::pin(async move {
let first_char_option: Option<char> = s.chars().next();
// Using `await` is possible here:
tokio::task::yield_now().await;
// We can't return an error here:
//let first_char: char = first_char_option.ok_or("empty string")?;
// We must panic on error:
let first_char: char = first_char_option.unwrap();
format!("{s} begins with '{first_char}'")
})
})
.await;
assert_eq!(
uses_async_callback.into_inner(),
vec!["one begins with 'o'", "two begins with 't'"]
);
}
As we can see, failing (other than panicking) is still not possible here. If we need that, we could instead do:
impl UsesCallback {
pub fn try_new<F, E>(mut callback: F) -> Result<Self, E>
where
F: FnMut(&str) -> Result<String, E>,
{
let mut output = Vec::with_capacity(KEYS.len());
for key in KEYS.iter().copied() {
output.push(callback(key)?);
}
Ok(Self { output })
}
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let uses_fallible_callback = UsesCallback::try_new(|s| {
let first_char_option: Option<char> = s.chars().next();
// We can't `await` here.
//tokio::task::yield_now().await;
// Returning an error is possible here:
let first_char: char =
first_char_option.ok_or("empty string")?;
Ok::<_, Box<dyn std::error::Error>>(format!(
"{s} begins with '{first_char}'"
))
})?;
assert_eq!(
uses_fallible_callback.into_inner(),
vec!["one begins with 'o'", "two begins with 't'"]
);
}
Finally, if we want to be able to do both, we'd have to do something like:
impl UsesCallback {
pub async fn try_new_async<F, E>(mut callback: F) -> Result<Self, E>
where
F: FnMut(&str) -> BoxFuture<Result<String, E>>,
{
let mut output = Vec::with_capacity(KEYS.len());
for key in KEYS.iter().copied() {
output.push(callback(key).await?);
}
Ok(Self { output })
}
}
let uses_async_fallible_callback =
UsesCallback::try_new_async(|s| {
Box::pin(async move {
let first_char_option: Option<char> = s.chars().next();
// Using `await` is possible here:
tokio::task::yield_now().await;
// Returning an error is possible here:
let first_char: char =
first_char_option.ok_or("empty string")?;
Ok::<_, Box<dyn std::error::Error>>(format!(
"{s} begins with '{first_char}'"
))
})
})
.await?;
assert_eq!(
uses_async_fallible_callback.into_inner(),
vec!["one begins with 'o'", "two begins with 't'"]
);
}
Is this idiomatic yet?
Maybe it's better to avoid closures here entirely, and try something like the following?
use std::cell::RefCell;
pub struct UsesIterator {
output: Vec<String>,
}
impl UsesIterator {
pub fn into_inner(self) -> Vec<String> {
self.output
}
}
pub struct UsesIteratorBuilder {
output: RefCell<Vec<String>>,
}
impl UsesIteratorBuilder {
pub fn new() -> Self {
Self {
output: RefCell::new(Vec::with_capacity(KEYS.len())),
}
}
pub fn iter(&self) -> impl Iterator<Item = &str> {
KEYS.iter().copied()
}
pub fn push(&self, s: String) {
let mut output = self.output.borrow_mut();
if !output.len() < KEYS.len() {
panic!("pushed too many");
}
output.push(s);
}
pub fn build(self) -> UsesIterator {
let output = self.output.into_inner();
if output.len() != KEYS.len() {
panic!("pushed too few");
}
UsesIterator { output }
}
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let builder = UsesIteratorBuilder::new();
for s in builder.iter() {
let first_char_option: Option<char> = s.chars().next();
// Using `await` is possible here:
tokio::task::yield_now().await;
// Returning an error is possible here:
let first_char: char =
first_char_option.ok_or("empty string")?;
builder.push(format!("{s} begins with '{first_char}'"));
}
let uses_iterator = builder.build();
assert_eq!(
uses_iterator.into_inner(),
vec!["one begins with 'o'", "two begins with 't'"]
);
Ok(())
}
(Playground for all examples in this post)
Not sure if the last example could be written cleaner, without using a RefCell
.
Some questions:
- What are your experiences with these problems?
- Do you try to avoid closures for these (or other) reasons in your APIs?
- Do you often end up duplicating code for
async
and non-async interfaces? - Do you often end up duplicating code for fallible and infallible interfaces?
- Have you experienced trouble with using third party APIs that don't take the
async
or fallible cases into account? - Is it idiomatic to work around closures and try to program less functional-style and more procedural-style in order to not get into trouble with
?
andasync
? E.g. by usingVec::with_capacity
and afor
loop thatpush
es items instead of using functional-stylemap
ping? Should closures perhaps even be avoided in API design (unless you're sure they cover all use cases)? - What's current best practice in regard to considering
async
or fallible code when designing an API? - What are the future prospects in this matter?