Closures in API design – theoretical limitations and best practices?

But like I said:

For example, consider the very common Iterator::map method. It takes an FnMut closure, that can have any side effects (but not alter control flow of the map method). The aforementioned "feature" of disallowing an early return is not really something that's needed/wanted/nice here. Yet it's very very idiomatic to use the map method. A lot of people recommend functional-style code. But you run into problems:

fn functional_style() -> Option<Vec<u8>> {
    Some((20u8..=25)
        .map(|i| i.checked_mul(10).unwrap() /* can't use `?` here */ )
        .collect()
    )
}

(Playground with comparison to procedural style code)

edit: @H2CO3 shows that there is no problem in this particular case, see the response below.

In other cases, this limitation is desired, e.g. with the replace_with crate or when we have scoped threads. But I feel like in the vast majority of cases (99% ?) where closures/callbacks are used in an API, we do not want these limitations (such as shown in the case of map).

That said, I don't want to say map should generalize over any sort of "effect" or be async, etc. But it makes me wonder if recommendations I stumble upon here in this forum to write some things functional style are really a good idea. What if using map works fine, and then later I need to refactor my code to do an early return in some cases? The same holds when I offer an API that expects a callback (which led me to ask the question in the other thread about regex not supporting fallible replacements, which is a practical issue [1] and not a theoretical one).

I thought about the term "effects" too, but I thought that this also covers mutation? E.g. the Monad in Haskell is often used to describe I/O operations (or other, possibly more limited, side effects). That's something any FnMut closure can do in Rust. We don't need Monads for it.

So maybe "effects" is too broad a term here (unless we are in a functional programming language, which Rust is not)?

I wonder if I ran into exactly this when I attempted to get around the latter by deciding to simply return a Vec of Futures here (instead of having to use some sort of asynchronous iterator, which I didn't even know about… oh wait, that's what was previously called Stream in the futures crate, right?).

Well, I don't know if I want the language to generalize over and propagate effects. I would like to avoid having to duplicate code :sweat_smile:. Or having to use ugly solutions like that one. Especially when dealing with async (e.g. in the multivariate_optimization crate, which is mostly an experiment for now), I feel like I end up with a lot of overhead/duplication because of async. I would like to gain knowledge to not run into these issues all the time.

I don't understand this. Could you rephrase it? Note that while I am fascinated by functional programming, I don't have that much knowledge in or experience with it.

Yeah, I have been doing that in the past. You're saying: Just ignore being async from time to time, and let the caller deal with it using threading? Maybe it could help me get around a lot of headache involving async Rust. But… hmmmm… not sure if it's that nice. But I'll definitely keep that in mind as an escape hatch when things get too complex.

I don't think I'm "more" interested in the theory than in practice. [2] I am interested in both. The reasons for writing this post are practical issues I run into (repeatedly). These can be as simple as being unable to use the ? operator in a closure passed to .map, or end up being very complex, such as refactoring hundreds of lines of code because I figured that something I wanted to do won't work well with async. Nonetheless, I am also interested in the theory behind these problems because I like to extend my knowledge in that matter.

Why can't I be interested in both? I tried to express that by writing "theoretical limitations and best practices" in the title.

I know I rather tend to use generics or abstract data structures too often, and part of my own learning process is to abstain from trying to be too generic. It's good to be reminded from time to time that sometimes solutions like

  • spawn_blocking instead of pure async,
  • Vec<T> instead of Box<[T]> instead of impl IntoIterator<Item = T>,
  • String instead of Cow<str> instead of Borrow<str>,
  • [T] instead of Index<T>,

cause a lot fewer problems, especially as sometimes traits in the std library seem to be implemented in a way that doesn't always work well, e.g. in regard to AsRef or in regard to Index.

I still enjoy exploring generics, mostly to learn more about Rust and its limitations (and possibly also about some of its design flaws, and how to work around them and/or to avoid them!).

Maybe bringing up generics on this forum sometimes has the same effect as including it in your source code. :see_no_evil:

I mean seriously: I hope it's okay I ask these questions. I don't think it should be responded to in such a negative way as it sometimes happens here. Isn't this forum big enough for both practical issues and theory? Or should I really stop asking these questions here? Where should I go then?


  1. Sure, the API can offer a "method" that consists of copy & pasting code from the docs to solve this. This might be okay in case of regex::Regex::replace_all as it's "just" 16 lines of redundant code to be copy & pasted (assuming you don't need any optimizations). But regex is just an example here. ↩︎

  2. Maybe I'm interested "more" in the theory than many other people. ↩︎

You are holding it totally wrong.

It's not that there's an "undesirable restriction of control flow". The right question to ask is exactly the opposite: why should there be an additional feature to regulate control flow here? Iterators are already lazy, and the fallibility handling is built into the container. Your "ugly" functional example is perfectly expressible as

fn functional_style() -> Option<Vec<u8>> {
    (20u8..=25)
        .map(|i| i.checked_mul(10))
        .collect()
}

And this is the correct way to do it: whoever collects the iterator has the opportunity to simply stop calling next() if that's what s/he pleases, and this is exactly what Option and Result's FromIterator impls do. There's absolutely zero need to run additional circles in order to do exactly the same thing with different syntax.
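To make the short-circuiting concrete, here is a minimal sketch using only std. The function names (`times_ten_all`, `parse_all`, `calls_until_overflow`) are made up for illustration. The last function counts how often the closure runs, to show that collecting into an Option stops pulling items once the first None appears:

```rust
use std::num::ParseIntError;
use std::ops::RangeInclusive;

// Collecting an iterator of `Option<u8>` into `Option<Vec<u8>>`:
// the first `None` makes the whole result `None`.
fn times_ten_all(range: RangeInclusive<u8>) -> Option<Vec<u8>> {
    range.map(|i| i.checked_mul(10)).collect()
}

// The same pattern works for `Result`: the first `Err` is returned.
fn parse_all(inputs: &[&str]) -> Result<Vec<u8>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<u8>()).collect()
}

// Counts how many times the closure is called before `collect` gives up.
// 26 * 10 overflows a `u8`, so the items 27..=30 are never looked at.
fn calls_until_overflow() -> (Option<Vec<u8>>, u32) {
    let mut calls = 0u32;
    let result: Option<Vec<u8>> = (20u8..=30)
        .map(|i| {
            calls += 1;
            i.checked_mul(10)
        })
        .collect();
    (result, calls)
}
```

So the "early return" happens in the collector rather than in the closure, which is why the closure itself doesn't need `?`.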

TL;DR: learn the standard library, don't blame the language.

Interesting, I didn't know that. (Playground)

I suspect it's this impl in std:

impl<A, V> FromIterator<Option<A>> for Option<V>
where
    V: FromIterator<A>,

This might come in handy in a couple of cases. Thanks for that hint.

I guess it only works though, because collect is very generic. So here, generics (in std) help to avoid friction.
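For cases where the result isn't a collection, std offers a similar short-circuiting tool in `Iterator::try_fold`. A small sketch (the function name `checked_product` is just an illustration, not something from this thread):

```rust
// Multiplies all values, giving up with `None` on the first overflow.
// `try_fold` stops iterating as soon as the closure returns `None`.
fn checked_product(values: &[u8]) -> Option<u8> {
    values
        .iter()
        .try_fold(1u8, |acc, &v| acc.checked_mul(v))
}
```

As with `collect`, the closure only returns a value; the decision to stop early is made by the adapter that drives the iteration.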


Maybe the example was a bad one then. But I would still run into problems if I attempted to do anything async, right? (edit: see update below) And in case of Regex::replace_all, the problem of lack of fallibility does exist.

Perhaps I still have no good "feeling" in regard to when a closure/callback does cause problems, and when it doesn't.


Update:

Regarding .map and async, there remains the question of the ordering of realizing the futures. I guess that's what FuturesOrdered and FuturesUnordered are for. So it is possible to write the example functional style if you use these. It requires a bit of Option/Result gymnastics though, it seems:

use futures::stream::{FuturesOrdered, TryStreamExt};
use tokio::task::yield_now;

async fn functional_style() -> Option<Vec<u8>> {
    FuturesOrdered::from_iter((20u8..=25).map(|i| async move {
        yield_now().await;
        i.checked_mul(10).ok_or(())
    }))
    .try_collect()
    .await
    .ok()
}

(Playground)

I think you in particular are experiencing some push-back because (insofar as my memory recalls accurately) many of your threads start out sounding like an ordinary “practical issue”, and then as the question gets refined, turn into something more like “I want to make the code I am writing as uncompromisingly generic as possible”. Many people do not consider that really a “practical” matter; such genericism often trades off with usability, comprehensibility of code and of errors, compilation time, etc.

It's also not taken well when a “practical” discussion turns into “oh well, looks like Rust's design is broken” — while of course there are in fact flaws, there are a lot of threads where once such an observation has been made, the person making it doesn't even want to talk about alternatives, and it gets tiresome, especially when the design in question is a tradeoff which provides some other benefit and not simply a mistake in hindsight.

I don't mean to say that you have done anything wrong or that there should not be a place for the discussions you want to have; rather, I want to highlight a possible source of tension and how the topics you are interested in can be perceived from other perspectives. Thoughtfully framing your posts (and separating the “theory” thread from the “practical” thread, as you did in this case) can help.

I often like to make my code generic where possible. I don't think that's bad per-se. (Or at least it shouldn't be a bad thing to do.) And when I experience friction, I like to understand why there is friction or what's the underlying cause (and when and why to refrain from being generic).

During that process, I sometimes discover that there are some flaws in the language or in std. When I find them, I try to submit bug reports. I consider that to be pretty constructive. Though, of course, I may also make mistakes in that process (like anyone can do), both in regard to technical matters as well as organizational or communication errors.

But thank you for sharing the perception.

Well, I would appreciate if frustrations with other people aren't projected onto me. I like to discuss alternatives and learn.

I thought that having "theoretical limitations" in the subject was pretty clear. But maybe not clear enough.

Anyway, thanks for your response. I honestly appreciate your time and effort both in regard to the technical subject as well as the meta questions regarding the forum. :pray:

I finally looked into that. Looks interesting, even if not nearly as production ready as Rust, it seems. But it might help me to better understand certain things by exploring that language. Aside from looking like fun. Thank you.

ouroboros goes a step further and also has versions for send vs. not send for the async cases. If you use it for your self-referencing structs, you’ll earn a total of 9 constructors and 6 builder structs :exploding_head:

I don’t mean to imply this observation has anything to do with being truly idiomatic; maybe it’s more about being truly general.

Meta: Just to clarify once more, I'm interested both in the theoretical aspects of this question and I have particular problems with code I'm writing [1] (I'll get back to that in a sec). I would like to discuss both aspects, and I don't feel like it's (always) appropriate to open two or three threads on the same topic just to highlight different aspects of that topic. I do understand if not everyone likes to discuss everything, and in that case, I would like to emphasize that it's not necessary to engage in a theoretical discussion if one doesn't want to. There is no need to emphasize that a discussion is "useless", "always leads to the same result", etc. For my part, even having more than a year of experience with Rust, I still consider myself being a beginner, and I'm in the process of learning.

When I say that a language construct has an "unnecessary constraint" etc, then I do not (always) want to imply that the language should be changed. I merely share the observation that in certain contexts(!) a constraint seems more of a hindrance than a help, while knowing that in other cases it may be the opposite. I will try to emphasize this in the future to avoid misunderstandings. Sometimes, but not always, this can lead to bug reports or PRs. In the case discussed here, I did not want to imply that Rust's closures should transparently forward effects or that the language should be changed in any manner. I did, however, try to identify possible limitations that you may run across in everyday programming that may be useful to keep in mind when programming idiomatic Rust.

In the future, I will also try to take more care to give context for my posts, even if that may be redundant eventually, and to note when there is a shift in interest as a discussion goes on. I feel like a forum like this should offer space for both, and there is no need to tell other people what to discuss or not to discuss as long as it's not clearly out of scope. For now, I'm interested both in the theoretical and the practical aspects.

I apologize for any misunderstandings in the past.


That said, I would like to add a few more things to this topic.

The most recent issue I ran into (regarding closures) is this one: multivariate_optimization::Solver. I already tried to reduce the code that exhibits effects (async in that case) to two methods:

and to make the new method indifferent (generic) in that matter. I'm not sure if that was or is a good design.

Either way – now when actually using that code, I noticed that creating a specimen can fail (in my particular case it will involve running a separate process, namely NEC2). So I would need to provide at least two more methods for extend_specimens and replace_worst_specimens (and maybe four, if I want to handle the fallible sync and async cases).

And if I care about Send and !Send futures, this will get worse.

So I thought about @kpreid's wording "loosely coupled" here:

In my case, I consider that the caller should be responsible for converting a specimen's parameters into the specimen, as shown here in the drafted documentation example:

let constructor = |params: Vec<f64>| {
    let cost = rastrigin(&params);
    BasicSpecimen { params, cost }
};
let into_specimens =
    |x: Vec<Vec<f64>>| x.into_par_iter().map(constructor).collect::<Vec<_>>();

// `into_specimens` may be replaced with another mechanism, e.g. an `async` one
// or, if no parallel computation is desired, one can simply use:
// let into_specimens = |x: Vec<Vec<f64>>| x.into_iter().map(constructor);

let mut solver = Solver::new(search_space);
solver.set_speed_factor(0.5);

let initial_params = solver.random_params(POPULATION);
solver.extend_specimens(into_specimens(initial_params));
/* … */

That is done in the into_specimens closure (which is actually a function as it doesn't capture any environment).

I would say that this approach is more "loosely coupled", but I'm not sure if I'm really happy with it, or what to do.
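To illustrate what the loosely coupled variant buys in the fallible case, here is a simplified sketch. `Specimen`, `Solver`, `try_construct` and `populate` are stand-ins I made up for this post, not the actual multivariate_optimization API. The point is that the solver only ever sees finished specimens, so the caller can handle failure with plain `?`:

```rust
#[derive(Debug, PartialEq)]
struct Specimen {
    params: Vec<f64>,
    cost: f64,
}

// Hypothetical, stripped-down solver: it never calls back into user code.
struct Solver {
    specimens: Vec<Specimen>,
}

impl Solver {
    fn new() -> Self {
        Solver { specimens: Vec::new() }
    }

    fn extend_specimens<I: IntoIterator<Item = Specimen>>(&mut self, iter: I) {
        self.specimens.extend(iter);
    }
}

// Fallible construction (standing in for e.g. running an external process).
fn try_construct(params: Vec<f64>) -> Result<Specimen, String> {
    if params.iter().any(|p| p.is_nan()) {
        return Err("parameter is NaN".to_string());
    }
    let cost: f64 = params.iter().map(|p| p * p).sum();
    Ok(Specimen { params, cost })
}

// The caller owns the control flow: collect into `Result`, bail out with `?`,
// and only hand over the specimens if all of them could be created.
fn populate(solver: &mut Solver, all_params: Vec<Vec<f64>>) -> Result<(), String> {
    let specimens: Vec<Specimen> = all_params
        .into_iter()
        .map(try_construct)
        .collect::<Result<_, _>>()?;
    solver.extend_specimens(specimens);
    Ok(())
}
```

With this shape, the solver needs no try_… variants at all, at the price of the caller having to know when specimens are expected.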


@steffahn I feel like duplicating all the methods like in ouroboros isn't a very nice way to go. (Though maybe macros could help out? Not sure if that's better though.) Maybe it's the current "best practice" though.

Looking at this from a language p.o.v.: Perhaps that's why "keyword generics" came up in the first place (even though they seem to be controversial, I guess)? What I also don't understand is if (or how) they are supposed to solve the effect of exceptions? That's not done via keywords in Rust but using Result or, in the more generic case with RFC #3058, with the yet unstable Try trait and the (already stable) std::ops::ControlFlow type.
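As a small illustration of the stable part: std::ops::ControlFlow can already be used with `Iterator::try_for_each` on stable Rust to express an early exit that isn't an error. A sketch (the function `first_negative` is made up for this example):

```rust
use std::ops::ControlFlow;

// Returns the first negative value, stopping the iteration at that point.
fn first_negative(values: &[i32]) -> Option<i32> {
    let flow = values.iter().try_for_each(|&v| {
        if v < 0 {
            // Early exit, carrying the value out of the loop.
            ControlFlow::Break(v)
        } else {
            ControlFlow::Continue(())
        }
    });
    match flow {
        ControlFlow::Break(v) => Some(v),
        ControlFlow::Continue(()) => None,
    }
}
```

Being generic over Option, Result and ControlFlow in one's own API is the part that still needs the unstable Try trait.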

In this context, I would also like to note that the solution in regard to async operations in closures passed to Iterator::map doesn't handle the Option case (opposed to the Result case) very well, because TryStreamExt is only implemented for TryStream, which is only implemented for S where S: Stream<Item = Result<T, E>> + ?Sized. Thus the .ok_or() and .ok() workaround here:


Theoretical questions aside (though I appreciate any answers on that matter as well), what do I do in my concrete case in regard to the multivariate_optimization crate? Try "loosely coupling", or should I rather duplicate the methods and provide try_…_async, try_…_async_send, etc. variants? Any other option?

Side note, slightly off-topic …

… but maybe of interest: For the fun of it, I experimented with Monads in the past, though I would like to emphasize that I do not see those as a solution for this problem (or effect handling in general). It's been more a thought experiment. I ran into a couple of problems, like here and there, and even hung the compiler (#113359). In either case, this is far from idiomatic Rust and I don't want to imply that there is any practical use for those experiments.


P.S./Meta:

I just noticed that I mixed up theory and my practical issue, disregarding your advice, in this post. :face_with_diagonal_mouth: Is that really such a big problem? (Not wanting to imply you said it was a big problem.) Should I have opened yet another (third) thread on this (i.e. the multivariate_optimization issue)? I don't think that would be a good way to go. :man_shrugging:


  1. Actually this has happened more than once already, but I don't remember all the cases and/or scenarios. Maybe it didn't happen that often yet. I don't know/remember. One other recent case was the issue regarding regex that I linked in my OP, but that's not the one I'm thinking of at this moment. ↩︎

One possible practical compromise is to offer two choices: one which is the above and one which is minimally flexible, just offering the Iterator::map()-style callback:

solver.create_specimens(constructor);

Then simple applications can have a simple interface, and ones which want some refinement can use the more complex, possibly fragile one. (Fragile in that it opens many questions of what happens if it is used differently than the intended pattern; possibly those have simple answers in the particular case.)
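A rough sketch of how the two entry points could sit side by side. All names except create_specimens and extend_specimens are hypothetical (`with_params`, `take_params` and the `Specimen` struct are invented here), and the real crate will differ:

```rust
#[derive(Debug, PartialEq)]
struct Specimen {
    params: Vec<f64>,
    cost: f64,
}

// Hypothetical solver that already knows which parameters it wants evaluated.
struct Solver {
    pending_params: Vec<Vec<f64>>,
    specimens: Vec<Specimen>,
}

impl Solver {
    fn with_params(pending_params: Vec<Vec<f64>>) -> Self {
        Solver { pending_params, specimens: Vec::new() }
    }

    // Flexible pair: the caller takes the parameters, builds the specimens
    // however it likes (parallel, async, fallible), and hands them back.
    fn take_params(&mut self) -> Vec<Vec<f64>> {
        std::mem::take(&mut self.pending_params)
    }

    fn extend_specimens(&mut self, specimens: impl IntoIterator<Item = Specimen>) {
        self.specimens.extend(specimens);
    }

    // Simple variant: an `Iterator::map`-style callback, no fallibility, no async.
    fn create_specimens(&mut self, constructor: impl FnMut(Vec<f64>) -> Specimen) {
        let params = self.take_params();
        self.extend_specimens(params.into_iter().map(constructor));
    }
}
```

The simple method is just a thin wrapper over the flexible pair, so there is little duplicated logic to maintain.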

Disclaimer: With this post, I want to explore the generic solution further. I do not want to imply this is idiomatic Rust, though it might be an interesting approach where boxing is too expensive. It might also be helpful to understand the value and/or limitations of certain unstable features better.

Do we really need pinning and boxing here? :innocent:

I tried to apply a technique I previously proposed for regex to the 4th toy example in my OP:

#![feature(
    closure_lifetime_binder,
    try_trait_v2,
    type_alias_impl_trait
)]

use std::future::Future;
use std::ops::{FromResidual, Try};

const KEYS: [&str; 2] = ["one", "two"];

mod internal {
    use std::future::Future;

    pub trait Callback<A, O>
    where
        Self: FnMut(A) -> <Self as Callback<A, O>>::Future,
    {
        type Future: Future<Output = O>;
    }

    impl<'a, C, F, A, O> Callback<A, O> for C
    where
        C: FnMut(A) -> F,
        F: Future<Output = O>,
    {
        type Future = F;
    }
}

pub struct UsesCallback {
    output: Vec<String>,
}

impl UsesCallback {
    pub async fn try_new_async<F, R1, R2>(mut callback: F) -> R2
    where
        F: for<'a> internal::Callback<&'a str, R1>,
        R1: Try<Output = String>,
        R2: Try<Output = Self> + FromResidual<<R1 as Try>::Residual>,
    {
        let mut output = Vec::with_capacity(KEYS.len());
        for key in KEYS.iter().copied() {
            output.push(callback(key).await?);
        }
        R2::from_output(Self { output })
    }

    pub fn into_inner(self) -> Vec<String> {
        self.output
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    type R<'a> = impl 'a
        + Future<Output = Result<String, Box<dyn std::error::Error>>>;
    let uses_async_fallible_callback =
        UsesCallback::try_new_async::<
            _,
            _,
            Result<_, Box<dyn std::error::Error>>,
        >(for<'a> |s: &'a str| -> R<'a> {
            async move {
                let first_char_option: Option<char> = s.chars().next();

                // Using `await` is possible here:
                tokio::task::yield_now().await;

                // Returning an error is possible here:
                let first_char: char =
                    first_char_option.ok_or("empty string")?;

                Ok::<_, Box<dyn std::error::Error>>(format!(
                    "{s} begins with '{first_char}'"
                ))
            }
        })
        .await?;
    assert_eq!(
        uses_async_fallible_callback.into_inner(),
        vec!["one begins with 'o'", "two begins with 't'"]
    );
    Ok(())
}

(Playground)
edit #1: renamed Callback::Output to Callback::Future to avoid confusion with Future::Output.
edit #2: defined internal::Callback more generic, which makes the code easier to read, in my opinion.

As we can see, using BoxFuture wasn't necessary after all! :astonished:

I wonder: Has this technique been used and/or described previously?

It does avoid boxing for the sole purpose of specifying the lifetime. But it comes at a high "syntactic price". I don't want to imply it's a good idea to do this (in most cases), but maybe it could be interesting to investigate this further, in order to get rid of BoxFuture (in the future :wink:) where it's simply used to do the lifetime handling (and is not really needed).

Technical question: Am I correct that this approach is Send/!Send agnostic?

The next step (if we truly want a generic solution here) would be to consider this:

I tried to be generic regarding fallibility as well by using Try and FromResidual (instead of Result) in combination with avoiding boxing the future. But the syntax complexity exploded in such a way that I couldn't solve it (yet). Update: Solution below:

#![feature(
    closure_lifetime_binder,
    try_trait_v2,
    type_alias_impl_trait
)]

use std::future::Future;
use std::ops::{FromResidual, Try};

const KEYS: [&str; 2] = ["one", "two"];

mod internal {
    use std::future::Future;

    pub trait Callback<A, O>
    where
        Self: FnMut(A) -> <Self as Callback<A, O>>::Future,
    {
        type Future: Future<Output = O>;
    }

    impl<'a, C, F, A, O> Callback<A, O> for C
    where
        C: FnMut(A) -> F,
        F: Future<Output = O>,
    {
        type Future = F;
    }
}

pub struct UsesCallback {
    output: Vec<String>,
}

impl UsesCallback {
    pub async fn try_new_async<F, R1, R2>(mut callback: F) -> R2
    where
        F: for<'a> internal::Callback<&'a str, R1>,
        R1: Try<Output = String>,
        R2: Try<Output = Self> + FromResidual<<R1 as Try>::Residual>,
    {
        let mut output = Vec::with_capacity(KEYS.len());
        for key in KEYS.iter().copied() {
            output.push(callback(key).await?);
        }
        R2::from_output(Self { output })
    }

    pub fn into_inner(self) -> Vec<String> {
        self.output
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    type R<'a> = impl 'a
        + Future<Output = Result<String, Box<dyn std::error::Error>>>;
    let uses_async_fallible_callback =
        UsesCallback::try_new_async::<
            _,
            _,
            Result<_, Box<dyn std::error::Error>>,
        >(for<'a> |s: &'a str| -> R<'a> {
            async move {
                let first_char_option: Option<char> = s.chars().next();

                // Using `await` is possible here:
                tokio::task::yield_now().await;

                // Returning an error is possible here:
                let first_char: char =
                    first_char_option.ok_or("empty string")?;

                Ok::<_, Box<dyn std::error::Error>>(format!(
                    "{s} begins with '{first_char}'"
                ))
            }
        })
        .await?;
    assert_eq!(
        uses_async_fallible_callback.into_inner(),
        vec!["one begins with 'o'", "two begins with 't'"]
    );
    Ok(())
}

(Playground)
All Playgrounds tested successfully with 1.73.0-nightly (2023-07-29 32303b219d4dffa447aa).

For obvious reasons, type inference can't handle this anymore, so there is an extra type annotation needed in main to specify which Try type we expect as the Output of the Future returned by try_new_async.

Actually the possibilities in Koka inspired me to try to find this generic solution above (many thanks for pointing me to that interesting language). My idea was that the part that does not require continuations can be handled by Try/FromResidual. The continuations [1] need to be handled by Futures/async though. As shown above, it seems to be possible to combine these in Rust, but it gets pretty ugly.

If I understand right, then my most recent variant of try_new_async is generic in regard to the Futures used (i.e. the Future can be Send or !Send and its type doesn't need to be named) and it is also generic in regard to the ControlFlow mechanism to handle an early exit.


  1. Actually we don't have real (or "arbitrary") continuations in Rust as the unnameable Futures won't be Clone + Unpin. See also this post above by @kpreid. ↩︎

I’ve come across the observation that usage of BoxFuture is a workaround solution to get async callbacks on stable Rust, multiple times, with the implications that the boxing and type erasure of course ought to not be necessary. Your code demonstrates the need for unstable features to make this work quite well, otherwise the compiler tends to complain when the callback is anything but a simple async fn. See e.g. here.

It’s interesting to see your code though; I’ve never experimented with approaches to make passing a (possibly capturing) closure work with unstable features, as you do in annotating the closure signature using type-alias-impl-trait.


Indeed avoiding the type erasure should also make it handle auto traits.

That looks pretty similar (or even almost identical?) to what I proposed in the above linked PR draft #1048 for regex: You make Fn(Mut/Once) a supertrait of an own trait with a blanket implementation. (Though you applied it to Futures already.)

Note that the linked PR draft does not require unstable features (you can use a coercing function to specify the closure's type). Maybe that could help getting rid of the TAIT (type-alias-impl-trait) in the above Playground too, but I haven't tested that.
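For readers who haven't seen the "coercing function" trick: it is an identity function whose only job is to carry a higher-ranked Fn bound, so that the closure's signature gets inferred from that bound. A minimal stable sketch for the synchronous case (`coerce` and `first_word_of_each` are names chosen for this example, not taken from the PR draft):

```rust
// Identity function; the bound tells the compiler that the closure must work
// for every lifetime 'a and that its output borrows from its input.
fn coerce<F>(f: F) -> F
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    f
}

fn first_word_of_each(lines: &[&str]) -> Vec<String> {
    // Without `coerce`, annotating this as `|s: &str| -> &str { … }` tends to
    // run into lifetime errors, because the two lifetimes aren't linked.
    let first_word = coerce(|s| s.split_whitespace().next().unwrap_or(""));
    lines
        .iter()
        .map(|&line| first_word(line).to_owned())
        .collect()
}
```

This works here because the return type `&'a str` can be written down in the bound, which is not the case for the unnameable future type.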

:smiley:

I think, I don’t see how that helps much, as the coerce function relies on being able to provide a more concrete Fn…(…) -> … bound that expresses the true signature of the closure to help out type inference. Whereas with the case of returning a future, you can never properly write the returned Future type, so writing a good signature for coerce is just as problematic (unless it’s a nameable future type, which is the whole point of using the BoxFuture workaround; or, apparently, as you demonstrated, unless TAIT is used).

You are probably right. I didn't fully understand your reasoning yet but I noticed that a coercing function probably wouldn't do anything else than the bound in the already existing method.

FWIW, it's possible to get rid of the closure_lifetime_binder feature though. (Playground)

We can fix this by using the unstable feature try_trait_v2_residual (#91285) and by forcing the "fallibility wrapper" (e.g. Result, Option, etc.) to be identical to the one used by the Future::Output of the return type of the closure:

    pub async fn try_new_async<F, R>(
        mut callback: F,
    ) -> <R::Residual as Residual<Self>>::TryType
    where
        F: for<'a> internal::Callback<&'a str, R>,
        R: Try<Output = String>,
        R::Residual: Residual<Self>,
        <R::Residual as Residual<Self>>::TryType:
            Try<Output = Self> + FromResidual<<R as Try>::Residual>,
    {
        let mut output = Vec::with_capacity(KEYS.len());
        for key in KEYS.iter().copied() {
            output.push(callback(key).await?);
        }
        <R::Residual as Residual<Self>>::TryType::from_output(Self {
            output,
        })
    }

(Playground) edit: Also got rid of unnecessary turbofish type annotation in main.

But… well…

The goal of my OP was to better understand the theoretical limitations of Rust's closures in API design as well as finding best practices to work and live with these limitations. After having shown in my previous post how a callback-based API could theoretically be modeled with (unstable) Rust such that it allows for fallibility and async, I would like to try to describe the nature of callbacks a bit better. I will attempt to do this to get a better understanding of the limitations first, and maybe (hopefully?) figure out best practices to deal with it.

Callbacks seem to be used when some code A initially calls some code B, and B wants or needs to have power over the control flow regarding parts of A.

This isn't just about (side-)effects: As Rust is a non-functional language, all functions can have side-effects, either because they are FnMut or because they can resort to interior mutability (e.g. using Arc<Mutex<_>>). And every function may also have I/O effects. But the interesting effects are related to control flow, where we can observe the following ones:

  • Aborting: Using callbacks or not, this is always possible, e.g. by panicking when there is no catch_unwind or by using the unstable always_abort (or by panicking in a drop handler, etc).
  • Diverging (non-termination): As we're Turing complete, this seems to also be always possible in Rust.
  • Early-exit (exceptions): If we want to exit early (possibly beyond a single function body), we have two ways to do this in Rust: unwinding (panicking and catching the panic with catch_unwind), or returning a Result or some other Try type.
  • Yielding (async, coroutines, generators, etc.): This gets modeled by returning a Future instead of the desired output value.
  • Arbitrary continuations: These don't exist in Rust. So nothing to worry about here.

Now there is potential friction when the code in A wants to use some of these effects while code of B is in the way on the stack.

  • Aborting and diverging doesn't cause any problems.
  • Early-exit can work in two ways. Using catch_unwind is usually not a viable option though. Instead, the API needs to wrap return values in a Result or some other Try type. Usually, a Result will be used, but this isn't generic! As shown in my previous post, the generic (but possibly syntactically horrible) way to express this would be to use the Try and Residual traits.
  • Yielding requires all involved functions and/or closures to return a Future, which is in many cases automatically achieved through the async keyword. Where we can't use an async fn but must use an async block expression, we can run into problems because the created Futures are unnameable except if they are boxed. This makes it hard (but not impossible, as also shown in the previous post) to capture a lifetime in the future's type. In most everyday code, we would use BoxFuture or LocalBoxFuture, which forces us to decide on whether we want to support (and demand) Futures that implement Send.
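For comparison with the unstable version from my previous post, this is roughly what the "everyday" stable variant with a boxed future looks like. It is only a sketch: `BoxFuture` is defined locally here (mirroring the alias from the futures crate), `try_collect_async` and `describe` are made-up names, and `block_on` is a toy executor that busy-polls and is only suitable for futures that are immediately ready:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Local alias; note the `Send` bound, which is exactly the decision we are forced to make.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

// Fallible async callback API: fixed to `Result` and to boxed `Send` futures.
async fn try_collect_async<F, E>(keys: &[&str], mut callback: F) -> Result<Vec<String>, E>
where
    F: for<'a> FnMut(&'a str) -> BoxFuture<'a, Result<String, E>>,
{
    let mut output = Vec::with_capacity(keys.len());
    for key in keys.iter().copied() {
        // Both `.await` and `?` work here, because the API was designed for them.
        output.push(callback(key).await?);
    }
    Ok(output)
}

// A callback whose future borrows from its argument; boxing makes the type nameable.
fn describe(key: &str) -> BoxFuture<'_, Result<String, String>> {
    Box::pin(async move {
        let first = key.chars().next().ok_or_else(|| "empty string".to_string())?;
        Ok::<String, String>(format!("{key} begins with '{first}'"))
    })
}

// Toy executor (assumption: the future never really has to wait for anything).
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}
```

Calling `block_on(try_collect_async(&["one", "two"], describe))` drives the whole thing. The cost compared to the unstable variant is one allocation per callback invocation, plus the hard-coded choices of `Send` and `Result`.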

So much for the theory. But what does that mean in practice?

Unfortunately [1], I think it has to be decided on a case-by-case basis. Where we don't really need the control flow management, we could try a "loosely coupled" approach by avoiding callbacks overall or only providing a callback-based API for the most simple cases (without fallibility or async support). This is basically what regex does and what @kpreid suggested here.

But sometimes the functionality that a crate provides actually is about the control flow. I.e. the main point of a crate is to implement algorithms that cause a certain complex control flow. In these cases, I see five options:

  1. Provide limited support for fallible and async: E.g. support none or only one of those; or only support Send but not !Send futures. This keeps API complexity low but also limits the capabilities of the API. Where necessary, bridges between sync and async (as already mentioned by @CAD97 here) can be employed. This is more tricky in regard to fallibility: Here we could use catch_unwind, but I feel like this is somewhat painful.
  2. Duplicate methods: For example, provide a foo, try_foo, foo_async, try_foo_async, foo_async_send, try_foo_async_send method, instead of just one foo method. I feel like in practice, we won't be generic in regard to the Try types and just use Result. This means that if we use some other type than Result, e.g. Option, we must work around that, as previously shown here.
  3. Offer copy&paste-able code: This is also what regex does here.
  4. Use macros: We could consider using macros instead of functions/methods to offer certain forms of control flow. It's not elegant but might be the least painful option in many cases.
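
As a rough illustration of the macro option, the sketch below uses a made-up macro `for_each_even!` (not from any existing crate). Because the macro expands to a plain `for` loop at the call site, `return` and `?` inside the body act on the surrounding function, which a closure passed to a method could not do.

```rust
/// Hypothetical macro: runs `$body` for every even number in `$range`.
/// It expands inline, so control flow in `$body` belongs to the caller.
macro_rules! for_each_even {
    ($x:ident in $range:expr => $body:block) => {
        for $x in $range {
            if $x % 2 == 0 $body
        }
    };
}

/// Early `return` from inside the macro body.
fn first_even_above(limit: u32) -> Option<u32> {
    for_each_even!(n in 0..100 => {
        if n > limit {
            return Some(n); // returns from `first_even_above`
        }
    });
    None
}

/// `?` inside the macro body propagates out of the surrounding function.
fn sum_even_tenfold() -> Option<u8> {
    let mut sum = 0u8;
    for_each_even!(n in 0u8..=6 => {
        sum = sum.checked_add(n.checked_mul(10)?)?;
    });
    Some(sum)
}

fn main() {
    assert_eq!(first_even_above(10), Some(12));
    assert_eq!(first_even_above(200), None);
    assert_eq!(sum_even_tenfold(), Some(120));
}
```

The same body would also be free to use `.await` if the surrounding function were `async`, since nothing in the expansion introduces a closure boundary.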

Edit: I forgot [2]:

  5. Be generic (not considered idiomatic in many cases, but I want to mention it at least): One could omit a foo method in the first place and only offer a (most generic) try_foo_async(_send) method. Again, bridging between sync and async is possible (though it requires an async executor, e.g. futures::executor::block_on, even if you don't want to be async at all). Wrapping an infallible result T in a Result<T, Infallible> should be relatively easy. Most tricky, I believe, would be the Send/!Send property of futures: one could either support only one of these (likely the Send variant), duplicate the API here, or use unstable features (TAITs, which seem to be needed, as pointed out before) to work around that. But being generic will still reduce ergonomics in many cases. Edit #2: Also, using async-only isn't truly generic, as it will come with a runtime overhead for the (bridged) sync case (which may or may not be solved with keyword generics in the future).
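
The "wrap an infallible result in `Result<T, Infallible>`" step can be sketched as follows. This covers only the fallibility axis, not async, and `try_map_all` / `map_all` are hypothetical names chosen for this example.

```rust
use std::convert::Infallible;

/// Hypothetical "most generic" entry point: the callback may fail,
/// and the first error aborts the whole operation.
fn try_map_all<T, U, E>(
    items: Vec<T>,
    mut f: impl FnMut(T) -> Result<U, E>,
) -> Result<Vec<U>, E> {
    let mut out = Vec::with_capacity(items.len());
    for item in items {
        out.push(f(item)?);
    }
    Ok(out)
}

/// Infallible convenience wrapper built on top of the fallible version.
fn map_all<T, U>(items: Vec<T>, mut f: impl FnMut(T) -> U) -> Vec<U> {
    match try_map_all(items, |x| Ok::<U, Infallible>(f(x))) {
        Ok(v) => v,
        // `Infallible` has no values, so this arm can never run.
        Err(never) => match never {},
    }
}

fn main() {
    assert_eq!(map_all(vec![1, 2, 3], |x| x * 2), vec![2, 4, 6]);

    // A caller with an `Option`-returning operation has to convert by hand.
    let ok = try_map_all(vec![20u8, 25], |i| i.checked_mul(10).ok_or("overflow"));
    assert_eq!(ok, Ok(vec![200, 250]));

    let err = try_map_all(vec![20u8, 26], |i| i.checked_mul(10).ok_or("overflow"));
    assert_eq!(err, Err("overflow"));
}
```

If a crate offered only `try_map_all`, every infallible caller would have to write the `Infallible` dance themselves, which is the loss of ergonomics mentioned above.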

TL;DR

It depends.


  1. I wish this weren't the case and that there were an elegant way to generically handle all effects (including control flow) in Rust, but I'm not optimistic it will happen or even could happen without breaking a lot of things or making things even more awkward. I feel like the existing solutions in Rust are already too complex to be well understood by the majority of programmers (or to be used by advanced programmers without creating syntactically horrible code). Adding things like Residual or keyword generics might make things even worse. Or not? I'm undecided. But consider how much headache Pin can cause already, e.g. in regard to projections. ↩︎

  2. How could I forget!? :sweat_smile: ↩︎


According to the announcement post of the Keyword Generics Initiative, fallibility is on their mind as well, and could result in new try or throws keywords. Unfortunately, keyword generics seem to be a long way off, if they ever come to pass.

const is another one, and one of the motivations for the keyword generics initiative. Interestingly, it is more a "subtraction" of capabilities offered by the base set of Rust, rather than an addition like async is. You might be interested in this blog post by Yoshua Wuyts, who is part of the initiative.

PS: I for one always enjoy reading these lengthy discussions veering into the theoretical. There is definitely no reason to respond negatively to sub-optimal examples of legitimate problems in the language...


For this point, if the async only exists for the callback and the callback doesn't use the asynchrony, the appropriate way to discharge the asynchrony would be .now_or_never(), which just polls a single time without even setting up the waker infrastructure. And perhaps .poll_once().now_or_never() if you want to avoid previously discarding an unfinished task if it happens to await.


Okay, thanks, that is interesting to know.

Speaking of keywords there is also the unsafe keyword which may be modeled as an effect too! The effect of "potential UB". :grin:

Content warning: modeling Rust's unsafe in Koka

Just for demonstration, we could model Rust's idea of "unsafe" as an effect danger in Koka:

effect danger 
  val danger-dummy : () // needed for syntactic reasons only

Then we could also implement an equivalent of Rust's unsafe { … } blocks like:

fun i-know-what-i-am-doing( action : () -> <danger|e> t ) : e t
  with val danger-dummy = ()
  action()

Finally we could define an unsafe fn like this (don't worry, I'll get back to Rust code in a sec):

fun foo( x : t ) : danger t
  x

Here, writing danger t instead of t corresponds to marking the function as unsafe in Rust.

And now we can write a main function in Koka, that looks like this:

fun main()
  with i-know-what-i-am-doing
  val u = [1, 2, 3];
  val v = map(u, fn(x) foo(2 * x))
  println(show-list(v, show))

Here, with i-know-what-i-am-doing corresponds to Rust's unsafe { … }. If we omit it, then the program won't compile.

We can see that Koka allows calling the unsafe/dangerous function in the closure passed to map. Thus map will "forward" the handling of unsafe/danger. It acts transparently in that regard.


Okay, so much about Koka, let's get back to Rust.


Interestingly, the previously cited announcement post regarding "keyword generics" speaks about unsafe Rust in a side note at the end:

We sometimes joke that Rust is actually 3-5 languages in a trenchcoat. Between const rust, fallible rust, async rust, unsafe rust - it can be easy for common APIs to only be available in one variant of the language, but not in others.

Do closure boundaries act transparently in regard to unsafe? Surprisingly (or unsurprisingly?) they do:

/// # Safety
///
/// * You need to know what you're doing.
pub unsafe fn foo<T>(x: T) -> T { x }

fn main() {
    // SAFETY: I know what I'm doing.
    let u: Vec<i32> = unsafe {
        let v = foo(vec![1, 2, 3]);
        v.into_iter().map(|x| {
            foo(2 * x) // no extra `unsafe` keyword needed here
        }).collect()
    };
    assert_eq!(u, vec![2, 4, 6]);
}

Edit: Admittedly, this only works because unsafe { … } simply operates on the syntactic level, and not because map or closures do something magic in regard to effects here.

As a reminder, we won't be able to use .await across such a boundary:

pub fn unasync<T>(x: T) -> T::Output
where
    T: std::future::Future,
{
    futures::future::FutureExt::now_or_never(x).unwrap()
}

/// This method is `async`.
pub async fn bar<T>(x: T) -> T { x }

fn main() {
    let a: Vec<i32> = unasync(async {
        let v = bar(vec![1, 2, 3]).await;
        v.into_iter().map(|x| {
            // This does not work:
            bar(2 * x).await
        }).collect()
    });
    assert_eq!(a, vec![2, 4, 6]);
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0728]: `await` is only allowed inside `async` functions and blocks
  --> src/main.rs:16:24
   |
14 |         v.into_iter().map(|x| {
   |                           --- this is not `async`
15 |             // This does not work:
16 |             bar(2 * x).await
   |                        ^^^^^ only allowed inside `async` functions and blocks

For more information about this error, try `rustc --explain E0728`.
error: could not compile `playground` (bin "playground") due to previous error

To solve this, we need something like block_on (or in this case something that uses now_or_never to "unasync" an inner async block) inside the closure. (Full Playground with both examples).
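
A minimal sketch of that workaround follows. It is not the code from the linked Playground. To keep it free of external crates, `unasync` is reimplemented here with only the standard library (a single poll with a do-nothing waker), as a stand-in for `now_or_never(..).unwrap()`. `NoopWake` and `doubled` are names invented for this example.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; sufficient when we only ever poll once.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Std-only stand-in for `now_or_never(x).unwrap()`:
/// polls the future exactly once and panics if it is not ready.
pub fn unasync<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => value,
        Poll::Pending => panic!("future was not ready on first poll"),
    }
}

/// This method is `async`.
pub async fn bar<T>(x: T) -> T {
    x
}

fn doubled() -> Vec<i32> {
    unasync(async {
        let v = bar(vec![1, 2, 3]).await;
        v.into_iter()
            // The closure itself is not `async`, so we open a new async
            // block inside it and discharge it right away.
            .map(|x| unasync(async { bar(2 * x).await }))
            .collect::<Vec<i32>>()
    })
}

fn main() {
    assert_eq!(doubled(), vec![2, 4, 6]);
}
```

This only works because `bar` never actually suspends. If the inner future returned `Pending`, `unasync` would panic, and a real executor such as `block_on` would be needed instead.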

To summarize, unsafe could be considered an effect as well. This is merely a theoretical observation without any real-life impact, though: as shown in the above Playground, Rust behaves differently regarding unsafe when compared to async or fallibility (Result<T, E>, Try, etc.). But I thought it was curious nonetheless.

