Anouncing async_executors, a building block for executor agnostic libraries

I just released async_executors 0.1.0 and even wrote a blog post about it:

https://blog.wnut.pw/2020/02/25/anouncing-async_executors-a-building-block-for-executor-agnostic-libraries/

5 Likes

Nice! I agree that runtime agnosticism is a desirable library property :slightly_smiling_face:

A few remarks:

  • pub struct TokioCt
    {
        // It's paramount that this is !Send, currently thanks to this Rc, so don't remove that.
    	// See the impl of LocalSpawn as for why.
        //
    	pub(crate) exec  : Rc<RefCell< Runtime >> ,
        pub(crate) handle: TokioRtHandle          ,
    }
    

    If you want to make TokioCt be !Send, then I suggest you use a specific _not_send: PhantomData<Rc<()>> field, just to avoid mixing implementation features of the exec field with properties (or lack thereof) of the TokioCt struct. That being said, kudos for using a static assertion to guard against that property failing :+1:

  • Regarding unwind "safety", since not many people tackle this issue, it would be great if you could say that the naming of this property is not great: everywhere in the Rust ecosystem safety refers to memory safety / soundness and undefined behavior. However, since the AssertUnwindSafe does not require unsafe to use, this means that this does not apply to unwind "safety". In Rust, everything must be unwind safe, meaning that it should not be possible to witness Undefined Behavior from catching a panic, no matter how "absurd" the AssertUnwindSafe usage might be. What UnwindSafe & co. refer to is more of a lint, about unwind robustness:

    Should some data structure maintain a code-logic-wise meaningful state (hands waving) if a panic occurred when in the middle of processing it?

    • If the data structure was not being mutated, then surely so. And in Rust we know that something is immutable if behind a shared reference (&_) without any shared / aliased / interior mutability (i.e., an UnsafeCell) inside. This property is expressed in Rust with: auto trait RefUnwindSafe {}, impl !RefUnwindSafe for UnsafeCell<_> {} and impl UnwindSafe for &'_ impl RefUnwindSafe {}

    • Otherwise (i.e., interior mutability or when using an exclusive &mut reference whose usage may entail mutation of the pointee), things by default are !UnwindSafe, ...

    • ... leading to one having to explicitely whitelist / opt-into UnwindSafe-ty either through impls if in the standard library, or through usage of the AssertUnwindSafe wrapper that the standard library offers, if they want to express that their very particular type is "robust to unwinding", however they may interpret that.

    So, for instance, by-shared-reference mutability (interior mutability) is not that much more special than by-exclusive-reference mutability, regarding UnwindSafety. By default:

    • &mut _ : !UnwindSafe,

    • & (impl !RefUnwindSafe) : !UnwindSafe

    What diggsy has pointed out, is that these traits are even more wavy-handed than expected! Indeed, since threads act as panic / unwind boundaries, then when joining over some thread there is an implicit catch_unwind going on (semantic-wise, at least), which leads to one being able to circumvent the UnwindSafe limitation when the closure is Send. Indeed, in that case one can use a custom alternative implementation of catch_unwind based on a scoped thread: Playground

    • This, semantically, could be interpreted as "Send => UnwindSafe" (and thus "Sync => RefUnwindSafe"), but I wouldn't say that, at least not in an ironclad manner, given how precise the semantics of Send & Sync are, contrary to UnwindSafe's.

    My personal take on this is to rely more on documentation than on this trait when wanting to deal with / express robustness to panics, or lack thereof.

  • Finally, regarding the soundness of your "Executor : !Send => we can disregard the Send bound on the futures it spawns" is only valid because of implementation knowledge of tokio's local executor. That is, if tokio's local executor still requires Send to spawn a future, then it would be legal (despite non-sensical design-wise) for it to spawn the futures in a new thread. Because of that, that line is only sound if you pin-point the dependency over tokio with "=x.y.z" (that is, you don't even support patch updates since those would be allowed to decide to make the local executor spawn futures in another thread, since it wouldn;t break the public API contract of it requiring Send futures), after having personally audited what the version x.y.z odes exactly :grimacing: You should talk to ::tokio maintainers for them to offer an API that does not require this Send bound for the "local executor" case (by the way, could LocalSet suit you? I'm guessing it doesn't, but I think it is a direction worth exploring...)

  • Bonus: you can have a generic method on a trait object by using that generic method as sugar for the wrapping needed by the non generic / object-safe method:

    #[cfg(FALSE)]
    // Objective: make this object safe while remaining generic:
    trait Foo {
        fn run (&self, f: impl Fn());
    }
    
    // Solution:
    trait Foo : FooExt {
        fn impl_run (&self, f: impl Fn())
        where
            Self : Sized,
        ;
    }
    trait FooExt {
        fn run_object_safe (&self, f: &'_ dyn Fn())
        ;
    }
    impl<T : Foo> FooExt for T {
        fn run_object_safe (&self, f: &'_ dyn Fn())
        {
            self.impl_run(f)
        }
    }
    
    impl<'lt> dyn Foo + 'lt {
        fn run (&self, f: impl Fn())
        {
            self.run_object_safe(&f);
        }
    }
    
    
2 Likes

Thanks a lot for the review. It's greatly appreciated. I'm going to have to take this in piecemeal :wink: .

I'm currently trying to figure out what diggsey is saying, so I'll focus on that. I don't quite understand this reasoning:

The way I understand it, UnwindSafe and it's warning sign AssertUnwindSafe only do anything for an explicit catch_unwind. They don't help for:

  • destructors being run while unwinding (eg. Drop does not have a Self: UnwindSafe bound)
  • mut access through globals and TLS (out of band with regards to stack frames)
  • only unwinding the thread on which the panic happened (async-std actually tears down all involved threads)

These ways of creating unwind safety issues have always been there, async executors or not.

So in that limited field of applicability, it seems a separation is made between closures that can never introduce any unwind safety issues because there is simply no shared mutability of any sort present. The compiler can guarantee that. And then there is anything with shared mutability.

AFAICT Send and Sync are completely unrelated and orthogonal to this. Types can have interior mutability and be Sync or not be Sync.

It seems to me that the link between them comes from the RFC, which reads as if it is assumed that Send types that have interior mutability will poison. However that is neither guaranteed, nor the case with popular and existing types like parking_lot::Mutex today.

The way I feel is that by doing catch_unwind(AssertUnwindSafe(user_code)); one takes away the warning sign, in which case I argue that it should be well documented in the API docs that the burden is on the user to double check unwind safety.

If one takes the single threaded executor for example (no thread boundary, only a catch_unwind introduced by tokio). Yes, the future is required to be Send, but nothing keeps you from holding a Send futures::lock::Mutex in several places on the same thread, but that thread will no longer unwind.

You can no longer shoot yourself in the foot with RefCell and friends, that's true. futures::lock::Mutex happens to claim that it is UnwindSafe, but it's documentation doesn't say anything about poisoning.

What I don't understand in what diggsey is saying is:

  • the AssertUnwindSafe is fine, because it requires Send (doesn't disable shared mutability)
  • you cannot statically guarantee unwind safety of Sync types, so just auto implement RefUnwindSafe for all of them. => I think Sync is orthogonal, but if it were true I would say especially don't impl RefUnwindSafe.

Ok, I hope I'm not saying to many stupid things, but I really would like to understand it.

1 Like

I have clarified the term unwind safety as you suggested and added a PhantomData. I feel slightly mixed about the last one as it reminds me of "This page intentionally left blank" which always irritates me beyond reason, but I agree that I don't like the side-effect style it had until now...

This is a bit of a bummer. I completely forgot about that API, but that's probably what I should be using for TokioCt. Wish I saw it before I released. It will be a breaking change, but it will bring the API in line with futures::executor::LocalPool.

O wow. My eyes keep glazing over while I try to understand that trick. I will have to experiment with it I think before I will really grasp it. If by any chance I manage to make but a single SpawnHandle trait that is OS, it would make the API much nicer. I haven't really figured if that will work eventually because there is 2 generics in the trait method. Will try!

I keep being amazed by how you come up with stuff like this...

@Yandros

Ok, the LocalSet based implementation seems to work fine, without changing the API, that's awesome.

I'm playing around with the generic object safe traits... I wondered about something. What role does the lifetime on impl dyn Foo have?

I tried removing it, but nothing seems to break, even when implementing the trait for &Bar and using a &Bar that isn't 'static...

trait Trait {}

impl dyn Trait {
    fn example (self: &'_ dyn Trait)
    {}
}

is a short example showing how the lifetime elision within trait objects can lead to obscure errors.

In the same fashion that & is a type constructor that requires both a type and a lifetime to become an actual type (&'_ _), dyn Trait is not really a type, dyn Trait + '_ is. But in the same fashion that &T hides elides that lifetime parameter, dyn Trait is also sugar for a type with an elided lifetime.

And it so happens that the rules for the lifetime elision of trait objects are quite different than the rules for the lifetime elision of other types. The main rules are:

  • &'lt [mut] dyn Trait unsugars to &'lt [mut] (dyn Trait + 'lt)

  • other usages of dyn Trait unsugar to dyn Trait + 'static

  • to opt-out of that behavior and use the less surprising behavior, one needs to avoid implicitly eliding the lifetime (if you want to elide a lifetime for ergonomics, be explicit about it using '_):

    dyn Trait + '_ uses the classic "inference" / existential lifetime that other types with elided lifetime parameters do,

    • &['lt] [mut] (dyn Trait + 'static) or any other concrete lifetime can also be explicitely fed, although the cases where that is needed are quite rare.

So, back to your question, why is it better to use impl<'lt> dyn Trait + 'lt { (or, equivalently, impl dyn Trait + '_ {) rather than impl dyn Trait {?

Here is an example:

trait Trait {}

impl dyn Trait {
    fn method (&self) {}
}

fn check (obj: &dyn Trait)
{
    obj.method(); // Error!
}

indeed, the above unsugars to:

trait Trait {}

impl dyn Trait + 'static {
    fn method<'lt> (self: &'lt (dyn Trait + 'static)) {}
}

fn check<'lt> (obj: &'lt (dyn Trait + 'lt))
{
    obj.method(); // Error, the inner `'lt` may not be `: 'static`
}
1 Like

Thanks for clarifying. I had been reading up a bit in the reference in the mean time, but I won't aim to internalize all that for the moment. few.

I'm just thinking it probably won't work for a SpawnHandle trait, since the method that requires Self: Sized ends up receiving a &dyn Future from the object safe method, but we need to take ownership of the future to give it to the executor...

I had used &dyn Fn() as a simple example, but you can replace Fn with Future and &dyn with Pin<Box<dyn; and when calling it instead of &f you need to Box::pin(fut). This "technique" is just ergonomic sugar: object safety still requires that the underlying method of your trait object not be generic, hence requiring this Pin<Box<dyn Future...>>>, but then since you can have inherent methods on the trait object type, you can have a proxy method that calls Box::pin for you.

Hmm, but what I have now is:

pub trait SpawnHandleOs<Out: 'static + Send>
{
   /// Spawn a future and return a JoinHandle that can be awaited for the output of the future.
   //
   fn spawn_handle_os( &self, future: Pin<Box< dyn Future<Output = Out> + Send >> ) -> Result<JoinHandle<Out>, SpawnError>;
}

Which is object safe, without the work around. The downside is the future needs to be boxed, and I still have still have the Out type in both parameter and return type...

So I just put that on the trait itself, which can be inconvenient for client code if they need an executor that can spawn futures with different Out types.

Since the Out param I don't think I can get rid of it, I thought maybe the technique you showed would allow getting rid of the boxing, but it seems that only works if you only need to borrow the parameter.

The not object safe version looks like this:

pub trait SpawnHandle
{
   /// Spawn a future and return a [JoinHandle] that can be awaited for the output of the future.
   //
   fn spawn_handle<Fut, Out>( &self, future: Fut ) -> Result<JoinHandle<Out>, SpawnError>

      where Fut: Future<Output = Out> + 'static + Send,
             Out: 'static + Send
   ;
}

At first I made both because I thought that the boxing would displease people, but from the benchmarks I ran, the overhead is really neglectable. So only the inconvenience of the Out type on the trait itself remains. After all I'm not so convinced it's worth keeping both, especially since we need the Local versions as well, which means 4 traits.

What's worse, if this ever needs to support no_std it'd better take a FutureObj but it's a bit annoying to tell users to create that manually which would invite an extension trait to take an impl Future to hide that FutureObj, but then there is so many traits, ahh.

Yes, the Out parameter needs to be a parameter of the trait, there is no way going around that.

But my suggested "sugar hack" is still applicable, here is how:

pub
trait SpawnHandle : Sized {
    /// Spawn a future and return a [JoinHandle] that can be awaited for the output of the future.
    fn spawn_handle<Out, Fut> (
        self: &'_ Self,
        future: Fut,
    ) -> Result<
            JoinHandle<Out>,
            SpawnError,
        >
    where
        Out : Send + 'static,
        Fut : Future<Output = Out> + Send + 'static,
    ;
}

pub
trait SpawnHandleOs<Out>
where
    Out : Send + 'static,
{
    fn spawn_handle_os (
        self: &'_ Self,
        fut: Pin<Box<dyn Future<Output = Out> + Send + 'static>>,
    ) -> Result<
            JoinHandle<Out>,
            SpawnError,
        >
    ;
}

impl<T, Out> SpawnHandleOs<Out> for T
where
    T : SpawnHandle,
    Out : Send + 'static,
{
    fn spawn_handle_os (
        self: &'_ Self,
        fut: Pin<Box<dyn Future<Output = Out> + Send + 'static>>,
    ) -> Result<
            JoinHandle<Out>,
            SpawnError,
        >
    {
        self.spawn_handle(fut)
    }
}

pub
type DynSpawnHandle<Out> = dyn SpawnHandleOs<Out> + 'static;

impl<Out> DynSpawnHandle<Out>
where
    Out : Send + 'static,
{
    fn spawn_handle (
        self: &'_ Self,
        fut: impl Future<Output = Out> + Send + 'static,
    ) -> Result<
            JoinHandle<Out>,
            SpawnError,
        >
    {
        self.spawn_handle_os(Box::pin(fut))
    }
}

fn check (executor: &'_ DynSpawnHandle<()>)
{
    let _ = executor.spawn_handle(async {});
}
  • Playground

  • the reason I call it a sugar hack is that indeed the boxing happens nevertheless, we just hide it from the caller for the sake of ergonomics. The only way to truly circumvent that is by using the Pin<&'lt mut dyn Fut...> as the non-boxed trait object for futures, but that means that all the stack-allocated futures will need to be spawned and run before any is dropped, so I don't know how usable that would be in practice.

Yep, many traits, but still an interesting idea. Maybe then a macro could be used to get an ergonomic usage:

spawn_handle!(executor, async { ... });
// unsugaring to:
let fut = async { ... };
pin_mut!(fut);
executor.spawn_handle_os(fut as Pin<&'_ mut (dyn Future...)>);
  • (but, of course, with the same problem regarding lifetimes)