Ownership/lifetime problem in nested async code with references

I'm trying to wrangle some async code in a testing suite, and I've stumbled upon an ownership problem that I can neither explain nor fix.

Here's what I think is probably a minimal example of my setup:

use futures::Future;

struct Container {
    data: String // specifically, something which is !Copy
}

impl Container {
    async fn get_immutable_property(&self) -> bool {
        // something async here... but which does NOT require &mut self
        true
    }

    async fn do_mutable_work(&mut self) -> Result<(), ()> {
        //...do things here that DOES require &mut self...
        Ok(())
    }

    pub async fn do_async_work<'a, Fut: Future<Output = bool> + 'a>(
        &'a mut self,
        get_condition: impl Fn(&'a Container) -> Fut,
    ) -> Result<(), ()> {
        while !get_condition(self).await {
            self.do_mutable_work().await?;
        }
        Ok(())
    }
}

// elsewhere...
#[tokio::main]
async fn main() -> Result<(), ()> {
    let mut container = Container { data: String::new() };
    // lifetime problems here
    container.do_async_work(|container| async move {
        container.get_immutable_property().await
    }).await
}

Playground link.

As you can see in the playground, this causes an ownership error when attempting to run do_mutable_work(), which takes &mut self.

This error seems to be caused by the lifetime annotations on do_async_work(); removing them makes the ownership conflict disappear, but instead causes lifetime issues in the closure passed to do_async_work() inside main: playground link.

Now, I know I could "trivially" fix this by just strongarming the safety using, probably, some Arc<Mutex<Container>> construction and, as long as I don't have deadlocks, that will work and avoid all ownership and lifetime issues for my case. However, that will require a decent amount of refactoring, and intuitively, it feels like this code should be correct without requiring reference counting and synchronisation: as a human,

  • I know that my &Container will outlive my future since I call the future from inside a method on Container, where it is awaited, and thus self is guaranteed to outlive it;
  • I also know that after calling get_condition(self), I again immediately await it, drop the return value (I've even tried silly refactorings just to make sure the bool is dropped before the loop body), and only then call self.do_mutable_work(), thus it shouldn't violate borrow rules;

However I don't seem to be able to convince the compiler of both of these facts at once.

So my question is two-fold:

  1. Is my human intuition correct, and is this code safe and sound and being unnecessarily rejected by the compiler? And
  2. If so, is there some way I can encode this into the type system, without unsafe, thus avoiding having to switch everything to Arc<Mutex<>> for this problem?

Of course if my own understanding is flawed and the code is NOT safe as written, then a Mutex would be the correct approach, but I really don't want to add it if it's not actually necessary and is possible to avoid.

Hi and welcome to this forum! Sorry for the delay, your post was stuck in our spam filter.

Rules of thumb:

  • Lifetimes on &mut self are trouble, don't use them.
  • If you need a lifetime for an Fn argument, use for<'a> Fn(&'a …) syntax instead.

However, here you need a lifetime for an async function, which syntactically doesn't work with for<'a>, and needs this workaround:

In order for the code to work, get_condition(self) must be implicitly reborrowing self as a shorter-lived &Container. If it didn't, then you would not be able to do anything with self again, because mutable references are exclusive — non-copiable.

So, get_condition is given a &'reborrow Container whose lifetime is definitely shorter than the lifetime parameter 'a, because all parameters of all kinds must outlive the entire function call. You then get an error because these two requirements conflict: the borrow must last for 'a, but it must also be short.
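To make the reborrow concrete, here is a minimal synchronous sketch of the same loop, where the for<'a> form from the rules of thumb is enough. The names do_work and run_sync_example are made up for illustration, and do_mutable_work just appends a character so the loop has something to observe. The reborrow is written explicitly as &*self.

```rust
struct Container {
    data: String,
}

impl Container {
    // Stand-in for "work that needs &mut self": append one character.
    fn do_mutable_work(&mut self) {
        self.data.push('x');
    }

    // No named lifetime on `&mut self`. The callback is higher-ranked, so it
    // must accept a `&Container` of ANY lifetime, including a very short one.
    fn do_work(&mut self, get_condition: impl for<'a> Fn(&'a Container) -> bool) {
        // `&*self` is a short shared reborrow that ends once the call returns,
        // which is what allows the mutable use in the loop body.
        while !get_condition(&*self) {
            self.do_mutable_work();
        }
    }
}

// Hypothetical driver: loops until `data` has at least `target_len` characters.
fn run_sync_example(target_len: usize) -> String {
    let mut container = Container { data: String::new() };
    container.do_work(|c: &Container| c.data.len() >= target_len);
    container.data
}
```

If the callback were instead declared as Fn(&'a Container) with 'a taken from &'a mut self, each call would have to borrow for the whole of 'a, and the mutable call in the loop body would be rejected.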

My favorite solution to this is the async_fn_traits library:

use async_fn_traits::AsyncFn1;

...

    pub async fn do_async_work(
        &mut self,
        get_condition: impl for<'a> AsyncFn1<&'a Container, Output = bool>,
    ) -> Result<(), ()> {
        while !get_condition(self).await {
            self.do_mutable_work().await?;
        }
        Ok(())
    }

You'll also need to replace the closure with an async fn, because || async { ... doesn't tie the lifetimes together in a way that works for this situation. (This is why there's an unstable dedicated "async closures" feature.)

All together:

use async_fn_traits::AsyncFn1;

struct Container {
    data: String, // specifically, something which is !Copy
}

impl Container {
    async fn get_immutable_property(&self) -> bool {
        // something async here... but which does NOT require &mut self
        true
    }

    async fn do_mutable_work(&mut self) -> Result<(), ()> {
        //...do things here that DOES require &mut self...
        Ok(())
    }

    pub async fn do_async_work(
        &mut self,
        get_condition: impl for<'a> AsyncFn1<&'a Container, Output = bool>,
    ) -> Result<(), ()> {
        while !get_condition(self).await {
            self.do_mutable_work().await?;
        }
        Ok(())
    }
}

// elsewhere...
#[tokio::main]
async fn main() -> Result<(), ()> {
    let mut container = Container {
        data: String::new(),
    };
    // lifetime problems here
    container.do_async_work(helper).await
}

async fn helper(container: &Container) -> bool {
    container.get_immutable_property().await
}

Thanks a lot for the detailed explanation! This effectively solved the lifetime issue and works almost perfectly for me.

Just one stumbling block: in some cases I was capturing part of the environment in my closures, e.g. like this:

async fn main() -> Result<(), ()> {
    let mut container = Container { data: String::new() };
    // lifetime problems here
    let index: u32 = non_idempotent_calculation();
    container.do_async_work(|container| async move {
        container.get_immutable_property(index).await
    }).await
}

By switching to an async fn, this is no longer possible. I'm working around that by passing an extra context parameter into what is now an AsyncFn2. However, as I briefly mentioned at the start, this is part of a testing framework, and since different values need to be captured in different tests, it could get inelegant fast. Is there a better way around this?
I'd be happy restricting the context/captures to types which are Copy, which should sidestep any potential lifetime or ownership issues.
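For reference, the extra-context workaround described above might look roughly like the sketch below. Everything here is an assumption for illustration: AsyncCallback2 is a hand-rolled stand-in for async_fn_traits::AsyncFn2 so the example needs only std, block_on is a tiny busy-polling substitute for the tokio runtime, and do_mutable_work just appends a character.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical two-argument stand-in for async_fn_traits::AsyncFn2.
trait AsyncCallback2<A, B> {
    type OutputFuture: Future<Output = <Self as AsyncCallback2<A, B>>::Output>;
    type Output;
    fn call_async(&self, a: A, b: B) -> Self::OutputFuture;
}

impl<F, Fut, A, B> AsyncCallback2<A, B> for F
where
    F: Fn(A, B) -> Fut,
    Fut: Future,
{
    type OutputFuture = Fut;
    type Output = Fut::Output;
    fn call_async(&self, a: A, b: B) -> Self::OutputFuture {
        self(a, b)
    }
}

struct Container {
    data: String,
}

impl Container {
    async fn do_mutable_work(&mut self) -> Result<(), ()> {
        self.data.push('x');
        Ok(())
    }

    // The context is restricted to Copy types and handed to every call.
    async fn do_async_work<Ctx: Copy>(
        &mut self,
        ctx: Ctx,
        get_condition: impl for<'a> AsyncCallback2<&'a Container, Ctx, Output = bool>,
    ) -> Result<(), ()> {
        while !get_condition.call_async(&*self, ctx).await {
            self.do_mutable_work().await?;
        }
        Ok(())
    }
}

// What used to be captured by the closure is now an explicit parameter.
async fn len_at_least(container: &Container, target_len: usize) -> bool {
    container.data.len() >= target_len
}

// Minimal executor: polls in a loop, fine for futures that never really wait.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Hypothetical driver for the sketch.
fn run_context_example(target_len: usize) -> String {
    let mut container = Container { data: String::new() };
    block_on(container.do_async_work(target_len, len_at_least)).unwrap();
    container.data
}
```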

I'd also be curious if you could explain a bit more why async fn works for the lifetimes here while || async {} doesn't, or point me towards a resource describing this in more detail, out of curiosity.

(And incidentally would an async closure avoid this issue, if I were willing to use nightly features?)

When you write a fn or async fn, the function signature describes exactly how the lifetimes in the inputs and outputs relate to each other. There's no comparable mechanism in closures, so the compiler applies default rules that might not be what you want. Most closures benefit from being passed to a function taking a callback type that specifies what is needed, but AsyncFn1 is too indirect for that to help.
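As a sketch of what "the signature describes the lifetimes" means: an async fn taking &Container behaves like a plain fn whose returned future is tied to the same lifetime 'a, and either form can satisfy a for<'a> bound. The trait AsyncCallback below is a hand-rolled, std-only stand-in for AsyncFn1 (not the crate's actual code), block_on is a tiny substitute for a real runtime, and the helper names are invented for illustration.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for async_fn_traits::AsyncFn1: it names the future
// type per argument type, which is what lets `for<'a>` range over it.
trait AsyncCallback<Arg> {
    type OutputFuture: Future<Output = <Self as AsyncCallback<Arg>>::Output>;
    type Output;
    fn call_async(&self, arg: Arg) -> Self::OutputFuture;
}

impl<F, Fut, Arg> AsyncCallback<Arg> for F
where
    F: Fn(Arg) -> Fut,
    Fut: Future,
{
    type OutputFuture = Fut;
    type Output = Fut::Output;
    fn call_async(&self, arg: Arg) -> Self::OutputFuture {
        self(arg)
    }
}

struct Container {
    data: String,
}

impl Container {
    async fn do_mutable_work(&mut self) -> Result<(), ()> {
        self.data.push('x');
        Ok(())
    }

    async fn do_async_work(
        &mut self,
        get_condition: impl for<'a> AsyncCallback<&'a Container, Output = bool>,
    ) -> Result<(), ()> {
        while !get_condition.call_async(&*self).await {
            self.do_mutable_work().await?;
        }
        Ok(())
    }
}

// An async fn: the returned future implicitly borrows from `container`.
async fn has_two_chars(container: &Container) -> bool {
    container.data.len() >= 2
}

// Roughly the same thing written by hand: the signature spells out that the
// future lives no longer than the reference it was created from.
fn has_three_chars<'a>(container: &'a Container) -> impl Future<Output = bool> + 'a {
    async move { container.data.len() >= 3 }
}

// Minimal executor: polls in a loop, fine for futures that never really wait.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn run_with_async_fn() -> String {
    let mut container = Container { data: String::new() };
    block_on(container.do_async_work(has_two_chars)).unwrap();
    container.data
}

fn run_with_desugared_fn() -> String {
    let mut container = Container { data: String::new() };
    block_on(container.do_async_work(has_three_chars)).unwrap();
    container.data
}
```

A closure of the form |c| async move { ... } has no place to write the 'a that links its argument to the returned future, which is the gap the closure inference rules fail to fill here.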

I don't have a solution for you. I tried to function-ify as much as possible, but it still did not compile:

#[tokio::main]
async fn main() -> Result<(), ()> {
    let captured = String::new();
    let mut container = Container {
        data: String::new(),
    };
    container.do_async_work(mkclosure(captured)).await
}

fn mkclosure(captured: String) -> impl for<'a> AsyncFn1<&'a Container, Output = bool> {
    move |container: &Container| helper(container, captured.clone())
}

fn helper<'a>(container: &'a Container, _captured: String) -> impl Future<Output = bool> + 'a {
    async move { container.get_immutable_property().await }
}

error: lifetime may not live long enough
  --> src/lib.rs:42:34
   |
42 |     move |container: &Container| helper(container, captured.clone())
   |                      -         - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ returning this value requires that `'1` must outlive `'2`
   |                      |         |
   |                      |         return type of closure is impl Future<Output = bool> + '2
   |                      let's call the lifetime of this reference `'1`

My personal solution to problems in this vein is to avoid writing async callbacks that take references. Async callbacks that take owned values don't have this problem.
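A minimal sketch of the owned-value approach, under some assumptions: the callback receives a clone of data (a snapshot) rather than &Container, do_mutable_work just appends a character, and block_on is a tiny std-only substitute for the tokio runtime. Because the future no longer borrows from its argument, an ordinary capturing closure is accepted.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct Container {
    data: String,
}

impl Container {
    async fn do_mutable_work(&mut self) -> Result<(), ()> {
        self.data.push('x');
        Ok(())
    }

    // The callback takes an owned String, so a single future type `Fut`
    // works for every call and no higher-ranked bound is needed.
    async fn do_async_work<F, Fut>(&mut self, get_condition: F) -> Result<(), ()>
    where
        F: Fn(String) -> Fut,
        Fut: Future<Output = bool>,
    {
        while !get_condition(self.data.clone()).await {
            self.do_mutable_work().await?;
        }
        Ok(())
    }
}

// Minimal executor: polls in a loop, fine for futures that never really wait.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Hypothetical driver: `target_len` is captured from the environment.
fn run_owned_example(target_len: usize) -> String {
    let mut container = Container { data: String::new() };
    block_on(container.do_async_work(move |snapshot: String| async move {
        snapshot.len() >= target_len
    }))
    .unwrap();
    container.data
}
```

The cost is a clone per check. Whether that is acceptable depends on how large the snapshot is and how often the condition is evaluated.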

In the future we might get compiler features that let us specify the lifetimes better: RFC 3216, closure-lifetime-binder, in The Rust RFC Book.


I see, thanks for the clear explanation! Yeah, I'll have to consider whether to pass around an ugly Context and define async fns everywhere, or refactor everything to pass around an Arc or so. Cheers for the comprehensive answers.

Hi everyone,

I have been following the discussion here with interest, and I was thinking about the restriction on the Fn traits for async functions.

My current understanding is:

According to here, an Fn trait such as Fn(usize) -> impl Future<Output = usize> would be an HKT, and in the current state of Rust the compiler devs are not yet sure whether HKTs should be implemented and, if so, what that should look like. Is that correct?

Regards
keks

It would be something like

trait FnIsh<Args> {
    type FutureOutput<Args>;
    type FnOutput<Args>: Future<Output = Self::FutureOutput<Args>>;
    fn call_ish(args: Args) -> Self::FnOutput<Args>;
}

With a bunch of complications around lifetimes, HRTBs (where for<'a>... as illustrated above), HRTBs over types which don't exist yet (where for<T> ...), and so on.

These are the things that RPITIT and AFIT and so on are aimed at.

There are other possible advancements that could help (for<'a where 'b: 'a> ...).

