Rocket: spawning a child thread that has access to state that outlives the parent route

Hi,

I'm building an HTTP microservice using Rocket where users can submit jobs to one endpoint and later query the results of that job, which are stored in state. A job is likely to take 5+ minutes, which is too long to wait for a response. So I want to return some kind of 'job accepted' response and then let the user query another endpoint to check whether the job is done.

I've done this before in Python, where the GC takes care of memory management, so my approach is basically to spawn a child thread which does the processing while the parent route just returns.

i.e. something like (in pseudo code)

#[post("/create/<job_name>", format = "application/json", data = "<job_data>")]
pub async fn create_job(job_name: &str, job_data: Data<'_>, some_state: &State<Arc<SomeStruct>>) {
    let parsed_data = parse_data(job_data);
    // Does not compile: job_name and some_state are borrowed, but the task must be 'static.
    tokio::spawn(async move {
        some_state.mutate(parsed_data, job_name);
    });
}

The issue here is that job_name and some_state currently have anonymous lifetimes, whereas tokio::spawn requires a 'static lifetime.

Changing job_name and some_state to have 'static lifetimes doesn't work, because they would then have to outlive '__r (the lifetime of the route itself), which isn't 'static, so the lifetime bounds aren't satisfied.

So is there a way to do this?

I'm not familiar with Rocket that much myself, but just by looking at the code several mistakes pop up:

  1. You're parsing data outside of the spawned thread, which makes no sense as you're not handling any parse errors before responding. This will only make your response take longer, without any benefit. Move the let parsed_data = ... line inside the thread and handle everything there directly.

  2. If you're moving job_name into a new thread, it has to be possible to move it in the first place - that is, the thread has to be able to 'capture' it and take ownership of it. A borrowed &str doesn't qualify: the data it points to is only valid until the function returns, so the spawned thread could outlive it. Therefore, either turn it into an owned String, or take a String as the argument in the first place - which can then be captured by the new thread.

  3. The same problem applies to your State, albeit with a new complication. You're trying to pass a reference that is about to go out of scope to a new thread that is about to start working with it. That is absolutely unnecessary, as your State contains the Arc, which is the actual thing that needs to be passed (if you're not sure why, refer to the docs or the book). Solution: clone() the Arc inside of the state - this creates a new handle, shareable across threads, to the SomeStruct you have inside of it - and move that clone into the thread.
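
A minimal sketch of points 2 and 3, using only the standard library so it runs without any dependencies. It is not Rocket code: std::thread::spawn stands in for tokio::spawn (both require 'static captures), and the Jobs type and start_job function are hypothetical names made up for this illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

// Hypothetical stand-in for the shared application state.
type Jobs = Arc<Mutex<HashMap<String, f64>>>;

// Takes borrowed arguments (as a route handler would) and hands only
// owned values to the background thread.
fn start_job(job_name: &str, jobs: &Jobs) -> JoinHandle<()> {
    // Point 2: copy the borrowed &str into an owned String.
    let owned_name = job_name.to_string();
    // Point 3: clone the Arc, not the data behind it.
    let jobs_handle = Arc::clone(jobs);
    thread::spawn(move || {
        // The closure owns everything it uses, so it satisfies 'static.
        jobs_handle.lock().unwrap().insert(owned_name, 1.0);
    })
}

fn main() {
    let jobs: Jobs = Arc::new(Mutex::new(HashMap::new()));
    let handle = start_job("demo", &jobs);
    // Joining is only for the demo; a handler would return immediately.
    handle.join().unwrap();
    println!("{:?}", jobs.lock().unwrap().get("demo"));
}
```

The caller keeps its own handle to the map, so the entry written by the background thread is visible through it afterwards.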

Points 1&2 are useful so thanks for that.

But on point 3, creating a clone of the state doesn't work, as this creates a copy. The whole point of the state in Rocket is that it is shared by all routes. So creating a clone and modifying it means that I'm mutating the state of a copy, which is then discarded when the child thread ends. No other route could then access this mutated state, and being able to mutate the State is key here.

I should add that SomeStruct in this case is actually a DashMap so I can (safely) mutate it as DashMap is a concurrent HashMap.

Glad to help - you seem to be confusing the process of cloning an owned object such as String (which does allocate a completely new object on the heap), with a process of cloning an Arc, though.

Cloning an Atomically Reference Counted pointer (which is all an Arc is) doesn't copy the data it points to and doesn't allocate any new memory for it. What it does is increment the reference count and give you another pointer to the same location in memory - which you can then use to modify the data that you wish to mutate in another thread.
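
Here is a small standard-library sketch of that behaviour. The function name clone_shares_data is made up for this example, and a Mutex<HashMap> is used as a stand-in for whatever the Arc actually holds.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Returns (same allocation?, strong count, value seen through the original handle).
fn clone_shares_data() -> (bool, usize, Option<f64>) {
    let original: Arc<Mutex<HashMap<String, f64>>> = Arc::new(Mutex::new(HashMap::new()));
    // Cloning the Arc bumps the reference count; the map itself is not copied.
    let cloned = Arc::clone(&original);

    // Mutate through the clone...
    cloned.lock().unwrap().insert("job".to_string(), 1.0);

    let same_allocation = Arc::ptr_eq(&original, &cloned);
    let strong_count = Arc::strong_count(&original);
    // ...and read the result back through the original handle.
    let seen = original.lock().unwrap().get("job").copied();
    (same_allocation, strong_count, seen)
}

fn main() {
    let (same_allocation, strong_count, seen) = clone_shares_data();
    println!("same allocation: {same_allocation}, count: {strong_count}, seen: {seen:?}");
}
```

Both handles point at the same map, so a write made through one is visible through the other.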

Read the docs - and try playing around with something simpler, if that's still hard to grasp.

I see, I hadn't realised that cloning an Arc created a reference but given that an Arc allows sharing across threads that makes perfect sense. It does, however, not solve my issue, in fact it makes no difference so maybe I'm doing something wrong.

I've put together a minimal reproducible example to demonstrate this a bit better.

My Cargo.toml dependencies

[dependencies]
rocket = {version = "0.5.0-rc.1",  features = ["json"]}
dashmap = "4.0.2"
tokio = { version = "1.7.1", features = ["full"] }

and then my main.rs

#[macro_use]
extern crate rocket;

use dashmap::DashMap;
use std::sync::Arc;
use rocket::State;


pub struct Index{
    pub dash_map: DashMap<String, f64>
}

#[post("/add/<new_entry>")]
pub async fn add_entry(new_entry: &str, state: &'static State<Arc<Index>>) {
    let new_entry_string = new_entry.to_string();
    let cloned_state = state.clone();
    tokio::spawn(async move {
        cloned_state.dash_map.insert(new_entry_string, 1f64);
    }
    );
}

#[get("/see_entries")]
pub async fn see_entries(state: &State<Arc<Index>>) -> Result<String, std::io::Error>{
    Ok(format!("{:?}", state.dash_map))
}

#[launch]
fn rocket() -> _ {
    rocket::build()
        .manage(Arc::new(Index {
            dash_map: DashMap::new(),
        }))
        .mount(
            "/",
            routes![add_entry, see_entries],
        )
}

I'm starting to suspect I'm rubbing up against some limitations of rocket or I'm doing something in completely the wrong way but I'm learning a lot (!!). Thanks once again for your help!

If you could mention the issues that you're getting, that would be great - it's a bit hard to grasp intuitively what it is exactly that is not working if you don't say anything about the kind of problems that you're having.

From what I remember, though, Rocket currently is still working in synchronous mode only (handling async functions and responses isn't fully supported). This thread seems to confirm it.

Try removing the async prefix from the functions and see if it helps. If not, at least post what kind of errors the compiler is giving you and we'll take it from there.

P.S. 'static lifetime for the reference to the state isn't necessary, in fact it might be giving further issues. It's the clone of the Arc that matters, which has to be clone-d to give the other thread a chance to work with it after the original reference gets dropped once your function returns.

The 'static lifetime is included because tokio::spawn requires everything the task captures to be 'static, for obvious reasons.

Compiling with this 'static lifetime gives the following error

error[E0759]: `__req` has lifetime `'__r` but it needs to satisfy a `'static` lifetime requirement
  --> src/main.rs:14:48
   |
13 | #[post("/add/<new_entry>")]
   | --------------------------- this data with lifetime `'__r`...
14 | pub async fn add_entry(new_entry: &str, state: &'static State<Arc<Index>>) {
   |                                                ^ ...is captured here...
   |
note: ...and is required to live as long as `'static` here
  --> src/main.rs:14:41
   |
14 | pub async fn add_entry(new_entry: &str, state: &'static State<Arc<Index>>) {
   |                                         ^^^^^

For more information about this error, try `rustc --explain E0759`.
error: could not compile `rocket-arc` due to previous error

and then without the 'static lifetime the example gives this error

error[E0759]: `state` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement
  --> src/main.rs:14:41
   |
14 | pub async fn add_entry(new_entry: &str, state: &State<Arc<Index>>) {
   |                                         ^^^^^  ------------------ this data with an anonymous lifetime `'_`...
   |                                         |
   |                                         ...is captured here...
...
17 |     tokio::spawn(async move {
   |     ------------ ...and is required to live as long as `'static` here

For more information about this error, try `rustc --explain E0759`.
error: could not compile `rocket-arc` due to previous error

The thread you linked about Rocket is from over a year ago, when Rocket didn't support async. This is no longer the case with v0.5.0-rc.1, which was released on 9th June this year. The examples in the guide also regularly use async routes.

Correct, and the first error tells you exactly what the problem is - the request data has the lifetime '__r, which lasts only as long as the request, and this is the actual lifetime of the argument that gets passed to the function - not the 'static one, which you've specified. Remove it and keep reading.

In the second case it appears you're still working with the State itself, instead of the Arc. A quick look at the docs shows that the State struct, which wraps the data that you pass to it, has an inner() method that exposes a reference to the data inside of it - namely, the Arc that you want.

Try this:

let data = state.inner().clone();

before you spawn your thread, and inside of it

data.dash_map.insert(new_entry_string, 1f64);

Or whatever you have there.
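
To show the shape of it without pulling in Rocket, here is a standard-library sketch. Everything in it is an assumption for illustration: Managed<T> is a hypothetical stand-in for rocket::State<T> that only mimics the inner() method, Mutex<HashMap> replaces the DashMap, and std::thread::spawn replaces tokio::spawn.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

// Stand-in for the Index struct from the example (Mutex<HashMap> instead of DashMap).
struct Index {
    entries: Mutex<HashMap<String, f64>>,
}

// Hypothetical stand-in for rocket::State<T>: a wrapper that only lends out &T.
struct Managed<T>(T);

impl<T> Managed<T> {
    fn inner(&self) -> &T {
        &self.0
    }
}

fn add_entry(new_entry: &str, state: &Managed<Arc<Index>>) -> JoinHandle<()> {
    let new_entry_string = new_entry.to_string();
    // inner() yields &Arc<Index>; cloning that gives an owned Arc<Index>
    // which no longer borrows from `state`.
    let data: Arc<Index> = state.inner().clone();
    thread::spawn(move || {
        data.entries.lock().unwrap().insert(new_entry_string, 1f64);
    })
}

fn main() {
    let state = Managed(Arc::new(Index {
        entries: Mutex::new(HashMap::new()),
    }));
    // Joining is only for the demo; a handler would return right away.
    add_entry("first", &state).join().unwrap();
    println!("{:?}", state.inner().entries.lock().unwrap());
}
```

The entry inserted by the background thread shows up in the map held by the original state, because the thread was given another handle to the same Index rather than a copy of it.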

P.S. As mentioned earlier, I've never used Rocket myself - the fact that it's based on nightly features and is relatively slow on updates has always kept me away. Actix Web seems to be the current de-facto standard in the community, used in a relatively popular Zero to Production guide as well.

Amazing, using the inner() method gets me access to the Arc. In fact, the docs you linked explicitly say to use this method only when you want to extend the lifetime, which is exactly what I wanted to do - so mea culpa there.

On Actix Web, my understanding when I was picking a framework was that it was riddled with unsafe code, and that when the issue was raised the maintainer's response was petulant, but it looks like that's now been mostly addressed. So I'll have a look at Actix Web too.

Either way, thanks for your help and patience! I'm new to non-garbage collected languages so lifetimes are still a challenge!

Glad to help, as always. Looks like we're both a bit "rusty" (get it?) when it comes to unknown frameworks - Actix Web has since been rewritten several times to reduce the amount of unsafe code to as close to 0 as it can be. The original author (the maintainer in question) is also no longer in charge of the project, by his own choice - all the backlash really made him feel uneasy about the whole thing - and the latest contributors are making steady progress towards the release of 4.0. Check it out.

Rust is pretty much unique in its approach to memory management, so don't worry - we've also been through the struggle. If you haven't gone through the official book, make sure to start there - and afterwards, if you're willing to invest some more time and $ in your Rusty skills, I'd highly recommend to grab the book of Jon - one of the top educators for the language currently. He's also got an awesome channel on YouTube, covering a ton of intermediate-to-advanced topics in Rust, so check that out too.

All the best - keep on learning.

Note that Rocket no longer depends on nightly as of 0.5, which is the version I would recommend using despite it only being a "-rc" release for now.
