Adding fn item to vec from inside thread::scope

I'm trying to implement a simple work-stealing task scheduler, and I'm having some problems with fn items.

Here's a minimal example that recreates the situation:

use std::thread;

struct Context<Function, Data> {
    global_task_queue: Vec<Task<Function, Data>>,
}

struct Task<Function, Data> {
    function: Function,
    data: Data,
}

impl<'scope, 'env: 'scope, Function: Send + 'static> Context<Function, &'env i32>
where
    Function: Fn(&'env i32),
{
    fn run(self) {
        let a = vec![1, 2, 3];
        let mut x = 0;

        thread::scope(|s| {
            let new_task = Task { function: func, data: &x };
            self.global_task_queue.push(new_task);

            s.spawn(|| {
                println!("hello from the first scoped thread");
                dbg!(&a);
            });
        });
    }
}

fn func(_number: &i32) {}

fn main() {
    let b = 2;
    let task = Task { function: func, data: &b };
    let context = Context {
        global_task_queue: vec![task],
    };
    context.run();
}

(Playground link)

I've got a Task struct that represents a unit of work, which contains a function to do the work and som input data to operate on. I want to run these tasks in separate threads, but I also want them to have an immutable reference to data located in the main thread (represented by x in the example). I'm using thread::scope for this, since it allows passing references with non-'static lifetimes to the threads.

The reason I'm storing the Tasks inside global_task_queue is because I'm actually using crossbeam::deque and it is a global work queue which the worker threads take tasks from when they need more work.

It doesn't compile, though:

error[E0308]: mismatched types
  --> src/main.rs:22:41
   |
12 | impl<'scope, 'env: 'scope, Function: Send + 'static> Context<Function, &'env i32>
   |                            -------- this type parameter
...
22 |             self.global_task_queue.push(new_task);
   |                                    ---- ^^^^^^^^ expected type parameter `Function`, found fn item
   |                                    |
   |                                    arguments to this function are incorrect
   |
   = note: expected struct `Task<Function, &'env i32>`
              found struct `Task<for<'a> fn(&'a i32) {func}, &{integer}>`
note: associated function defined here
  --> /rustc/d5a82bbd26e1ad8b7401f6a718a9c57c96905483/library/alloc/src/vec/mod.rs:1831:12

For more information about this error, try `rustc --explain E0308`.

Is there a more fitting way of representing the Tasks, or is there a way to make this work?

You probably don’t want to be generic over Function, but just prescribe a single dynamic function type such as a boxed trait object Box<dyn Fn…>; or a function pointer type in case your use-cases will never involve any captured data.

It’s not 100% clear to me what the role of Data is in your example code, and how it relates to Function, since you prescribe a Fn(&i32) function signature which doesn’t really relate to the type Data at all? Given the description in your post above, it does however sound like capturing a reference to some data in a boxed trait object could be sufficient, so you might be able to replace the whole Task type by something like Box<dyn FnOnce() + Send + 'lifetime>.

just prescribe a single dynamic function type such as a boxed trait object Box<dyn Fn…>

I considered doing that, but I wanted the tasks to be PartialEq + Debug and IIRC that doesn't work with dyn Fn.... Correct me if I'm wrong / it's possible to implement them myself (rather than #[derive(..)])!

It’s not 100% clear to me what the role of Data is in your example code, and how it relates to Function , since you prescribe a Fn(&i32) function signature which doesn’t really relate to the type Data at all?

I wanted the task to be generic over its input data, so in a few places I've got Function: Fn(Data) + .... I used Fn(&i32) in this example as I've got two different impl-blocks with different types for Data, it's not implemented for all possible types of Data.

Given the description in your post above, it does however sound like capturing a reference to some data in a boxed trait object could be sufficient, so you might be able to replace the whole Task type by something like Box<dyn FnOnce() + Send + 'lifetime> .

That sounds interesting! Am I interpreting you correctly that you mean I could use closures that capture references to the input Data as my tasks? I.e.

let x = 1;

let task = Box::new(|| {
    println!("doing work on reference {}", &x");
});

I'll give that a go, thanks for the suggestion!

Yes, that's the idea.

I've now tried to implement it like you suggested, and I'm running into some lifetime issues.

Updated code:

use std::thread;

struct Task<'a> {
    uid: u64, // Used to implement Eq and Debug
    function: Box<dyn FnMut() + Send + 'a>,
}

struct Context<'a> {
    tasks: Vec<Task<'a>>,
}

impl<'scope, 'env: 'scope> Context<'scope> {
    fn run(&'env mut self, mut a: Vec<i32>) {
        let x = 0;

        thread::scope(|s| {
            for n in a.iter_mut() {
                let task = Task {
                    uid: 0,
                    function: Box::new(move || {
                        dbg!(&n);
                    }),
                };
                self.tasks.push(task);
            }

            s.spawn(|| {
                println!("hello from the first scoped thread");
                dbg!(&x);
            });
        });
    }
}

fn main() {
    let mut context = Context { tasks: vec![] };
    let a = vec![1, 2, 3];
    context.run(a);
}

(Playground link)

I'm getting two lifetime errors:

error[E0597]: `a` does not live long enough
  --> src/main.rs:17:22
   |
12 | impl<'scope, 'env: 'scope> Context<'scope> {
   |      ------ lifetime `'scope` defined here
...
16 |         thread::scope(|s| {
   |                       --- value captured here
17 |             for n in a.iter_mut() {
   |                      ^ borrowed value does not live long enough
...
24 |                 self.tasks.push(task);
   |                 --------------------- argument requires that `a` is borrowed for `'scope`
...
32 |     }
   |     - `a` dropped here while still borrowed

error[E0597]: `context` does not live long enough
  --> src/main.rs:38:5
   |
38 |     context.run(a);
   |     ^^^^^^^^^^^^^^ borrowed value does not live long enough
39 | }
   | -
   | |
   | `context` dropped here while still borrowed
   | borrow might be used here, when `context` is dropped and runs the destructor for type `Context<'_>`

For more information about this error, try `rustc --explain E0597`.

What I don't understand is why it complains that a will be dropped after run executes. I thought thread::scope makes sure that all the threads have terminated before returning, which means that a won't be dropped until all the threads containing references to it are closed.

I've tried some variants where run takes references or takes ownership of the parameters, but all that does is shift the problem to the caller/callee. So I'm imagining that there's some underlying issue here I'm not understanding.

Am I missing some lifetime annotations to inform the compiler that this is okay?

You can read 'env: 'scope as "'env outlives 'scope". But also, 'scope is a lifetime within stuff being borrowed with lifetime 'env, so 'scope must outlive 'env. Thus the two lifetime parameters become equivalent, and invoking run is borrowing Context for the rest of its existence, which is almost entirely useless and causes weird errors.

But the deeper problem here is that if you want a task dyn FnMut() + Send + 'a to borrow something, then it must borrow something that outlives the lifetime of its validity 'a. The reason this isn't currently satisfied is that the data is borrowed from a: Vec<i32>, which is a local variable inside of run() and therefore is guaranteed to be invalidated before 'a ends. To illustrate what would be valid, your program compiles if I change the above section to take input data borrowed for 'a:

impl<'a> Context<'a> {
    fn run(&mut self, a: &'a mut [i32]) {

and change main() to

fn main() {
    let mut a = vec![1, 2, 3];
    let mut context = Context { tasks: vec![] };
    context.run(&mut a);
}

Now, the data is guaranteed to outlive the 'a lifetime of the Context, so the tasks which borrow that data are allowed to be put in the tasks vector of the Context which requires that they are valid for 'a.

But you may notice I swapped the let mut a = and let mut context = lines in main(). If you want a scheduler that allows individual tasks to borrow data that is shorter-lived than the scheduler, then that will require unsafe code to promise to the borrow checker that you're ensuring the tasks complete before returning and letting the borrow end — it can't reason about that on its own. (Scoped threads do this internally, and so does rayon with a thread pool that outlives the individual scopes.)

1 Like

I see, thank you for clarifying! This was very helpful.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.