Enforcing that only Copy types are captured by a closure

This is a bit outside of regular Rust usage, so I understand if some of these things cannot be expressed using Rust's type system.

I'm working on a "process" spawning library that will be used by Rust applications that compile to WebAssembly and run on a special WASM runtime. This runtime allows you to create "processes" with separate memory. I would like to allow spawning new "processes" just by passing a closure, but I currently run into problems when a closure captures a non-Copy type.

To better illustrate what I'm doing, here is an example showing how code looks using my library:

fn main() {
    let m = vec![42];
    println!("Hello world {:?}", m);

    spawn(|| {
        println!("Hello world {:?}", m);
    });
}

The signature of spawn looks like this:

pub fn spawn<F>(f: F) where F: FnOnce() + Copy

I want this code to fail when compiling, because it's not valid in my runtime. Here is the output of it:

Hello world [42]
Hello world []

The other "process" prints out an empty array, because it has a completely different memory space than the main "process".

If the closure captures the vector m it should not be Copy, but why does it still compile?

Eventually I would like to add a send() function accepting Copy types that is able to move data to other processes. Capturing Copy types would be ok in this case.

Also, am I overlooking something else here that may cause me problems later down the road?

It's not capturing m by value, it's capturing a reference to it. &Vec<i32> is Copy, so the code compiles.
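
A minimal sketch of that point, assuming a hypothetical `requires_copy` helper (not part of the library in question) with the same kind of `FnOnce + Copy` bound as `spawn`:

```rust
// Hypothetical helper with the same shape of bound as `spawn`:
// it accepts any closure that is both callable once and `Copy`.
fn requires_copy<F>(f: F) -> usize
where
    F: FnOnce() -> usize + Copy,
{
    f()
}

fn len_via_borrowing_closure() -> usize {
    let m = vec![42];
    // `m.len()` only needs `&m`, so the closure captures a `&Vec<i32>`.
    // A shared reference is `Copy`, hence the closure is `Copy` too.
    requires_copy(|| m.len())
}

// By contrast, forcing a by-value capture makes the closure own the
// `Vec<i32>`, which is not `Copy`, so this would be rejected:
//
//     let m = vec![42];
//     requires_copy(move || m.len()); // error: closure is not `Copy`
```

So the `Copy` bound by itself only rules out closures that own non-Copy data; it says nothing about what a captured reference points to.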

Is there a way I could forbid reference types from being captured?

I do not believe so. Requiring 'static would block your example above, but not pointers to static memory.

Thanks! I assume the only way to work around this would be to create another trait (ProcessSend) and just implement it for all the types I know are safe to be captured.

But can I express in the type system that the closure implements this trait only if all captured values implement it?

You can use auto traits, but defining your own requires a nightly-only feature: https://doc.rust-lang.org/reference/special-types-and-traits.html#auto-traits.

Another potential issue is that your F itself could be a reference, since there's a blanket impl:

impl<'_, A, F> FnOnce<A> for &'_ F
where
    F: Fn<A> + ?Sized, 
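
A small sketch of how that blanket impl could slip past a `Copy` bound; `needs_copy` is a hypothetical stand-in, not the actual `spawn`:

```rust
// Hypothetical stand-in with a `FnOnce + Copy` bound and no `'static`.
fn needs_copy<F>(f: F) -> usize
where
    F: FnOnce() -> usize + Copy,
{
    f()
}

fn call_through_reference() -> usize {
    let data = vec![1, 2, 3];
    // This closure owns `data`, so the closure itself is NOT `Copy`.
    let owning = move || data.len();
    // `&owning` is a shared reference, which is `Copy`, and it implements
    // `FnOnce` through the blanket impl because `owning` implements `Fn`.
    needs_copy(&owning)
}
```

Adding a `'static` bound would reject this particular call, because `&owning` borrows a local variable.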

I'm doubtful this can ever be made sound. It depends on what "separate memory" really means. How does this work with statics? They don't need to be captured to be used in a closure, and they can hold references to the heap via Box::leak...

I also don't think I can make this sound, but for now I'm just aiming at covering the most basic use cases.

Static memory should be fine, as each process initialises the static part. But references through static memory to the heap would of course break stuff.

For now I have decided to go with the 'static route as @sfackler suggested:

pub fn spawn<F>(f: F)
where
    F: FnOnce() + Copy + 'static

It works fine for most of my examples.
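
For reference, a rough sketch of what this bound accepts and rejects. `spawn_in_place` is a hypothetical stand-in that simply calls the closure in the current process, and it returns a value only so the result can be inspected:

```rust
// Hypothetical stand-in: same bounds as the signature above, but it
// runs the closure in place instead of in a separate "process".
fn spawn_in_place<F>(f: F) -> i32
where
    F: FnOnce() -> i32 + Copy + 'static,
{
    f()
}

static ANSWER: i32 = 42;

fn static_bound_demo() -> (i32, i32) {
    let n = 7;
    // Accepted: `n` is an `i32`, captured by value thanks to `move`.
    let a = spawn_in_place(move || n + 1);

    // Also accepted: a reference into static memory is `Copy + 'static`,
    // which is the gap mentioned earlier in the thread.
    let r: &'static i32 = &ANSWER;
    let b = spawn_in_place(move || *r);

    (a, b)
}

// Rejected, because the closure borrows a local and is not `'static`:
//
//     let m = vec![42];
//     spawn_in_place(|| m.len() as i32);
```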

Another approach I need to investigate would be to use auto traits and negative impls. The signature would look like this:

pub fn spawn<F>(f: F)
where
    F: FnOnce() + ProcessSend

and ProcessSend is defined for built-in types as:

pub unsafe auto trait ProcessSend {}

impl<F> !ProcessSend for &F where F: FnOnce() {}

impl !ProcessSend for &i32 {}
impl !ProcessSend for &mut i32 {}
impl !ProcessSend for *const i32 {}
impl !ProcessSend for *mut i32 {}
....

The few tests I ran worked as I expected, but this would require nightly, and I'm not completely sure what all the benefits/tradeoffs of both solutions are.

Thanks everyone for the help and suggestions!

What about doing something like this?

pub fn spawn<C>(context: C, func: fn(C))
where
  C: serde::Serialize + serde::de::DeserializeOwned;

Then use something like JSON or MessagePack to marshal that context object between processes. Note that I'm using a plain function, not a closure.

The idea is that you are currently using Copy and memcpy as a proxy for being able to safely pass data from one process to another; this uses serde instead.
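
To make the shape of that concrete without pulling in serde, here is a std-only sketch. The `Marshal` trait and `spawn_with_context` are hypothetical stand-ins for `Serialize`/`DeserializeOwned` and the real `spawn`, and the "other process" is simulated by rebuilding the context from bytes in place:

```rust
// Hypothetical stand-in for serde's Serialize + DeserializeOwned.
trait Marshal: Sized {
    fn to_bytes(&self) -> Vec<u8>;
    fn from_bytes(bytes: &[u8]) -> Option<Self>;
}

impl Marshal for Vec<i32> {
    fn to_bytes(&self) -> Vec<u8> {
        // Encode each element as 4 little-endian bytes.
        self.iter().flat_map(|v| v.to_le_bytes()).collect()
    }

    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() % 4 != 0 {
            return None; // truncated or corrupted input
        }
        Some(
            bytes
                .chunks_exact(4)
                .map(|c| i32::from_le_bytes([c[0], c[1], c[2], c[3]]))
                .collect(),
        )
    }
}

// Hypothetical stand-in for `spawn`: only the bytes would cross the
// process boundary, and `func` is a plain fn pointer, so it captures nothing.
fn spawn_with_context<C: Marshal, R>(context: C, func: fn(C) -> R) -> Option<R> {
    let wire = context.to_bytes();
    drop(context); // the original value never reaches the callee
    let rebuilt = C::from_bytes(&wire)?;
    Some(func(rebuilt))
}

fn sum(values: Vec<i32>) -> i32 {
    values.iter().sum()
}
```

Calling `spawn_with_context(vec![40, 2], sum)` rebuilds the vector from its byte encoding before `sum` sees it, so no pointer into the caller's heap is ever handed over.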

Unfortunately, even non-capturing fns can use statics, and statics can reference (leaked) heap memory, so even that's no guarantee. Interesting idea, though -- I can imagine it working in a slightly different scenario.

Serialization and deserialization, as suggested by @Michael-F-Bryan, seems to be the only good generic solution for this problem. In all other cases, the approach is unsafe and the caller is responsible for not messing up. That's simply the nature of trying to send raw data to another process. It's basically FFI+.

Eventually I will need serialisation as I'm moving more into the direction of distributing processes over the network. For now I was just focusing on user ergonomics and closures seemed like a good way forward.

Ideally, I would like to serialize the whole closure, but I can't really figure out how this library works; my macro knowledge is still a bit lacking.

Also, I'm starting out here with Rust compiled to wasm, but the runtime I'm writing exposes a few low-level primitives and internally works just with opaque buffers. It should support any language that compiles to wasm. Not all of them will be able to expose a completely safe way of working with the low-level primitives, and it's ok if there are a few edge cases.

I think most people would be okay with a big warning in the docs saying that trying to "share" a static variable between the caller and the callee process will lead to unexpected results. People are used to libraries/frameworks placing extra restrictions on what they can and can't do, and this static variable issue is fairly common when you do multiprocessing.

I wouldn't go as far as making the function unsafe though, because safe code can't break memory safety if a static has a different (but still valid for that type) value than it expected.

EDIT: Ignore my comment on unsafe, @trentj pointed out an easy counter-example that would break memory safety.

No, this situation is a little different from the usual, because we're making (as far as I understand) a modified copy of the process's memory space. The normal problems with statics are that you can't rely on a global ordering, and they're not shared between processes. But in this case we're not starting a fresh process with a known-good static memory section, we're (doing the equivalent of) copying over the static and code sections while leaving the heap and stack empty. In normal (hosted) program operation the OS is responsible for putting the static section in a known-good state before calling main (well, it's more complicated than that, but this model works to serve the point). By the point of calling spawn, we're already in main, so that initialization step has run, but then we wipe out the stack and heap, so the process memory space is in a kind of weird quasi-initialized state. This is not analogous to anything else I'm aware of.

But in this case you can make the static have an invalid value by making it point to something on the heap. Here's one way to do that:

use parking_lot::{const_mutex, Mutex}; // because std::sync::Mutex allocates

static THING: Mutex<&'static i32> = const_mutex(&10);

fn spawn(_f: fn()) { /* magic here */ }

fn main() {
    *THING.lock() = Box::leak(Box::new(20));
    spawn(|| println!("{}", *THING.lock())); // oops, dereferenced a dangling pointer
}

The idea of partially copying the process's memory space, retaining static memory but not stack or heap, seems fundamentally unsafe to me.

One way to look at this: IIUC, a conformant POSIX implementation of spawn would be to

  1. fork(2)
  2. spawn the thread
  3. stuff its id in a box
  4. panic!
  5. catch the panic in main()
  6. join on the thread id
  7. finally exit.

Think of this perhaps as a giant trampoline. The question then becomes, how much of the unwind/catch can be optimized out? (In destructors or statically?)

There are two obvious directions to go with "spawn the thread":

  • a proper thread, in which case your closure requirement is Send
  • or a coroutine, in which case the requirement is Pin and the closure is instead an async{} block. (And "join on the thread id" becomes "execute the actual code", perhaps with an assert that the poll returns on first try).

In the former case, because the main thread deciding to randomly unwind is legal, I suspect this would enforce that none of the captured data is on the stack, reducing us to "what parts of the heap can we get away with not copying". (In turn suggesting custom allocator hooks to track which pages actually stay used.)
In the latter case (spawn stashes future.boxed() in a StaticCell before diverging), I believe similar logic will apply.

I know the wasi principles doc mentions fork as a non-goal, in part due to the difficulties of implementing copy-on-write memory. (And indeed I would not force rolling your own with libsigsegv and mmap on anyone.) But it might be possible to get relatively far here with crash on write, esp if the API has an out-of-band way to say register Rc etc as "gc roots", whose pages are preemptively copied as they'll be mutated during the unwind.

Leaving the open question: does unwinding even work on wasi, or are all possible configurations currently panic=abort?
