Split a Rust closure into "plain function" + "struct"?

Consider a Rust closure of the form:

  let c = move || -> SomeReturnType { ... }

Conceptually, we can think of this closure as a
pub struct State { ... stuff moved ... } + fn plain_function(s: State) -> SomeReturnType.

My question is: can we access this struct State and fn plain_function in Rust code?
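To make the conceptual split concrete, here is a minimal hand-written sketch. The names State, plain_function, closure_version and manual_version are illustrative only; this is what a human would write to mirror the closure, not something the compiler exposes.

```rust
// Hand-written equivalent of `move || -> usize { name.len() + extra }`.
// `State` and `plain_function` are illustrative names, not compiler output.
pub struct State {
    name: String,
    extra: usize,
}

// The "plain function" half: it receives the moved state explicitly.
fn plain_function(s: State) -> usize {
    s.name.len() + s.extra
}

// The closure form: `name` and `extra` are moved into an anonymous struct.
fn closure_version(name: String, extra: usize) -> usize {
    let c = move || -> usize { name.len() + extra };
    c()
}

// The manual form: the same state, but with a nameable type and fields.
fn manual_version(name: String, extra: usize) -> usize {
    let s = State { name, extra };
    plain_function(s)
}
```

Both versions compute the same value; the difference is that only the manual one has a struct whose fields can be named, derived on, or serialized.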

======

The motivation behind this is as follows: I want to be able to send tasks from a wasm main thread to a wasm web worker thread. To do so, we end up needing to use Worker.postMessage() (see the MDN Web APIs docs), which basically means we can only send things that we can serialize as a JsValue (from wasm_bindgen), and as far as I know, we can't do that with plain Rust closures.

Now, if we could decompose the above Closure into a struct State and fn plain_function, we can play the following trick:

  1. Register a list of fn plain_function's (the main thread and the web worker run the identical wasm, so they can both talk about 'the function at index 23').

  2. If all the fields of struct State implement some to_jsvalue, use some procedural macro magic to make the entire struct State implement to_jsvalue.

====

Back to the original question: from a Rust program or procedural macro, is there any way to access the "compiler internal" decomposition of a closure into a 'struct State' + 'fn plain_function(...)'?

If you have the type of the closure, say F:

fn plain_function<F : FnOnce() -> R, R> (f: F)
  -> R
{
    f()
}

plain_function::<F, _> // : fn(F) -> R

If you don't, and just have a value, you can add a helper on top of it that infers F from a reference:

fn plain_function_of_val<F : FnOnce() -> R, R> (_: &F)
  -> fn(F) -> R
{
    plain_function::<F, _>
}

fn example ()
{
    let s = String::from("…");
    let f = || drop(s);
    let plain = plain_function_of_val(&f);
 // plain function part
 // vvvvv
    plain(f);
//        ^
//        struct part
}

I don't believe there's a way to apply a proc macro to the implicit structure definition, but you can¹ define your own structure that implements Fn{,Mut,Once} directly, and give it whatever other trait implementations you need.

¹ With appropriate nightly features

You can get a pointer to it and pass it around by value, but there's no way to access the closed-over fields.
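As a small illustration of that point (a sketch only; call_by_value and inspect_closure are made-up names), a closure is an ordinary value that holds its captures and can be moved into generic code, yet nothing lets you name the captured field:

```rust
use std::mem::{size_of, size_of_val};

// Generic code can take any closure by value without naming its type.
fn call_by_value<F: FnOnce() -> R, R>(f: F) -> R {
    f()
}

// Returns (size of the closure value, size of a String, result of calling it).
fn inspect_closure() -> (usize, usize, usize) {
    let s = String::from("captured");
    let c = move || s.len(); // `s` now lives inside the closure value
    let closure_size = size_of_val(&c);
    // There is no `c.s`: the captured field has no nameable path.
    let len = call_by_value(c);
    (closure_size, size_of::<String>(), len)
}
```

The closure value has to be at least as large as the String it owns, which is about all that can be observed about its contents from the outside.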

Functions/closures are pretty much impossible to serialize or pass to other processes, though. Even something like JSON.stringify(() => {}) will return undefined instead of something useful.

Well, doing it in a universally deserializable manner (e.g. one that works between the JVM and C as easily as between JVM languages) seems impossible.

But one sane representation for fns could be the concrete (or even abstract) syntax tree. To account for differences between JS engines, it could be done in e.g. the WASM text format, using S-expressions.

This approach wouldn't work for Rust since it's AOT compiled¹, but to me it looks feasible for JITted languages.

¹ Deserialization wouldn't be enough, it would require actual compiling. This is much less of a problem for JITted languages than for AOT ones.

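Python's pickle module is a useful comparison: it serializes a snapshot of all objects referenced by whatever is being serialized. Functions and classes, however, are pickled by reference (module and qualified name) rather than by their code, so the receiving side must already have the same definitions loaded.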

That said, I would probably try to avoid serializing functions altogether. Passing around plain old data messages is a lot more reliable, transfers less data, and requires orders of magnitude less magic than serializing objects/functions.

It should be easy enough to pull message types into a shared crate and wire up some basic routing. Maybe throw in a custom proc-macro or some code generation if you find there's a lot of annoying boilerplate.


This is what I do in practice as well, precisely because it's a lot more reliable. I even go so far as to avoid RPC calls for that reason (depending on how they're implemented), favoring dumb data messages instead.

My answer above was more of a devil's advocate POV :wink:

My initial post was not done very well. Let me try this again.

What I am currently working on:

This is not 100% working yet, still in progress, so some statements here may not be accurate.

#[derive(My_Jscode_Inner)]
pub struct Xos_Job_Add {
    a: u32,
    b: u32,
}

impl Xos_Eval_Return_T for u32 {}

impl Xos_Remote_Eval_T for Xos_Job_Add {
    type Output = u32;

    fn do_work(self, context: &mut dyn Xos_Eval_Context_T) -> Result<Self::Output, String> {
        Ok(self.a + self.b)
    }
}

I am working on something where a Main Thread can execute the following code, and have the addition be computed in a WebWorker.


let x = web_worker_manager.eval(Xos_Job_Add { a: 2, b: 3 }).await;
assert_eq!(x, Ok(5));

The way this works is:

  1. My_Jscode_Inner is a trait for reading from / writing to JS. There is also a procedural macro for implementing that trait.

  2. So now, we can send a Xos_Job_Add as a tagged js_sys::ArrayBuffer. The tag is a u64, representing the type's position in the sorted list of TypeIds.

  3. At this point, Main Thread can send a (tag = u64, data = js_sys::ArrayBuffer) to the webworker.

  4. The webworker then basically does a list_of_funcs[tag](data) call, which returns a Xos_Job_Add::Output encoded as an ArrayBuffer, and sends it back.
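The steps above can be sketched without any wasm dependencies. This is only a rough model under stated assumptions: plain byte vectors stand in for js_sys::ArrayBuffer, and Job, JobAdd, JobDouble, Registry, entry and handler are hypothetical names, not the actual Xos_* code.

```rust
use std::any::TypeId;

// Hypothetical stand-in for the job trait; bytes replace ArrayBuffer.
trait Job: Sized + 'static {
    fn decode(data: &[u8]) -> Option<Self>;
    fn encode(&self) -> Vec<u8>;
    fn do_work(self) -> Vec<u8>; // returns the encoded output
}

fn read_u32(data: &[u8], at: usize) -> Option<u32> {
    let b = data.get(at..at + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

struct JobAdd { a: u32, b: u32 }
struct JobDouble { x: u32 }

impl Job for JobAdd {
    fn decode(data: &[u8]) -> Option<Self> {
        Some(JobAdd { a: read_u32(data, 0)?, b: read_u32(data, 4)? })
    }
    fn encode(&self) -> Vec<u8> {
        let mut out = self.a.to_le_bytes().to_vec();
        out.extend_from_slice(&self.b.to_le_bytes());
        out
    }
    fn do_work(self) -> Vec<u8> { (self.a + self.b).to_le_bytes().to_vec() }
}

impl Job for JobDouble {
    fn decode(data: &[u8]) -> Option<Self> { Some(JobDouble { x: read_u32(data, 0)? }) }
    fn encode(&self) -> Vec<u8> { self.x.to_le_bytes().to_vec() }
    fn do_work(self) -> Vec<u8> { (self.x * 2).to_le_bytes().to_vec() }
}

// One plain function pointer per job type: decode the state, then run it.
type Handler = fn(&[u8]) -> Option<Vec<u8>>;

fn handler<J: Job>(data: &[u8]) -> Option<Vec<u8>> {
    J::decode(data).map(J::do_work)
}

fn entry<J: Job>() -> (TypeId, Handler) {
    (TypeId::of::<J>(), handler::<J> as Handler)
}

struct Registry { entries: Vec<(TypeId, Handler)> }

impl Registry {
    // Sorting by TypeId gives both sides the same order when they run the same binary.
    fn new(mut entries: Vec<(TypeId, Handler)>) -> Self {
        entries.sort_by_key(|(id, _)| *id);
        Registry { entries }
    }
    // Sender side: the tag is the index of the job's TypeId in the sorted list.
    fn tag_of<J: Job>(&self) -> Option<u64> {
        let wanted = TypeId::of::<J>();
        self.entries.iter().position(|(id, _)| *id == wanted).map(|i| i as u64)
    }
    // Receiver side: the `list_of_funcs[tag](data)` call.
    fn dispatch(&self, tag: u64, data: &[u8]) -> Option<Vec<u8>> {
        let (_, f) = self.entries.get(tag as usize)?;
        f(data)
    }
}
```

In this model both threads would build the same Registry at startup; the sender uses tag_of plus encode, and the worker calls dispatch on whatever (tag, data) pair arrives.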

What I don't like about this API

For every Job that I would like to send from MainThread to WebWorker, I have to define a Struct, then write the 'evaluation function' as a trait fn of the Struct. I would much rather write something like:

let x = web_worker_manager.send(move || -> u64 { a + b }).await;
// note here the lack of declaring struct Xos_Job_Add

Going back to original question.

This is what motivates the original question. Instead of being able to just 'send a closure', I have to define a new Struct, where I manually define the 'State' of the closure. Then I write the function as the fn do_work... implementing the trait.

I am trying to figure out if there is some procedural macro black magic that can be pulled here where I do not have to manually define a Struct for each closure.
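One partial option, sketched below under clear assumptions: macros run on tokens before type checking and capture analysis, so a macro cannot discover which variables a closure captures or what their types are. If the captures are listed explicitly with their types, though, a macro can generate the struct and the impl. The job! macro, the RemoteEval trait and run_locally are hypothetical names made up for this sketch; RemoteEval merely stands in for something like Xos_Remote_Eval_T.

```rust
// Hypothetical trait standing in for the thread's remote-eval trait.
trait RemoteEval {
    type Output;
    fn do_work(self) -> Self::Output;
}

// Hypothetical macro: expands a closure-like form with explicitly typed
// captures into a block-local struct plus a trait impl, and evaluates to
// a value of that struct built from the variables of the same names.
macro_rules! job {
    (move [$($field:ident : $ty:ty),* $(,)?] -> $out:ty $body:block) => {{
        struct Job { $($field: $ty),* }
        impl RemoteEval for Job {
            type Output = $out;
            fn do_work(self) -> $out {
                // Rebind the fields under their original names for the body.
                let Job { $($field),* } = self;
                $body
            }
        }
        Job { $($field),* }
    }};
}

// Stand-in for "send to the worker": here the job is simply run in place.
fn run_locally<J: RemoteEval>(job: J) -> J::Output {
    job.do_work()
}

fn add_demo(a: u64, b: u64) -> u64 {
    run_locally(job!(move [a: u64, b: u64] -> u64 { a + b }))
}

fn greet_demo(name: String) -> String {
    run_locally(job!(move [name: String] -> String { format!("hello, {}", name) }))
}
```

Each expansion defines its own distinct struct type, so a TypeId-based tag scheme would still see one type per job. A real version would also need the serialization derive on the generated struct, which this sketch leaves out.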

By the way, in case it was not clear, I am heavily abusing the fact that the Main Thread and the WebWorker run identical wasm, so both generate the same list of TypeIds, and we can get the right function to call from the tag/index derived from the TypeId.
