Share memory across heterogeneous WASM modules

fyl2xp1 · July 5, 2022, 7:24am

I successfully managed to get my multi-threaded code running on WebAssembly in the browser. The most helpful resource was the post "Multithreading Rust and Wasm" (Rust and WebAssembly) by @alexcrichton.

I understood that multiple instances of the same code can be run on the same memory because they all share the same memory layout and the initialization is aware of the fact that other instances might already be running.

I now have a different use case: I'd like to load different modules (binaries) that don't know each other into the same memory. I could imagine the following approach:

1. Load and initialize a memory manager module that serves as a global allocator for multiple threads.
2. Compile all modules that should be loaded at run time against this allocator.
3. The host allocates some memory, loads the application module into that region, and then initializes the instance.
4. All loaded instances use the global allocator so that they don't corrupt the memory of any other instance.
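
The shared allocator from the steps above can be sketched as a lock-free bump allocator whose only state is an atomic offset. This is a minimal sketch in plain Rust, with threads standing in for module instances; `SharedBump` and `demo_allocations` are hypothetical names, and a real setup would place this state at an agreed-upon address in the shared linear memory.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical allocator state shared by all instances.
struct SharedBump {
    next: AtomicUsize, // offset of the first free byte
    end: usize,        // one past the last usable offset
}

impl SharedBump {
    fn new(start: usize, end: usize) -> Self {
        SharedBump { next: AtomicUsize::new(start), end }
    }

    /// Reserve `size` bytes aligned to `align` (must be a power of two).
    /// Returns the offset of the block, or None if the region is exhausted.
    fn alloc(&self, size: usize, align: usize) -> Option<usize> {
        let mut cur = self.next.load(Ordering::Relaxed);
        loop {
            let start = cur.checked_add(align - 1)? & !(align - 1);
            let new_next = start.checked_add(size)?;
            if new_next > self.end {
                return None;
            }
            // Publish the new offset only if nobody else moved it meanwhile.
            match self.next.compare_exchange_weak(
                cur,
                new_next,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Some(start),
                Err(observed) => cur = observed,
            }
        }
    }
}

/// Four "instances" allocate four 64-byte blocks each from one region.
fn demo_allocations() -> Vec<usize> {
    let heap = Arc::new(SharedBump::new(1024, 1024 + 64 * 16));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let heap = Arc::clone(&heap);
            thread::spawn(move || {
                (0..4).filter_map(|_| heap.alloc(64, 16)).collect::<Vec<_>>()
            })
        })
        .collect();
    let mut offsets: Vec<usize> =
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect();
    offsets.sort();
    offsets
}

fn main() {
    let offsets = demo_allocations();
    println!("{} blocks, first at offset {}", offsets.len(), offsets[0]);
}
```

A bump allocator never frees memory, so it only illustrates the thread-safety aspect; a usable global allocator would need a free list or similar on top.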

It's comparable to having a minimal OS that hosts several applications. This WASM-OS could also provide other cross-cutting functions like logging, etc. … but I digress.

There would be the following memory regions:

- static variables of the memory allocator
- memory shared across multiple instances of the same module
- private memory for each instance
- global heap (which might already cover the private memory)
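
To make the region list concrete, here is a rough sketch of how a host might lay the regions out back to back, each rounded up to the 64 KiB WebAssembly page size. The `Region` type, the `layout` function, and all sizes are illustrative assumptions, not part of any real runtime.

```rust
/// WebAssembly linear memory grows in pages of 64 KiB.
const PAGE_SIZE: usize = 64 * 1024;

#[derive(Debug, Clone, PartialEq)]
struct Region {
    name: String,
    start: usize, // byte offset into the shared linear memory
    len: usize,   // length in bytes, a multiple of PAGE_SIZE
}

/// Round `n` up to the next multiple of the page size.
fn round_up_to_page(n: usize) -> usize {
    (n + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE
}

/// Place the requested regions one after another, each page-aligned.
fn layout(requests: &[(&str, usize)]) -> Vec<Region> {
    let mut cursor = 0;
    requests
        .iter()
        .map(|(name, size)| {
            let len = round_up_to_page(*size);
            let region = Region { name: name.to_string(), start: cursor, len };
            cursor += len;
            region
        })
        .collect()
}

fn main() {
    // Sizes are made up for the example.
    let regions = layout(&[
        ("allocator statics", 4 * 1024),
        ("module A shared data", 100 * 1024),
        ("instance A1 private", 64 * 1024),
        ("global heap", 1024 * 1024),
    ]);
    for r in &regions {
        println!("{:<22} start={:>8} len={:>8}", r.name, r.start, r.len);
    }
}
```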

Some problems I can imagine (overall complexity aside):

- Modules would be able to communicate via shared memory, but I'm uncertain whether they could call a function of another module. It might be necessary to route those calls through the host.
- I'm not sure whether a module can be loaded at an arbitrary location in a memory segment (without corrupting any pointers to static variables).
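
The first concern, routing cross-module calls through the host, could look roughly like the following. This is plain Rust standing in for the host; `Host`, `register`, `call`, and `demo_host` are hypothetical names, and in a real embedding the closures would be exports of instantiated modules.

```rust
use std::collections::HashMap;

/// Hypothetical host-side registry of exported functions,
/// keyed by "module.function".
struct Host {
    exports: HashMap<String, Box<dyn Fn(i32) -> i32>>,
}

impl Host {
    fn new() -> Self {
        Host { exports: HashMap::new() }
    }

    /// Record a function that a module makes available to others.
    fn register(&mut self, module: &str, func: &str, f: impl Fn(i32) -> i32 + 'static) {
        self.exports.insert(format!("{}.{}", module, func), Box::new(f));
    }

    /// What a module would reach through an imported host function:
    /// look up the target and forward the argument.
    fn call(&self, module: &str, func: &str, arg: i32) -> Option<i32> {
        self.exports
            .get(&format!("{}.{}", module, func))
            .map(|f| f(arg))
    }
}

/// Build a host with one stand-in module export.
fn demo_host() -> Host {
    let mut host = Host::new();
    host.register("math", "double", |x| x * 2);
    host
}

fn main() {
    let host = demo_host();
    println!("math.double(21) = {:?}", host.call("math", "double", 21));
    println!("math.missing(1) = {:?}", host.call("math", "missing", 1));
}
```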

Any idea if that is realistic or feasible? Please be lenient towards me as I'm still new to the inner mechanics of module initialization and I might not have the right terms at hand.

jjpe · July 5, 2022, 8:35am

Sounds like an uphill battle. IIRC WASM code runs with strong sandboxing guarantees, and running multiple mutually untrusted wasm modules in the same memory would undermine that.

In addition, wasm32 suffers from the same memory limitation as any 32-bit architecture: in total it can address a maximum of 4 GB of memory. This would be shared between all your modules as well as whatever the runtime itself needs to manage the WASM code. The latter likely won't be much, but it's worth mentioning.
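
For reference, the limit follows from the page arithmetic: a wasm32 memory has at most 65,536 pages of 64 KiB each. The small sketch below just does that arithmetic; `remaining_bytes` is a hypothetical helper, and a given runtime or browser may impose a lower cap.

```rust
const PAGE_SIZE: u64 = 64 * 1024;     // bytes per WebAssembly page
const MAX_PAGES_WASM32: u64 = 65_536; // 2^16 pages with 32-bit addressing

/// Bytes still available to other modules once `pages_in_use` pages are taken.
fn remaining_bytes(pages_in_use: u64) -> u64 {
    (MAX_PAGES_WASM32 - pages_in_use) * PAGE_SIZE
}

fn main() {
    let total = MAX_PAGES_WASM32 * PAGE_SIZE;
    println!("address space: {} bytes ({} GiB)", total, total >> 30);
    println!("free after 1024 pages: {} MiB", remaining_bytes(1024) >> 20);
}
```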

I'm not saying it can't be done but if it can, it in all likelihood won't be easy.

fyl2xp1 · July 8, 2022, 3:06pm

All modules will be loaded from the same trusted source but at the moment it's a big fat monolithic binary and I'd like to split it up. This will also reduce the overall memory usage, as you only have to provide the memory for the modules you actually use.

The allocator can actually be compiled into each module, as long as all of them share the same data structure in a thread-safe way. Is there a way to initialize a global allocator? Who's responsible for creating the data structure?
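
One common way to settle who creates the data structure is to let whichever instance arrives first do it, guarded by an atomic state flag. Below is a minimal sketch of that pattern, with threads standing in for instances; `AllocHeader`, `ensure_initialized`, and the state constants are hypothetical, and the header is assumed to sit at an agreed-upon location in shared memory that starts out zeroed.

```rust
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

const UNINIT: u32 = 0;
const INITIALIZING: u32 = 1;
const READY: u32 = 2;

/// Hypothetical header shared by all instances.
struct AllocHeader {
    state: AtomicU32,
    heap_start: AtomicUsize,
    init_count: AtomicUsize, // demo only: counts how often setup ran
}

/// Called by every instance at start-up; exactly one caller does the setup,
/// all others wait until the header is marked READY.
fn ensure_initialized(h: &AllocHeader, heap_start: usize) {
    match h.state.compare_exchange(
        UNINIT,
        INITIALIZING,
        Ordering::Acquire,
        Ordering::Acquire,
    ) {
        Ok(_) => {
            h.heap_start.store(heap_start, Ordering::Relaxed);
            h.init_count.fetch_add(1, Ordering::Relaxed);
            h.state.store(READY, Ordering::Release);
        }
        Err(_) => {
            while h.state.load(Ordering::Acquire) != READY {
                std::hint::spin_loop();
            }
        }
    }
}

/// Eight "instances" race to initialize the same header.
fn demo_init() -> (usize, usize) {
    let header = Arc::new(AllocHeader {
        state: AtomicU32::new(UNINIT),
        heap_start: AtomicUsize::new(0),
        init_count: AtomicUsize::new(0),
    });
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let h = Arc::clone(&header);
            thread::spawn(move || ensure_initialized(&h, 4096))
        })
        .collect();
    for t in handles {
        t.join().unwrap();
    }
    (
        header.init_count.load(Ordering::Relaxed),
        header.heap_start.load(Ordering::Relaxed),
    )
}

fn main() {
    let (runs, heap_start) = demo_init();
    println!("setup ran {} time(s), heap starts at {}", runs, heap_start);
}
```

Spinning is used here only to keep the sketch short; in a browser, busy-waiting on the main thread would be a poor fit.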

The other question is whether a compiled WASM module is relocatable (i.e. does not contain any absolute memory addresses).

I still need to find out whether the stack lives in the linear memory or has its own private space.

bjorn3 · July 8, 2022, 4:50pm

This is not the case most of the time. Unless you explicitly enable PIC with -C relocation-model=pic (or use the latest nightly for emscripten, as we flipped the default there), we currently use the static relocation model by default. Fully enabling PIC requires using e.g. -Zbuild-std to recompile the standard library with PIC enabled.

fyl2xp1 · July 8, 2022, 8:59pm

Thanks for the pointer! I had a look at "Codegen Options" in The rustc book, which states for pic: "This is the default model for majority of supported targets." That sounds like there's nothing special to do.

build-std is also a good hint. But according to the above statement, I would have expected the shipped version to have been compiled with PIC enabled as well.

bjorn3 · July 9, 2022, 10:23am

The wasm targets are among the few targets that don't use pic as their default relocation model.

github.com/rust-lang/rust/blob/c4693bc946729393c087fb120af566395915d19d/compiler/rustc_target/src/spec/wasm_base.rs#L104-L111

    // This has no effect in LLVM 8 or prior, but in LLVM 9 and later when
    // PIC code is implemented this has quite a drastic effect if it stays
    // at the default, `pic`. In an effort to keep wasm binaries as minimal
    // as possible we're defaulting to `static` for now, but the hope is
    // that eventually we can ship a `pic`-compatible standard library which
    // works with `static` as well (or works with some method of generating
    // non-relative calls and such later on).
    relocation_model: RelocModel::Static,
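
Why the relocation model matters for loading a module at an arbitrary location can be shown with a toy simulation. In this sketch a `Vec<u8>` plays the role of linear memory; `ModuleImage` and the helper functions are purely illustrative and do not reflect the real wasm binary format. With the static model the address of a static is fixed at link time, whereas PIC code computes it from a base supplied at load time.

```rust
/// A toy "module image": a data segment plus the position of one static in it.
struct ModuleImage {
    data: Vec<u8>,
    static_offset: usize, // offset of the static within `data`
    linked_base: usize,   // address the image was linked for (static model)
}

/// Static model: the absolute address was baked in at link time.
fn static_address(img: &ModuleImage) -> usize {
    img.linked_base + img.static_offset
}

/// PIC model: the address is derived from a base chosen at load time.
fn pic_address(img: &ModuleImage, memory_base: usize) -> usize {
    memory_base + img.static_offset
}

/// Copy the data segment into "linear memory" at `base`.
fn load(memory: &mut [u8], img: &ModuleImage, base: usize) {
    memory[base..base + img.data.len()].copy_from_slice(&img.data);
}

/// Load the image somewhere other than where it was linked and read the
/// static both ways. Returns (value via static address, value via PIC address).
fn demo_relocation() -> (u8, u8) {
    let img = ModuleImage {
        data: vec![0, 0, 0, 0, 42, 0, 0, 0],
        static_offset: 4,
        linked_base: 1024,
    };
    let mut memory = vec![0u8; 8192];
    let base = 4096; // not the address the image was linked for
    load(&mut memory, &img, base);
    (
        memory[static_address(&img)],
        memory[pic_address(&img, base)],
    )
}

fn main() {
    let (via_static, via_pic) = demo_relocation();
    println!("static model reads {}, PIC model reads {}", via_static, via_pic);
}
```

The static-model read lands in untouched memory because the baked-in address no longer matches where the data actually lives, which is the pointer corruption the original question worries about.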

system · October 7, 2022, 10:24am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
