Suppose we have wasm modules with a very low memory requirement, say 16 KB each. We could run 1M of these with only 16 GB.
Does either wasmer or wasmtime support switching between 1M wasm modules, with each running for 1 microsecond?
Thanks!
Wasm engines often place unmapped multi-gigabyte guard regions around each data region to reduce memory-checking overhead. This quickly exhausts the mappable address space on x86_64. If wasmer and wasmtime take this approach, then it's not feasible.
Is this related to the idea that "we don't have to do bounds checks for remote-execution shell code", because an out-of-bounds access hits one of these guard regions and triggers some OS handler?
Yes
I want to run some numbers. To the best of my current knowledge:
x86_64 only uses 48-bit pointers
wasm heap + buffer comes out to 4 GB = 2^32 bytes
2^48 / 2^32 = 2^16 = 65536
So, even if nothing else is running, we are limited to 65,536 wasm runtimes per x86_64 machine?
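A quick sketch of that arithmetic, in case it helps to vary the numbers. The function name `max_reservations` is made up for illustration, and the 48-bit and 4 GB figures are simply the ones assumed above. On many systems only part of that range is available to user space, so the real ceiling would likely be lower.

```rust
/// Number of fixed-size reservations that fit in a virtual address space
/// of 2^addr_bits bytes when each reservation takes 2^reservation_bits bytes.
/// (Hypothetical helper, not part of any wasm engine.)
fn max_reservations(addr_bits: u32, reservation_bits: u32) -> u64 {
    1u64 << (addr_bits - reservation_bits)
}

fn main() {
    // Figures from the post: 48-bit addresses, 4 GB (2^32 bytes) reserved per instance.
    println!("instances at 48 bits: {}", max_reservations(48, 32));
    // Same calculation if only 47 bits were usable by the process.
    println!("instances at 47 bits: {}", max_reservations(47, 32));
}
```

The point is that the limit comes from reserved address space, not from the 16 KB each module actually touches.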
You can customize the size of the guard region in both of them. Of course, smaller guard regions mean fewer bounds checks elided and worse performance.
Wasmtime can use either static memory, where 4GB + guard pages worth of address space is allocated for every module, or dynamic memory, where only the actually used part is allocated, but at the cost of bounds checks everywhere, with a significant slowdown (something like ~1.5-2x, I believe) in many cases. You can select dynamic memory using `config.static_memory_maximum_size(0)`.
The bounds checks are for indexing the linear memory as a whole. Most of the time it is not statically known that a pointer is within the first 64k of linear memory. It may be possible to coalesce a bounds check for `p` and for `p + 65535` with a 64k guard, but Cranelift currently doesn't support this, so I don't know how much it would save. Maybe ask at https://bytecodealliance.zulipchat.com/#narrow/stream/217117-cranelift?
I think I misunderstood bounds checks. Suppose we compiled this Rust code to wasm32:

    use std::rc::Rc;

    pub struct Foo {
        a: i32,
        b: i32,
    }

    pub fn main() {
        // Rc::new places the Foo on the heap, which lives in linear memory.
        let t = Rc::new(Foo { a: 1, b: 2 });
        let _a = t.a; // <-- this could result in a bounds check since it's indexing into the linear memory?
    }
All wasm loads and stores to linear memory are bounds checked. Both heap and stack go into linear memory except when rustc optimizes them into wasm locals.
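To make "bounds checked" concrete, here is a minimal sketch of what an explicitly checked load or store amounts to. `LinearMemory`, `load_i32` and `store_i32` are hypothetical names for illustration only; this is a simulation in plain Rust, not what Cranelift emits and not an API of wasmer or wasmtime. With guard regions the same out-of-range access would fault in hardware instead of going through a comparison like this.

```rust
/// Hypothetical stand-in for a wasm32 linear memory; not an engine API.
struct LinearMemory {
    bytes: Vec<u8>,
}

impl LinearMemory {
    fn new(size: usize) -> Self {
        LinearMemory { bytes: vec![0; size] }
    }

    /// Explicitly bounds-checked 4-byte little-endian load.
    /// Returns None where a real engine would trap.
    fn load_i32(&self, addr: u32) -> Option<i32> {
        let start = addr as usize;
        let end = start.checked_add(4)?;
        let slice = self.bytes.get(start..end)?; // the bounds check
        let mut buf = [0u8; 4];
        buf.copy_from_slice(slice);
        Some(i32::from_le_bytes(buf))
    }

    /// Explicitly bounds-checked 4-byte little-endian store.
    fn store_i32(&mut self, addr: u32, value: i32) -> Option<()> {
        let start = addr as usize;
        let end = start.checked_add(4)?;
        let slice = self.bytes.get_mut(start..end)?; // the bounds check
        slice.copy_from_slice(&value.to_le_bytes());
        Some(())
    }
}

fn main() {
    let mut mem = LinearMemory::new(65536); // one 64 KiB wasm page
    let p = 1024u32; // assumed address of a heap-allocated Foo { a, b }
    mem.store_i32(p, 7).unwrap(); // field a at offset 0
    mem.store_i32(p + 4, 9).unwrap(); // field b at offset 4
    println!("a = {:?}", mem.load_i32(p));
    // A 4-byte load starting 2 bytes before the end is out of bounds.
    println!("past the end = {:?}", mem.load_i32(65534));
}
```

Each field access goes through its own check here, which is the cost the guard-region approach tries to avoid.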