Ssa / register allocation / graph coloring w/o cranelift / LLVM

Is there a crate for doing SSA -> registers via graph coloring without using cranelift / LLVM?

Context: I'm playing with x86_64 JIT-ing, do not want to pull in cranelift / LLVM, and am looking for a lightweight library for doing SSA -> registers.

It was non-trivial for cranelift, so why not reuse their work?

Register allocation depends on control flow graph and instructions, so it's probably not easy to just extract the register allocation algorithm (other than treating it as a generic SAT problem, I guess).

I'm refactoring wasmtime/crates/lightbeam at main · bytecodealliance/wasmtime · GitHub , dropped the dependency on cranelift-codegen, and am looking for a very light weight crate for doing slightly more complicated register allocation

Cranelift's regalloc-rs has a linear scan register allocator as option. It is not very optimized though I think.

Lightbeam is meant to be a single pass compiler with a guaranteed upperbound on the time it takes to compile things. This is necessary for applications like wasm based blockchain where untrusted programs are compiled. You don't want someone to be able to cause a denial of service by exploiting quadratic behavior in the compiler. Being single pass is fundamentally incompatible with a smart register allocator. You have to look only at the register assignment at the current position and the current and maybe next instruction. Looking any more in the future is impossible.

The main reason we decided to use a streaming compiler at Parity is that the strict bounds on compilation time are necessary in the context of the blockchain - currently we use our own in-house WebAssembly interpreter.

1 Like

Linear Scan Register Allocator is very interesting. It sounds like a heuristic for graph coloring.

Common setup:

  1. We compute the lifetime/range of every variable.
  2. We run through the lifetime from start to end, adding it to the active set at the start of the range; removing it from the set at last use.
  3. What do we do when the active set > # of registers ?
    ====
  4. This appears to be where the heuristic for LSRA comes in: we look at everything in the active set, and spill the variable whose range ends latest.

Is that the gist ?

======

I'm actually willing to drop this requirement. (There's no way to infer this from my original question, but) I picked lightbeam not for streaming/blockchain reasons. I picked lightbeam because it looked like the simplest (in terms of lines of code + dependencies) to go from wasm -> x86_64.