[PROPOSAL]: RLLVM (Rust implementation of LLVM) - aka Just use Cretonne

@gbutler69 Also, have you seen Inkwell (GitHub - TheDan64/inkwell: It's a New Kind of Wrapper for Exposing LLVM (Safely))? It's a native Rust wrapper for LLVM aiming for user-friendliness. I think improving the usability of LLVM from Rust is a better approach than rewriting it from scratch (though I do wish it were possible to write an LLVM backend in Rust, and I've toyed with the idea of making an attempt; the fast-moving nature of LLVM makes it a pretty daunting task).

1 Like

No, I hadn't come across Inkwell yet. Thanks for the link and the info.

It seems like the verdict (at least for now), based on the community response and the feedback from Cretonne core developer @sunfishcode, is that something like RLLVM has no useful place given the community inertia around LLVM and the progress and goals of Cretonne WRT Rust.

That being said, I welcome any differing ideas or opinions that might change that calculus.

Nice to hear you'll be joining Cretonne! I want to clarify a few things still:

Not very much, I think. LLVM's current position is not just due to being great technology, but also due to timing (being in the right place at the right time), politics (one of its chief distinctions from GCC was and is the license), luck (Apple throwing its weight behind it made a huge difference), and social factors (e.g., it came out of research and was explicitly promoted as a tool for researchers). Replicating LLVM's rise requires not just improving upon it as LLVM improved upon earlier projects 15 years ago, but also something akin to the other factors.

The examples of desirable changes I gave concern precisely the LLVM IR, not implementation details. Most others I can think of are also like this. The LLVM people are rather good at changing implementation details and migrating APIs, but touching the core of the IR is much more involved.

Note that "the parts after LLVM IR" don't just encompass machine specific optimizations. They include

  • instruction selection: how do you even map target-independent operations to machine instructions
  • register allocation
  • scheduling: in what order do you organize the instructions
  • legalization: how to replace types and operations the machine doesn't support at all with ones that it supports
  • etc.

All of these steps have to be done in some way to produce machine code at all, but doing them better or worse greatly impacts codegen quality. For some of these tasks there are relatively simple solutions – regalloc can be per-statement, scheduling can be inherited from the input program, etc. – and these give positively atrocious code (some more so than others: bad scheduling you may not even notice in many cases, but naive regalloc is incredibly bad).
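To make "naive regalloc is incredibly bad" concrete, here's a minimal sketch of per-statement allocation (the toy IR and all names below are invented for illustration, not taken from LLVM or Cretonne): every virtual register lives in a stack slot, and each instruction reloads its operands into fixed scratch registers and spills its result straight back.

```rust
// Toy per-statement "register allocation": every value lives in a stack slot,
// and each instruction is bracketed by reloads and a spill. Correct and trivial
// to implement, but the emitted code is atrocious, which is the point above.
// (The strings are pseudo-assembly, not meant to match any real target syntax.)

#[derive(Clone, Copy)]
struct VReg(u32);

enum Inst {
    Add { dst: VReg, lhs: VReg, rhs: VReg },
}

fn lower(insts: &[Inst]) -> Vec<String> {
    let slot = |v: VReg| format!("[sp + {}]", v.0 * 8); // one stack slot per vreg
    let mut out = Vec::new();
    for inst in insts {
        match inst {
            Inst::Add { dst, lhs, rhs } => {
                out.push(format!("load  r0, {}", slot(*lhs)));
                out.push(format!("load  r1, {}", slot(*rhs)));
                out.push("add   r0, r0, r1".to_string());
                out.push(format!("store r0, {}", slot(*dst)));
            }
        }
    }
    out
}

fn main() {
    // t2 = t0 + t1 turns into four instructions instead of one.
    let prog = [Inst::Add { dst: VReg(2), lhs: VReg(0), rhs: VReg(1) }];
    for line in lower(&prog) {
        println!("{line}");
    }
}
```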

But yes, there are also quite a few important optimizations in LLVM backends that you'd absolutely want to have in any production-quality compiler, and which would be very difficult to do on a target-independent representation.

There are alternative approaches that keep using one IR for longer in the process (Cretonne being an example), but LLVM IR is ill-suited for that. There are good reasons why LLVM IR is mapped to other IRs as soon as codegen starts.

My point was that more fine-grained parallelization (beyond separate modules, which already work fine) is not very high on the wish list. It would be nice, but, well, only nice and not more.

Two things: First, this prioritization seems contrary to getting other LLVM users and developers on board. Second, as I said, a lot of redundant work could be avoided by adopting what LLVM developers already know is "a better way" or "the future" but can't implement easily due to inertia (or are currently in the process of implementing). For example, instead of porting LLVM's SelectionDAG infrastructure and then scrapping it when GlobalISel is finally finished in upstream LLVM, it would be much better to directly adopt the GlobalISel design (provided one copies LLVM at all WRT instruction selection).

5 Likes

Interesting read! When you talk about VSDGs, how similar is that to sea-of-nodes IRs like V8 Turbofan or Hotspot C2? I see research on both of them citing the same papers and they sound basically the same but I'm not really familiar with the term.

1 Like

@hanna-kruppe - Thanks again for the detailed response. This is all extremely helpful to my understanding of the larger issues. I really appreciate you taking the time to respond like this.

A key distinction is control flow versus control dependence. Control flow means there's a CFG as part of the graph, which has the advantage of being easier to schedule into machine code at the end. Control dependence means converting control flow into a form that looks like data dependence, so everything is a dependence, which is fun and powerful; however, it's more work at the end. The VSDG is based on control dependence, whereas I've heard C2 uses control flow.

Cretonne isn't committed to the VSDG at this point. But the VSDG is a nice example here because it's an IR that takes some amount of work just to build and then lower, so you wouldn't want to build it in -O0 mode, but maybe -O2 will be doing enough work that it's worth it. And it just might be a good platform for superoptimization, because it tends to make control flow patterns tree-shaped. For example, where control flow represents an if-else as a diamond, control dependence represents it as a node that has three inputs: a predicate and two control dependencies. Loops can often be factored such that the cyclic part looks like a single node, making simple loops look like trees too.
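As a rough sketch of that difference in Rust-ish data structures (all types and names below are invented for illustration; they are not Cretonne's or the VSDG paper's actual definitions): the control-flow form keeps explicit blocks joined by branch edges, while the control-dependence form collapses the if-else into a single gamma/select-like node.

```rust
#![allow(dead_code)] // sketch only; nothing here is exercised

type ValueId = u32;
struct Inst; // stand-in for ordinary (non-control) instructions

// Control flow: an if-else is a "diamond" of basic blocks joined by branch edges.
struct Block {
    insts: Vec<Inst>,
    terminator: Terminator,
}

enum Terminator {
    CondBr { cond: ValueId, then_bb: usize, else_bb: usize },
    Jump { target: usize },
    Return(ValueId),
}

// Control dependence: the same if-else becomes one node with three inputs,
// a predicate and the two arms, whose side effects are treated like values.
enum Node {
    Gamma { predicate: ValueId, if_true: ValueId, if_false: ValueId },
    // arithmetic, memory state, etc. would be further node kinds
}

fn main() {}
```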

3 Likes

One thing that comes to mind here is that LLVM crashing is annoying, but far less scary than LLVM mis-optimizing code. And Rust's safety checks, unfortunately, don't actually help ensure that the LLVM IR or machine code being produced is safe or correct: once you have the BasicBlock graph manipulation support, it can do anything.
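To illustrate that with a deliberately silly, made-up example: the pass below is entirely safe Rust, type-checks, and never crashes, yet it miscompiles every conditional it touches because it clears the "negated" flag without swapping the branch targets. Safety and correctness are simply different properties.

```rust
// Invented toy IR; the bug below is memory-safe and never panics, just wrong.
#[derive(Debug)]
enum Terminator {
    // Branch on `cond` (a value id); `negated` flips its sense.
    CondBr { cond: u32, negated: bool, then_bb: usize, else_bb: usize },
}

// Intended: turn "branch on !c" into "branch on c" with swapped targets.
// Actual: it clears the flag but forgets the swap, so control flow is inverted.
fn buggy_canonicalize(t: &mut Terminator) {
    match t {
        Terminator::CondBr { negated, .. } => {
            if *negated {
                *negated = false;
                // BUG: should also swap `then_bb` and `else_bb` here
                // (e.g. with std::mem::swap), but we "forgot".
            }
        }
    }
}

fn main() {
    let mut t = Terminator::CondBr { cond: 0, negated: true, then_bb: 1, else_bb: 2 };
    buggy_canonicalize(&mut t);
    println!("{t:?}"); // now branches the wrong way, and nothing ever crashed
}
```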

3 Likes

C2 (at least at one point) and Turbofan also use control dependencies instead of a CFG, so I suppose we might have just read different papers on basically the same thing.

Edit: another example is libFirm, which actually cites both Click's sea-of-nodes stuff and Johnson et al.'s VSDG stuff.

C2 still has the same design. Graal, the projected replacement for C2 as the high-tier JIT, uses the same sea-of-nodes-with-dependencies approach.

Edit: Chris Seaton (Oracle Labs) had a nice blog post a few months ago talking about Graal.

1 Like

The fact that it tends to be used for (admittedly higher-tier) JIT compilers makes me wonder if it's actually not too expensive for quick builds. Either way, certainly looks promising. :slight_smile:

I believe libfirm uses control flow. Take a look at the "Conditional / Phi" example here. The predicate is consumed by a "Cond" node at the top, which produces two control edges, which are inputs to the two arms. In their 2011 paper, they cite the VSDG to say that it's a possible idea to explore.

I don't have personal experience with C2, but the "Region and Phi" example here also seems to show a condition node at the top producing control edges.

In a pure dependence graph, there is no node at the top of an if-else. Instead, there's a select-like node, where the side effects of both arms are represented as if they were simply two plain values that one could select between.

They're also using Graal for their new AOT compiler. But in this mode there's no PGO, so not many optimizations are done.

Ah, I see what you were saying. C2/libFirm/Turbofan all use both "if" nodes and "phi" nodes, so they still contain a typical control flow graph. It's just that a) its edges are expressed as dependencies rather than the usual basic-blocks-as-branch-targets, and b) the other nodes aren't contained directly in the CFG nodes, and are thus scheduled similarly to those in a VSDG.

On the other hand, a VSDG essentially uses only one node for this purpose, in the position of a sea-of-nodes "phi"? I wonder how different that actually is in practice.

I prefer to minimize opportunity cost by investigating something and getting the facts straight before I invest significant time in it. That being said, I agree that, by working on this, one could learn a lot regardless of the viability of the ultimate project. However, I think I'd prefer to spend my time on something that has a reasonable chance of success, not something that is almost guaranteed by industry inertia to fall flat.

Thanks for the input though!

True, but data races in the code base, for example, could result in the generation of incorrect output that would be hard to detect and not evident from the algorithms themselves.

I don't think LLVM parallelizes passes, so this point is moot from its standpoint. I think @scottmcm's overarching point was that Rust isn't a panacea against the type of bugs that LLVM deals with, and a Rust version would have them too.

I fully agree with @rpjohnst's posts, though: porting LLVM as-is to Rust, without sufficiently innovating anywhere, is doomed to failure. It doesn't bring enough to the table and doesn't solve the right problems to sway the existing LLVM community to just switch to Rust (assuming most of them even like Rust and hate C++, which I'm not sure about). It could be an interesting learning experience, but I'd leave it at that.

2 Likes

At the risk of pulling a Kanye, Cretonne is an awesome project, but it doesn't excite me as much as the various WASM JIT projects that try to execute WASM efficiently. This is an area that is ripe for development.

  1. Browsers will have to run WASM, so it's possible that a browser could adopt whichever project takes the lead here. Efficient execution of WASM is a project that needs to exist.
  2. WASM can be a lingua franca for analytic systems on a network. For example, taking a user string (=SUM(A1:A24)), a user program (filter.awk), or even a closure (with a language plugin), compiling it to WASM, and sending it over the wire to a worker would be a powerful feature for Rust programs. An embeddable WASM runtime is a project that should definitely exist (see the sketch after this list).
  3. If WASM takes off as we think it will, then it may well come to occupy the lingua franca position that LLVM IR enjoys today. Tools for generating and optimizing WASM will be important, even those that take WASM and AOT-compile it to native code. People may write their compilers to generate WASM with your/our/some WASM->Native compiler as the code generator. Hey Presto! RLLVM through the back door!
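As a sketch of what "embeddable" could look like on the receiving end (none of this is from the thread; it assumes the wasmtime and anyhow crates, and wasmtime post-dates this discussion, so treat it purely as an illustration): a worker takes WASM it received over the wire, instantiates it, and calls an exported function.

```rust
use wasmtime::{Engine, Instance, Module, Store};

fn main() -> anyhow::Result<()> {
    // Stand-in for a module received over the wire; inline WAT keeps the sketch
    // self-contained (wasmtime's default `wat` feature accepts text or binary).
    let wasm = r#"
        (module
          (func (export "add") (param i32 i32) (result i32)
            local.get 0
            local.get 1
            i32.add))
    "#;

    let engine = Engine::default();
    let module = Module::new(&engine, wasm)?;
    let mut store = Store::new(&engine, ());
    let instance = Instance::new(&mut store, &module, &[])?;

    // Look up the export with a typed signature and run the "job".
    let add = instance.get_typed_func::<(i32, i32), i32>(&mut store, "add")?;
    println!("2 + 3 = {}", add.call(&mut store, (2, 3))?);
    Ok(())
}
```

In the scenario above, whatever compiles the user string or filter.awk down to WASM bytes is the front-end; the worker only needs the runtime.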
4 Likes

Cretonne is specifically designed to be the new WASM/JS JIT back-end for Servo/Firefox implemented in Rust (or at least that is my understanding).

Not to belabor the point, but parallel passes could be one area of a new implementation where Rust's safety guarantees would (potentially) be a benefit. But, as multiple people have already elucidated in the comments here, there just isn't any good case to be made for an RLLVM, primarily due to industry inertia and the fact that it would be better to start with new/more modern ideas, which is exactly what Cretonne aims to do.

EDIT: I just realized (in my after-workout stupor) that I hadn't really fully read your comment and my response pretty much just restated what you said. Derp!

If that's the case, I can only suggest focusing on making the best WASM JIT back-end for Servo/Firefox and keeping an eye on making it embeddable. Replacing LLVM is years off at the earliest.