Call for help implementing an independent Rust frontend for GCC

MS will probably/hopefully be implementing Rust at some point, since they are really pushing it for safety in their systems programming areas at the moment. They are even advertising a senior Rust job right now!

Yes, I have read some hints that MS is interested in Rust.

Historically I am no fan of MS. Quite the opposite.

More importantly though, I have long argued that the world should not make its systems dependent on a single vendor, which for most is in a foreign country and out of their control. For a long time, that vendor happened to be MS.

Here, the tables are turned a bit: it would be great to have MS supporting Rust. It's another option. It would attract the attention of many who would otherwise not take a look. It would motivate the evolution of some kind of standard among vendors.

In my opinion, MS will probably not create their own proprietary Rust compiler; they will share the same FOSS Rust compilers that the rest of us use.

From my (potentially flawed) perspective, Microsoft has a proprietary C++ compiler simply because they built one way back and have kept developing it, since that leaves fewer unhappy customers than abandoning it would. Clang (and GCC, to a lesser extent) has reached the point where most Windows C++ programs work just fine, so if Microsoft added C++/CX support to Clang or deprecated C++/CX entirely, they could probably switch all of their C++ code over to Clang and retire their proprietary C++ compiler.

4 Likes

Fair enough. So I did ask and here is the reply, which I quote:

GIMPLE is fairly stable but not 100% stable. The way the Go frontend
handles this is that the file that converts between the Go frontend IR
and GIMPLE is part of the GCC tree (gcc/go/go-gcc.cc). This file is
then routinely updated whenever there is a tree-wide GIMPLE change.
With that approach I very rarely have to change the GIMPLE generation
code myself.

So indeed stability isn't too much of a problem.
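
To illustrate the shape of that approach: the frontend defines its own stable IR, and a single adapter file owns the conversion to GIMPLE, so tree-wide GIMPLE churn only ever touches that one file. Below is a schematic, self-contained C sketch of the pattern; the fe_node type and the lower_to_backend function are hypothetical stand-ins, not GCC's actual API (the real adapter, like gcc/go/go-gcc.cc, would call GCC's internal tree/GIMPLE builders instead of printing).

/* Schematic sketch of the adapter pattern described above; all names
   here are hypothetical, not GCC's actual API. */
#include <stdio.h>

/* Frontend-owned IR: defined and kept stable by the frontend itself. */
typedef enum { FE_CONST_INT, FE_ADD } fe_kind;
typedef struct fe_node {
    fe_kind kind;
    long value;                /* used by FE_CONST_INT */
    struct fe_node *lhs, *rhs; /* used by FE_ADD */
} fe_node;

/* The one translation unit that knows about the backend. In a real
   frontend these branches would build GIMPLE/GENERIC trees; when GIMPLE
   changes tree-wide, only this file needs updating. */
static void lower_to_backend(const fe_node *n)
{
    switch (n->kind) {
    case FE_CONST_INT:
        printf("const %ld\n", n->value); /* stand-in for an int-cst tree */
        break;
    case FE_ADD:
        lower_to_backend(n->lhs);
        lower_to_backend(n->rhs);
        printf("add\n");                 /* stand-in for a PLUS expression */
        break;
    }
}

int main(void)
{
    fe_node two   = { FE_CONST_INT, 2, NULL, NULL };
    fe_node three = { FE_CONST_INT, 3, NULL, NULL };
    fe_node sum   = { FE_ADD, 0, &two, &three };
    lower_to_backend(&sum); /* frontend IR in, backend operations out */
    return 0;
}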

2 Likes

That's me indeed, but I've since given up on that because adapting rustc seems a lot less work than re-implementing the entire front-end. Either way, thanks for the tip.

Excellent, that's very encouraging.

Don't give up; think of a GIMPLE adapter as less work to do for a new frontend. Also, the extra exposure from having GCC users would increase the bug detection rate.

It would not surprise me if their first priority was an "IronRust": a CLI (Common Language Infrastructure) runtime translator, so as to be able to integrate cleanly into .NET.

I'm not sure there is a real issue with using LLVM for system software. At least one of the BSDs has switched to LLVM as its system compiler for example. It's perfectly practical. It's a bit challenging to compile the full Linux kernel under LLVM at the moment, but compatibility has gotten much better with time.

That said, if I recall correctly, RMS has said, on multiple occasions, that LLVM is a threat to free software. I have never quite understood this belief, since LLVM is free software, but I think it may be part of his motivation for wanting to avoid LLVM.

However, given the nature of Rust, and especially the fact that Rust continues to be under quite active development, I think that an independent Rust implementation might be quite a challenge to achieve.

When Stallman says "free software" he is referring to "Free Software". Note the capitalization.

That is to say, software released under a license which requires that the source always be made available when any derived binaries are distributed; that is, the GPL or a compatible license.

One can think of it in terms of the freedom of the actual source code itself, rather than the freedoms of the human author/publisher/distributor, or anyone else. Freedom for the source code to move around, reproduce, mutate, and cross-breed, with no restriction by any distributor or recipient.

That is a kind of weird and abstract thought, but I think it sums up Stallman's concept of Free Software.

I think he has a valid point.

One of the major goals of GCC is/was preventing hardware vendors from controlling developers (removing freedom to develop or modify software) via proprietary dev toolkits.

When hardware vendors' only choice was between buying or writing their own proprietary compiler and contributing an open back-end to GCC, they could decide that opening up was the better deal.

But with LLVM, they now have the option to take an existing good compiler, add a proprietary back-end, and keep everything closed, at no cost to them. In such a situation, Free (as in freedom) software misses out.

(There are lots of wonderful things a modular compiler toolkit can do, but preventing hardware vendors from making closed platforms isn't one. When FSF's actions don't make sense to you, remember that FSF's mission isn't to research and develop cool stuff, but to guarantee freedom to develop and modify software).

6 Likes

Crippling gcc by intentionally ruining the layering was RMS’s response to that worry, and is why LLVM had to be created: because gcc is too hard to work with. As an academic researcher, I could never imagine being productive working with gcc. Were it not for LLVM, many wonderful new compilers couldn’t exist. I’m glad it was created.

Exactly. Look how long it took for AMD to bother to release the sources of their modifications to LLVM to support AMDGPU.

Look at Apple's decision to stop releasing the source of its BSD-licensed kernel, and how long it took them (a decade ago) to comply with releasing source code at all.

Look at the effect that Google's Android has had: manufacturers not only ignorantly assume that because Android is Apache-licensed, U-Boot and the Linux kernel source must be as well; many manufacturers have also started integrating GPL'd versions of ffmpeg into Android.

Corporations are all about pathological take take take, and the GPL was the only defense against that, right up until people started viewing the FSF as dicks. How's that working out for everyone, with the likes of Cambridge Analytica only going to happen again and again?

Sorry for the rant - people forget that there are real detrimental consequences for users to using Apache2, MIT and BSD Licenses.

Some posts overlap here. You are absolutely right. Jeff Bush's Nyuzi uses LLVM, for example. However, the LLVM developers did not have to choose an MIT license, did they? They could have respected and understood why the GPL is used, rather than completely undermining everything that the FSF is trying to do for users, couldn't they?

1 Like

Moderator note: General discussion of gcc and llvm licensing philosophies is getting off topic for this forum.

10 Likes

I think I may have found a significant issue with libgccjit.

Skimming through the docs, I found a paragraph that seems to imply dynamic linking is not currently possible with libgccjit-produced executables or shared library objects:

GCC_JIT_OUTPUT_KIND_DYNAMIC_LIBRARY
Compile the context to a dynamic library.
There is currently no support for specifying other libraries to link against.

GCC_JIT_OUTPUT_KIND_EXECUTABLE
Compile the context to an executable.
There is currently no support for specifying libraries to link against.

This would be a massive issue imo - static linking everything would be absolutely terrible. I don't know how difficult it is to add dynamic linking support to libgccjit or if it is actually possible (for technical or legal/political reasons), but if the docs are correct, then I think this would rule out using libgccjit in its current state for the time being.

EDIT: note that this is in the AOT compilation section, so it does apply to using libgccjit as an AOT compiler
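
For concreteness, here is roughly what the AOT path looks like against the documented C API (a minimal sketch; the output path is arbitrary). Note that gcc_jit_context_compile_to_file takes only an output kind and a path: there is indeed no argument through which to name libraries to link against.

/* Minimal libgccjit AOT sketch: build "int main(void) { return 0; }"
   in memory, then compile it straight to an executable.
   Build this driver with: cc demo.c -lgccjit -o demo */
#include <libgccjit.h>
#include <stddef.h>

int main(void)
{
    gcc_jit_context *ctxt = gcc_jit_context_acquire();
    gcc_jit_type *int_type =
        gcc_jit_context_get_type(ctxt, GCC_JIT_TYPE_INT);

    /* int main(void) { return 0; } */
    gcc_jit_function *fn = gcc_jit_context_new_function(
        ctxt, NULL, GCC_JIT_FUNCTION_EXPORTED, int_type, "main",
        0, NULL, 0);
    gcc_jit_block *block = gcc_jit_function_new_block(fn, NULL);
    gcc_jit_block_end_with_return(
        block, NULL,
        gcc_jit_context_new_rvalue_from_int(ctxt, int_type, 0));

    /* The output kind and path are all we get to specify here; there is
       no parameter for extra libraries, as the docs above say. */
    gcc_jit_context_compile_to_file(
        ctxt, GCC_JIT_OUTPUT_KIND_EXECUTABLE, "a.out");
    gcc_jit_context_release(ctxt);
    return 0;
}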

Can you use libgccjit to emit object files? In that case we can just perform the linking step ourselves, just like with the LLVM backend.

Supposedly you can, but I would find it very strange if the linking process through libgccjit (which presumably invokes system linkers like ld, just as regular GCC compiler drivers do) could not achieve something that you would be able to achieve by linking yourself. Maybe the internals of libgccjit simply do not support dynamic linkage. I'm not really familiar with the intimate workings of the lower levels of compiling and linking, but this certainly seems strange to me.

Also, that leads on to another, albeit less significant, argument against using libgccjit vs. a more integrated frontend: the possible outputs described (assembly, object, library, executable) imply that LTO (link-time optimisation) would not be possible with libgccjit. GCC's LTO implementation is language-independent and apparently performs better than LLVM's, which would allow better LTO for mixed codebases, e.g. Firefox; using libgccjit would prevent this fairly significant performance optimisation from being used.

Someone reasonably senior from the Firefox team said a couple of months ago, at a meetup I attended, that there is indeed some cross-language (C++ & Rust) link-time and profile-guided optimisation occurring during the compilation of Firefox, which is possible because they use Clang for C++ and thus the languages have a common backend in their case. I'm not sure whether you are saying that cross-language optimisation is not possible with LLVM, so I thought I would mention the above just in case 🙂 (though this is admittedly irrelevant to the question of GCC).

Yeah, maybe I could've phrased that sentence better. I didn't mean to imply that LLVM's LTO was not cross-language.

As I understand it, libgccjit can emit object files, so there is no problem. We can do the linking ourselves. In fact, we do the linking ourselves even with LLVM.

As for code generation for dynamic linking, that would correspond to the relocation_model of Rust's TargetOptions ("static", or "pic" for dynamic linking). This corresponds to -fPIC and similar codegen options for GCC, and you can apply any GCC codegen option with libgccjit using gcc_jit_context_add_command_line_option, so again there is no problem.
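
As a sketch of that workaround (same C API as above; the function and file names are made up for illustration): forward -fPIC to the code generator, stop at an object file, and hand linking to the system toolchain.

/* Sketch: generate position-independent code, emit only an object file,
   and leave linking to an external driver, mirroring what rustc already
   does with its LLVM backend. */
#include <libgccjit.h>
#include <stddef.h>

int main(void)
{
    gcc_jit_context *ctxt = gcc_jit_context_acquire();

    /* Forward an arbitrary GCC codegen flag; -fPIC corresponds to the
       "pic" relocation model needed for a dynamic library. */
    gcc_jit_context_add_command_line_option(ctxt, "-fPIC");

    gcc_jit_type *int_type =
        gcc_jit_context_get_type(ctxt, GCC_JIT_TYPE_INT);
    gcc_jit_param *x =
        gcc_jit_context_new_param(ctxt, NULL, int_type, "x");

    /* int double_it(int x) { return x + x; } */
    gcc_jit_function *fn = gcc_jit_context_new_function(
        ctxt, NULL, GCC_JIT_FUNCTION_EXPORTED, int_type, "double_it",
        1, &x, 0);
    gcc_jit_block *block = gcc_jit_function_new_block(fn, NULL);
    gcc_jit_block_end_with_return(
        block, NULL,
        gcc_jit_context_new_binary_op(
            ctxt, NULL, GCC_JIT_BINARY_OP_PLUS, int_type,
            gcc_jit_param_as_rvalue(x), gcc_jit_param_as_rvalue(x)));

    /* Stop at an object file; the system linker does the rest, e.g.
       cc -shared double_it.o -o libdouble.so */
    gcc_jit_context_compile_to_file(
        ctxt, GCC_JIT_OUTPUT_KIND_OBJECT_FILE, "double_it.o");
    gcc_jit_context_release(ctxt);
    return 0;
}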

If you're sure. I still think it seems a bit weird, though.

But it still doesn't address the issue of LTO. I don't know how LLVM's LTO works, but GCC's works by emitting GIMPLE bytecode into a section of the object file, and so requires special handling in the code generator that libgccjit does not provide, to my knowledge. LTO obviously isn't required for a minimal working GCC frontend, but it supposedly allows a 5% speedup in Firefox with LLVM, so I think the feature would be important if we want to get the best possible, or even just comparable, performance out of GCC. It may be a "pain in the short term, gain in the long run" kind of situation. Then again, if there's too much work to be done, it might not get done. So I'm a bit conflicted here too.

Ideally, whatever would make Rust closest to a "First Class" member of the GCC suite of front-ends should be the ultimate goal. I would think that if using libgccjit made that in any way not the case, then it should not be the direction pursued. Is there any long- or short-term plan to move any or all of the GCC front-ends to using libgccjit? If not, that doesn't seem like the best way forward. Does libgccjit's intent align with the needs of a "First Class" Rust front-end for GCC? If not, then the answer should not be libgccjit.

That's just my opinion.