Call for help implementing an independent Rust frontend for GCC

This is why I asked Dr Stallman (just now) if mrustc would be ok. It's smaller, and MIT, and if nothing else could conceivably be replaced later with a copyleft equivalent.

Of course, it is 100% perfectly legal/acceptable to relicense something that is MIT licensed as GPL is it not?

If you are the Copyright Holder, and have not entered into any 3rd party agreements that limit your rights to do so, then of course.

If there are multiple Copyright holders then 100% unanimous agreement is required. This of course presents a bit of a problem if some of them are dead (as you still have to contact their Estate and seek permission from them instead!), or if their email address is invalid (you still have to track down the person), or, even worse, if they worked for a Corporation and you cannot get its permission.

All of which and more is why the FSF requests Copyright Assignment for GNU gcc contributions.

You cannot take someone else's code and slap a different license on it. That's basically theft.

So the mrustc developer(s) would need to be contacted if that is what you were thinking ahead on.

Doesn't the MIT license explicitly permit this kind of relicensing though? Or am I thinking of the BSD 3-clause or something like that?

From Wikipedia:

The MIT License is a permissive free software license originating at the Massachusetts Institute of Technology (MIT)[5] in the late 1980s.[6] As a permissive license, it puts only very limited restriction on reuse and has, therefore, reasonable license compatibility.[7][8] The MIT license permits reuse within proprietary software provided that all copies of the licensed software include a copy of the MIT License terms and the copyright notice. The MIT license is also compatible with many copyleft licenses, such as the GNU General Public License (GPL); MIT licensed software can be integrated into GPL software, but not the other way around.[9]

Is this not accurate? It seems from my recollection of review of licenses that this is true.

OK, I may be talking across you here. I was thinking it would be OK to integrate and keep the MIT license on it, but, because of the "requirement" for copyright assignment, it would need the permission of the original author(s) to do that and actually truly re-license it as GPL.

Yes, I see your point. I guess opening those discussions with the author(s) of mrustc might be taken as a first step then to see if that is a viable way forward?

EDIT: Opened an issue on the mrustc GitHub repository to see what kind of re-licensing might be possible (if any): "License Compatibility Question: Possible to Dual License to GPL?" (Issue #120, thepowersgang/mrustc).

No.

The MIT license is "compatible" with the GNU GPL, meaning you can create and redistribute a single program that contains code under both licenses. Because the GPL does not allow you to redistribute derived works with additional licensing obligations (the so-called "copyleft" requirement), it is not compatible with all free software licenses. It is compatible with the MIT license, though, because the MIT license's obligations are a strict subset of the GPL's.

But you're not actually changing the license when you do this.

Heard back from Dr Stallman: he said that MIT licenses are not a problem, including for linking.

Also he was really excited to hear that people are interested to work on this, and suggested to reach out to the gcc gearheads, which I will do shortly.

He suspected that mrustc's performance due to it being a language translator would probably suck, and I reassured him that its primary purpose here is for bootstrapping.

So yes, real possibility of gcc having a rustlang frontend... in rust, which I think would be particularly interesting, to actually showcase properly what rust can do.

One small feature request, from me: there has been some slow burn failure in the compiler / linker world to appreciate 32 bit hardware, and the critical importance of staying within resident RAM.

The RK3288, for example, is a quad core 1.8GHz Cortex A17 system, capable of 4K video playback, that can physically address up to 4GB of RAM. It knocks the stuffing out of most mid level Intel Celery processors of only 2 years ago.

Yet because of the assumption by compilers and linkers that swap space will go beyond 4GB, it cannot ever be used even to compile its own source code, requiring cross compilation in many cases, from a 64 bit platform.

Also it is really critically important to keep within the resident RAM for a compile and a link. Going into swap space can result in swap thrashing due to the data crossreferencing being so large that the working set is the VM pageset, making linking and compilation take a HUNDRED times longer or greater than it should (or, in the case of my 16GB DDR4 RAM 2500MB/s NVMe Quad Core i7 laptop, causing its loadavg to reach 160 and die within about 30 seconds).

I have heard from many people that rust-llvm often requires more than 4GB of RAM to compile some applications. With other compiler communities also not appreciating the problem, this has the unfortunate side effect of relegating 32 bit hardware to landfill: distros are beginning to discuss abandoning it, as the number of packages that fail to compile keeps increasing.

I see that there are a few distinct ways/levels of integration of Rust with GCC:

  1. Enabling Rust to run on platforms supported by GCC
  2. Letting Rust be a first-class citizen in GCC
  3. Having an independent Rust implementation for bootstrapping (the "trusting trust" problem)
  4. Having an independent Rust implementation for bureaucratic purposes

The distinctions are important, because they allow or disallow different shortcuts in implementations.

  1. To just target more platforms, it could suffice to hook up the existing rustc front-end to GCC back-ends (e.g. by translating MIR to something gcc can use). For such a narrow goal, Rust doesn't even have to build independently as part of GCC, and the front-end could still be built with llvm-rust.

  2. This would require Rust, or at least a limited subset of Rust, to bootstrap itself in GCC, LLVM-free. But it could still use code/libraries extracted from the current Rust implementation (e.g. bootstrap without a borrow checker, but then, instead of having its own GCC-borrow-checker, just adopt rustc's existing borrow checker implementation wholesale). I've heard type inference is hard for mrustc. Rust-GCC could bootstrap with minimal type inference, and then take and compile rustc's.

  3. Some people already use mrustc to bootstrap a trusted Rust compiler. Since that's a one-off operation, it doesn't need to be particularly fast or well integrated (it'd be nicer if it was, but doesn't have to).

  4. A completely independent implementation, that would satisfy standards/certifications, probably can't reuse any substantial bits of the other implementation, so it'd require building everything from scratch. That's interesting and could let implementations take very different approaches, giving them unique strengths, but it's of course much more work.

It seems very impractical to get everyone who's contributed to significant parts of rustc to assign their copyright to the FSF.

Argh this forum is stopping me from posting properly, can someone deal with that?

Is this of any value to get things to where people would like them to be?

"Thus, we grant back to contributors a license to use their work as they see fit. This means they are free to modify, share, and sublicense their own work under terms of their choice. This enables contributors to redistribute their work under another free software license."

That's interesting. So they actually get their code back, in effect. It just means that they become an irrevocable, independent, unrestricted sublicensee of a version of what was formerly only their code, whilst the FSF and its resources become responsible for protecting the code from Copyright infringement.

I didn't realise that; I always thought it was a one-way assignment with no sublicense granted.

… will lag rustc and rustc's features by months to years. Since rustc and the Rust language itself are moving targets, with new features released every six weeks, any implementation by a separate team that develops its own approaches to borrow checking, etc., rather than reusing the detailed approaches of rustc, is unlikely to be more than 90% compatible in terms of the source code that it accepts and rejects. It's hard to conceive of a way to achieve parity between separate compilers if there is almost no cross-fertilization, but such isolation seems to be the only practical way of avoiding shared errors in implementation.

C++ compilers have historically had similar issues with incompatibility, but over time they've gotten closer to each other and to the spec, as compilers and the language itself have matured. Something similar might eventually happen with Rust. We have the disadvantage of lacking a written specification; on the other hand, we do have a single "known-good" implementation to compare new implementations against, whereas in C++ there were/are many implementations with equal status. Hopefully, if a full-fledged alternative compiler does get developed, it will provide an impetus for properly specifying the thornier parts of the language, like borrow checking and trait resolution.

I heard from the gcc steering committee; it has been a fascinating conversation.

gofrontend (hosted on Git at Google) apparently does something similar, by borrowing libraries that actually end up in gcc-go. As there is precedent here, the only thing needed would be the FSF's approval for the license used in any rustlang libraries.

On that, Dr Stallman tells me that one of the MIT licenses is fine (there are two variants).

My feeling is that using the borrow checker and other components is going to be crucial for actually having a useful up to date gcc-rust compiler, just like golang does. When things stabilise in a few years, a different approach could be taken.

ARGH had to edit this reply due to limitations, please could an admin look into that, it is getting tiresome and making the conversation awkward for readers.

--- added here

Question was asked: would it be reasonable to simply modify the existing rust compiler to get it to interface to gcc's backend (probably at the RTL level; see the "RTL" chapter of the GNU Compiler Collection (GCC) Internals manual)?

This would be a much easier task.

Any thoughts?

-- another edit sigh

https://www.mail-archive.com/gcc@gcc.gnu.org/msg89085.html

Someone else worth reaching out to.

Apparently GIMPLE is what frontends are supposed to generate in order to interface with gcc. The above discussion talks about MIR to GIMPLE, and how MIR does not seem to be stable / documented. Has that changed since, and is it really as straightforward as adding a MIR-to-GIMPLE option alongside MIR to LLVM-IR?

My understanding from watching the compiler threads on Zulip is that MIR is still evolving, but at this point rather slowly. Thus it might be possible to pre-stabilize enough of MIR to make a MIR-to-GIMPLE converter that could be updated whenever MIR was extended. Whether such a converter is straightforward I don't know. However, since other compiler chains target both GIMPLE and LLVM, presumably rustc could as well.

Do note that we are no longer talking about an "independent" frontend, but rather the same frontend targeting GCC via GIMPLE.

I think rather than using MIR, librustc_codegen_llvm would be the crate to swap out with an alternate backend. rustc is already able to load this dynamically, as the emscripten target uses a different LLVM build. But note here, "This API is completely unstable and subject to change."

So you'd have a development question how to stay in sync with this. If your librustc_codegen_gcc lives in the main rust repo, it can be maintained together. That would be external to GNU, so I don't know if that meets your goals. Otherwise, I guess you'd have to snapshot the rustc crates into the GNU project, develop against that, and try not to let that get stale.
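To make the "swap out the codegen crate" idea concrete, here is a minimal sketch of a pluggable backend, assuming a deliberately tiny interface. Every name in it (`MidIr`, `Backend`, `LlvmBackend`, `GccBackend`, `select_backend`) is hypothetical and invented for illustration; rustc's real codegen interface is unstable and far larger, and the strings returned here are placeholders rather than real LLVM IR or GIMPLE.

```rust
/// Hypothetical stand-in for a compiler's mid-level IR; this is NOT rustc's MIR.
struct MidIr {
    function_name: String,
    statement_count: usize,
}

/// Hypothetical backend interface: the front-end only ever talks to this trait,
/// so a second backend can be added without touching the front-end.
trait Backend {
    /// Short identifier used to select the backend.
    fn name(&self) -> &'static str;
    /// Lower one function; returns placeholder text instead of real output.
    fn lower(&self, ir: &MidIr) -> String;
}

struct LlvmBackend;
struct GccBackend;

impl Backend for LlvmBackend {
    fn name(&self) -> &'static str {
        "llvm"
    }
    fn lower(&self, ir: &MidIr) -> String {
        format!("; LLVM IR for {} ({} statements)", ir.function_name, ir.statement_count)
    }
}

impl Backend for GccBackend {
    fn name(&self) -> &'static str {
        "gcc"
    }
    fn lower(&self, ir: &MidIr) -> String {
        format!("/* GIMPLE for {} ({} statements) */", ir.function_name, ir.statement_count)
    }
}

/// Pick a backend at run time, loosely mirroring the idea of loading the
/// codegen crate dynamically. Unknown names yield `None`.
fn select_backend(name: &str) -> Option<Box<dyn Backend>> {
    match name {
        "llvm" => Some(Box::new(LlvmBackend)),
        "gcc" => Some(Box::new(GccBackend)),
        _ => None,
    }
}
```

Calling `select_backend("gcc")` and then `lower(&ir)` on the result exercises the GCC path without the caller knowing which backend it got. The sync problem described above corresponds to the `Backend` trait changing shape: every out-of-tree implementation then has to be updated to match.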

Having rustc be able to use gcc as an alternate backend sounds like a good idea regardless of whether gcc gets a new rust frontend. It probably wouldn't meet the safety critical requirements for an independent implementation, though.

Exactly. It's difficult to envision how such a parallel development could be kept up-to-date with a rapidly evolving rustc while limiting cross-fertilization between the two developments enough to avoid shared algorithmic errors. Even closed-room independence is insufficient, as isolated programmers have been known to make the identical algorithmic mistakes when presented with identical problems.

From my experience of writing a cranelift backend for rust, I can tell you that breaking changes occur, on average, somewhere between every few days and every few weeks. Most of the time the changes are simple, so it is quite possible to keep up out of tree.

In offline conversations this morning with the gcc maintainers they confirmed that GIMPLE is their primary interface for frontends, and improvements get added there, on demand, not so much to RTL, which has a different purpose.

Whether MIR is modified or MIR is replaced does not matter: whichever is easiest, GIMPLE needs to be output either way.

With both the golang frontend and a new frontend for the "D" language being external and acceptable, the answer would correspondingly be yes.

Thus two disparate release cycles could be met: both those of rustlang and of gcc, by using GIMPLE as the stable intermediary.

Indeed. However there is something very important about the approach: it is incremental and minimalist. Also, if it was part of the main rustlang compiler, breakage would be quickly detected as there would be unit tests that had to be passed on every release (I assume!)

Also, correspondingly, if it was part of the main rustlang compiler and a new independent project was started, that effort would not be so isolated because one of its major critical components - the MIR to GIMPLE library - was being kept up to date and tracking recent rustlang developments.

Would someone be able to expand on this a bit, please? Certainly, I personally have no objection to Rust being included in the GCC, but I don't know what these relevant specs are that require an independent implementation of the compiler (or, more generally, why an independent implementation is needed).

Jarak,

Let me offer an example:

I worked for a year or more on the team doing integration testing of the Primary Flight Computers of the Boeing 777. The last layer of testing before actual flight testing. The PFCs are the "fly by wire" computers responsible for gathering all the flight sensor inputs and driving all the control surface actuators. They translate inputs from the joystick, rudder pedals etc into actuator motions. They keep the plane stable.

Needless to say this is a safety critical role, and to that end there are 3 PFCs operating "redundantly", constantly cross-checking each other's work and arriving at a consensus output. Thus a failure of any one box could be tolerated. In turn each PFC box contains 3 redundant processor boards from three different manufacturers: Intel x86, AMD 29K, Motorola 680x0. The idea being that if any one of those architectures failed due to some hardware issue the others would detect that and continue correctly.
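As a rough illustration of the "consensus output" idea, here is a minimal two-out-of-three voter. It is only a sketch: the `Vote` type and `vote` function are hypothetical names, it compares outputs for exact equality (a real flight system would presumably compare within tolerances and handle timing), and it is not a description of how the 777's PFCs are actually implemented.

```rust
/// Result of comparing the outputs of three redundant channels.
#[derive(Debug, PartialEq)]
enum Vote<T> {
    /// All three channels produced the same value.
    Unanimous(T),
    /// Two channels agree; `dissenter` is the index (0, 1 or 2) of the odd one out.
    Majority { value: T, dissenter: usize },
    /// No two channels agree, so there is no consensus output.
    NoConsensus,
}

/// Majority vote over three channel outputs using exact equality.
fn vote<T: PartialEq + Copy>(outputs: [T; 3]) -> Vote<T> {
    let [a, b, c] = outputs;
    if a == b && b == c {
        Vote::Unanimous(a)
    } else if a == b {
        Vote::Majority { value: a, dissenter: 2 }
    } else if a == c {
        Vote::Majority { value: a, dissenter: 1 }
    } else if b == c {
        Vote::Majority { value: b, dissenter: 0 }
    } else {
        Vote::NoConsensus
    }
}
```

For example, `vote([7, 7, 9])` reports a majority value of 7 with channel 2 as the dissenter. Note what the voter cannot do: if all three channels share the same bug and agree on a wrong value, the result is `Unanimous`, which is exactly the common-mode failure that independent implementations are meant to reduce.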

Now, the plan was that in order to protect against software bugs the control software would be developed three times by three different teams, in the hope that no two teams would create the same bug in the same place and that majority rule would detect such a software error at run time. In turn three different Ada compilers would be used so as to be able to detect and tolerate a bug in generated code occurring in any one of them.

In the end the software was only written once, by one team. It was too expensive to do three independent developments. I'm not 100% sure, but as I recall only one Ada compiler was used. Although it would have had different code generators of course.

Since that time it seems the idea of multiple redundant development has become less significant as it was found that it is quite likely that different teams will indeed create the same bugs given the same specification. They can all make the same mistake in interpreting a requirement. Which would be detected by unit, integration and other testing.

So I wonder, who is it today that would require multiple compilers from multiple vendors?
