Would it be a good idea for the Rust programming language to have a specification? Wouldn't it be nice if our language, or mother tongue, had a specification? If there were as many implementations as we needed, each focusing on something different? If the specification wasn't managed by the ISO and didn't cost money to access? Opinions?
I find this blog post by Mara Bos an intriguing read concerning your question:
I think it would be nice if there was more choice, instead of one official compiler. I think that multiple compilers could help (let's say you needed some cool features, but they aren't part of the official Rust compiler). I think generally, you can add an extra feature, but don't make a compiler without standardized features.
There are multiple projects working on Rust compilers other than rustc, like the GNU community working on supporting Rust and the ferrocene project, of course.
@jofas What even are your opinions? You didn't really explain? You agree with the blog post you linked I presume?
I do, yes (though I think you should put far more weight on the fact that it is the opinion of a highly respected engineer who has significantly contributed to the Rust project, rather than that I agree with it). I find it somewhat frustrating working with standardized languages where the standard is developed independently of any compiler, which in my case have been C and Fortran. Writing a program that uses Fortran 2008 features (in 2020! Fortran 2008 got approved in 2010, so that's 10 years after the standard got approved) and compiles on GNU but not the Intel compiler, and vice versa with different features, is not something I'm missing.
You mean, let's all go back to the glorious days of writing C(++), where a dozen different compilers each implement their own pet sub/super/extra-set of the language, with inconsistent behaviour in the intersection of features, with no portable way to solve common issues, so that programmers need to write to the lowest common denominator, catch random bugs when porting between compilers, and dynamically determine compiler features in build scripts?
Hell no. No, no, no. I left that rotting pile of garbage, and I'm way happier now.
Also, Rust solves this quite nicely with the nightly toolchain, in my opinion.
I think it would be nice to have a formal specification (an actual formal and machine-checked specification, not an informal one written in human language like the C++ one). However, I don't think having multiple main compilers would be that good; I fear we will just end up having to deal with more compiler bugs.
"Cool features" are usually needed in libraries, but libraries should also be as compatible as possible, so requiring an unofficial compiler would be against their interests. In other words, imagine if library 1 required unofficial compiler A but library 2 required unofficial compiler B, and you wanted to use both of them at the same time.
A major note is that a language containing extensions can not be called "Rust". That's why it's trademarked — it provides control over the name.
No, please absolutely don't.
There's no "cool features" missing from Rust. If the language suffers from anything it's feature creep, not the lack of features. We have more than enough features already.
Most people who think they "need just one more feature" either don't know the already existing features or the library ecosystem well enough. No, you don't need self-references. No, you don't need class inheritance. No, you don't need unchecked shared mutability. No, you don't need garbage collection, or dynamic typing, or generalized monads, or specialization, or whatever the Feature Du Jour in Today's Hot JavaScript Update is.
Haskell / GHC is an example of a language with a single compiler (now; there were more decades ago) and a specification: https://www.haskell.org/onlinereport/haskell2010/.
C# (not F#) has a specification:
https://www.ecma-international.org/publications-and-standards/standards/ecma-334/. Java of course too.
As you can see, the range of languages with specifications goes all the way from "avoid success at all cost" to, well, Oracle and MS.
And as there are going to be other compilers (at least the GCC frontend) it would be beneficial for Rust (the language) to have one. The state of C++ compilers had been way worse before the 98 Standard got adopted by most of them (I'm not old enough to have experienced the time between Fortran 77 and 90 as a Fortran user ;).
Eh, there's plenty of things I would count as "cool features" that Rust doesn't have. Listing features that are not very cool isn't proof that cool features don't exist!
That said, there probably is a nice "small Rust" that has a very small spec but has some subset of "essential" features to give it the Rust flavor: but it's pretty hard for me to figure out what that would be.
I would think it would need at minimum:
- Lifetimes
- `enum` types
- Traits: both `dyn` and generic bounds
- FFI / `extern`
- Safe by default / `unsafe`
And that's enough to make a pretty chunky language already.
Edit: the point is that most of Rust is already just closing the set of the above and filing the pointy bits off.
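To make that list a bit more concrete, here is a minimal sketch that touches each item once. Everything in it (`Shape`, `Area`, `total_area`, `describe`, `longer`, `add_c`, `first_via_raw`) is a made-up name for illustration only, not something from a real crate or from any proposed "small Rust" spec.

```rust
use std::os::raw::c_int;

// `enum` types: a closed set of variants, each carrying its own data.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

// Traits: shared behaviour that types opt into.
trait Area {
    fn area(&self) -> f64;
}

impl Area for Shape {
    fn area(&self) -> f64 {
        match self {
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
            Shape::Rect { width, height } => width * height,
        }
    }
}

// Generic bound: static dispatch, monomorphized per concrete `T`.
fn total_area<T: Area>(items: &[T]) -> f64 {
    items.iter().map(|item| item.area()).sum()
}

// `dyn`: dynamic dispatch through a trait object.
fn describe(item: &dyn Area) -> String {
    format!("area = {:.1}", item.area())
}

// Lifetimes: the returned reference lives no longer than both inputs.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

// FFI / `extern`: a function using the C calling convention.
// (Hypothetical example; here it is only ever called from Rust.)
extern "C" fn add_c(a: c_int, b: c_int) -> c_int {
    // Wrapping add so the function can never panic across the C ABI.
    a.wrapping_add(b)
}

// Safe by default / `unsafe`: a safe wrapper around a raw pointer read.
fn first_via_raw(xs: &[i32]) -> Option<i32> {
    if xs.is_empty() {
        return None;
    }
    let p = xs.as_ptr();
    // SAFETY: the slice is non-empty, so `p` points to a valid, initialized i32.
    Some(unsafe { *p })
}

fn main() {
    let shapes = [
        Shape::Rect { width: 2.0, height: 3.0 },
        Shape::Circle { radius: 1.0 },
    ];
    println!("{}", describe(&shapes[0]));
    println!("{:.2}", total_area(&shapes));
    println!("{}", longer("spec", "compiler"));
    println!("{}", add_c(2, 3));
    println!("{:?}", first_via_raw(&[7, 8]));
}
```

Even this toy already leans on pattern matching, references, slices, closures and iterators, which is more or less the "closing the set" effect described above.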
… and of course there already is
which is maintained by a third party, https://ferrous-systems.com/, for the purpose of qualifying Rust for safety-critical systems. It is, and always will be, some versions behind, but it is (has to be) accurate and the specification itself is free.
I've been a Java programmer for 20 years, and I'm learning the Rust language. In very humble baby steps.
So far, what I can see is that everything in Rust is consistent. The fact that most of the documentation in the standard library links to the source code that implements that specific feature leaves no room for doubt or for details hidden in black boxes.
I come from IBM, and IBM Java. It is a certified Java, which means it is "100% compatible". But at the same time it has some "extra features". Can I pick a simple Java project and compile it using IBM Java? Yes, for sure!
Can I pick a project that has been developed in and out for 20 years, always with IBM Java exclusively, and suddenly try to build / run it with OpenJDK? Not quite, no, because developers in those 20 years of history ended up using internal features of the IBM implementation... even by accident, because it was "just there": simple stuff like Base64 encoding, cryptographic algorithms, or XML processing.
The end result: there was this IBM x Kyndryl separation, and now I'm on the Kyndryl side. Suddenly all the support channels I "informally" had to the IBM Java developers vanished into thin air, which made IBM Java very unattractive. And it is quite painful to adjust this legacy code to work with OpenJDK...
The fact that Rust has one source avoids this sort of blunder.
I definitely wouldn't want ISO to keep the language specification under lock and key and behind a paywall. Last week I was looking into the Extended Backus-Naur Form, and saw "hey, there is this ISO specification, let's see the actual primary material at ISO..." BAM! Paywall! 35 bucks to see a 50 year old specification in which I just have a casual interest?? No way.
In what way is a single reference implementation (`rustc`) inferior to a prose specification? Languages (in general, not just for programming) naturally evolve over time, and specifications tend to either lose step with current usage or inhibit the process of adapting to changing circumstances.
Personally, I'd prefer all those hours of work to go into improving a single open-source implementation instead of being split amongst many different efforts. If you have some highly experimental idea to try out, you can always make a temporary fork to demonstrate it with.
Perhaps. I would want to see some stringent requirements on such a specification:
- It should be human readable and preferably fairly simple and straightforward to understand. That is to say somebody with no knowledge of Rust should be able to figure out how to start writing programs in Rust from the spec. alone. No tutorials, no examples etc. By contrast to the C++ standard which is barely a readable document.
- It should be machine readable and mechanically verifiable. Not something open to misinterpretation or ambiguity.
Is the above even possible?
If you mean our human mother tongues, no. That is called French.
Absolutely not. Many implementations all slightly different is actually many different languages. Dilution, fracturing and chaos. The only reason I see for a separate implementation of the same language is to support some obscure architecture that will never make it into the mainline compiler implementation.
Philosophical meandering:
When there are many vendors of widgets, be they nuts and bolts or compilers, it is very beneficial for users and customers to have standards for those widgets. Then the customer can pick and choose their supplier, based on quality, price, availability etc. It ensures things can be made when a supplier disappears.
Back in the dark ages of closed source software there might be many vendors of language compilers. Users could not take the code of those compilers and adapt it to new targets or operating systems. Users were held hostage to their vendor. Unless... unless there was a common standard for the language in use and then users could move their applications from place to place, pick up a new compiler from a new vendor with some confidence things could be made to work. Or create their own compilers. In that world of many vendors of closed source implementations, language standards were a good thing for users.
In the modern world of open source whilst it may still be beneficial to have a language standard, or design specification, is it really necessary to have multiple implementations?
It would be far more efficient to have whatever talent is out there, whoever they work for, contributing to the same implementation which is then available to everyone. A single implementation ensures consistency wherever it is used.
I guess there is benefit in terms of verifying correctness with multiple implementations. If multiple teams can produce compilers that work the same from a single spec that will have shaken out a lot of ambiguities in the spec and identified a lot of bugs in implementations. In that way I see the efforts of the GCC guys as a good thing.
I used to be a great fan of things like the ISO standards, for the C language for example. Not so much nowadays. On the one hand we have rigorous standards for languages that are never used, Pascal, BASIC, others. On the other hand we have the C++ standard, an endlessly growing mess of complexity that I really don't want to use anymore.
And now imagine how C++ would look if there were no standard. Rust and Haskell (GHC2021) and Lisp would all be full subsets of C++20.
Why would you want to write Rust in C++ when you already can write C++ in Rust?
C and Python are also just subsets of Rust at this point. Ah, the joy of living through the Great Oxidation!