Is there a plan to get a formal Rust language specification at some point?

Just a note: As of late September, there is a Grammar Working Group whose purpose is to develop a canonical grammar for Rust, based on both existing documentation and grammar-related files within the rustc compiler. The repository where this will reside is https://github.com/rust-lang-nursery/wg-grammar.

5 Likes

Am I the only one who’s thinking the RFC Repository/process also qualifies as part of the spec?

In there, we the community discuss and specify the expected behaviour of Rust.
I guess it is spotty, because the “ancient” core of rust was developed before the RFF process existed, but all new changes/updates follow it.

Of course there is the gap between when an RFC is accepted but not yet implemented. I guess rustc is non-conformant in that time?

I would think a formal spec would only entail stable language features. (You may even want to define stability in terms of presence in the spec.) The RFC process would then be used to determine what goes into the spec, not vice versa. You certainly wouldn’t want to force everybody who aims to meet the spec to adopt any part of the RFC process (or in fact, any organizational processes), because carrying out such processes might itself prevent compliance.

1 Like

There is also the fact that some changes are considered too minor to require an RFC so just use a PR. However, from the standpoint of a safety assessment agency such as TÜV, they might be consequential.

1 Like

RFCs are explicitly not a specification, but a consensus building mechanism. Sometimes things change post RFC but before an implementation. Sometimes an accepted RFC is never stabilized.

3 Likes

Note that, officially, the RFC repo is a consensus-building tool and not a record of how things work. It’s expected that RFCs are not updated for tweaks that happen over the course of implementation – for example, RFC #141 (from 2014) says that trait generic parameters in impls are output positions, but then associated types got added, and it was implemented with trait parameters as input positions (and stabilized that way).

1 Like

Alright I see the RFC process takes a slightly different position than I originally thought!
Thanks for the feedback everyone! :heart:

1 Like

I think the term “formal language specification” may cause some confusion, because in the programming language research community, that means a rigorous mathematical description of at least the semantics (run-time behavior) of programs in the language, plus (usually) rigorous mathematical rules defining which programs are accepted by the compiler (e.g. type-correct). C and C++ have language specifications that aspire to be precise and complete, but they’re not “formal” in that sense.

The Rustbelt project, if fully realized for Rust, would entail producing a true formal specification for Rust as just part of its work. You’re really asking for an informal specification; indeed, that would be a lot less work.

4 Likes

So do what Web standards do: when the specs don’t match reality and you can’t change reality for compatibility reasons, change the spec to match reality. The vast majority of developers aren’t even going to read the spec, but a spec is still valuable so that when someone finds rustc diverging from the spec, they know there’s a bug somewhere.

3 Likes

Thanks for your response and the clarification. I think you’re spot on. I may have muddied things a wee bit but I think it was an useful pot stirring exercise all the same! :slight_smile:

Like I said - my perception is that ‘serious’ deployment of Rust requires the availability of a spec of the sort associated with C/C++ (eg like https://isocpp.org/std/the-standard). I’m unfortunately not qualified in the dark arts enough to classify that kind of specification as formal or informal but your articulation seems sensible to me - so informal it is.

I think what I’m coming around to is a realisation that the closest incumbent is the Rust reference, as @steveklabnik has pointed out several times.

Perhaps there is then a question for someone who understands the effort involved in making a language specification palatable for productisation (especially safety sensitive product domains like Auto). The question is: What is the gap between the Rust reference as it stands and what would be considered acceptable by such domains ? I am hopeful that this gap is the one that can be made smaller by chats between folks such as @steveklabnik, perhaps, and some of the compiler architect types at Arm. This is kind of important because Arm’s pervasiveness implies stronger security and safety guarantees and anything we can do to provide better options for the ecosystem is likely to pay off eventually. Rust’s clear technical merit in safety critical domains should not get overshadowed by the lack of an acceptable specification methinks.

Having read a little bit more about the Rustbelt project - and having digested @rocallahan’s clarification, I think that it’s awesome and all power to that community. I do think however that the outcomes for that project understandably will take a lot longer to realise. Getting to a more product friendly reference should ideally be done in parallel and hopefully in shorter time frames.

Thanks.

6 Likes

Rust should have a specification, or at least one planned, IMHO. Right now, because Rust is a young language, there’s only 1 compiler, but like basically all open - source programming languages, people WILL start creating other compilers. In fact, one of the reasons WHY there aren’t other compilers for Rust is partially caused by the lack of a formal specification.

I’m not saying that Rust needs a specification now. Before Rust was at it’s 1.0 release, it made sense to not make a specification because breaking changes can (and do) happen. However, it needs one in the near future.

Also, is Rust geared more towards replacing / complementing C++, or replacing / complementing C? Hopefully something more like the latter.

Is this speculation, or a claim from compiler writers who have explicitly stated that this is what’s holding them back from writing a Rust compiler?

If the latter, that’s very valuable information; otherwise, I’m not inclined to put much weight on it. As noted above, historically, the situation tends to be reversed from what you’ve described: the C and C++ standards were created in response to an existing multiplicity of implementations.

I’m not sure I see what the relevance of this question to the topic of specifications; could you please clarify what you’re asking?

1 Like

Yeah, the first point is mostly speculation, but a few historical details for previous languages shows something like that (also blog posts have referenced this problem FOR RUST, but I can’t find it now. I’ll post it if I find it). Yeah the 2nd question isn’t really that on topic, it was more of a side question that didn’t deserve a new post.

IMO, given the changes going into the 2018 edition of Rust, writing a second compiler implementation at this point is somewhat worse than copying a new Formula I race car design while it’s still being machined and modified during trials. Creating a second implementation will become more feasible when the rate and extent of changes diminishes.

Personally, I view Rust as geared toward replacing close-to-the-metal languages in which malware-exploitable bugs occur with high likelihood. That includes C and all its various derivatives, not just C++.

2 Likes

Yeah. That’s why I said sometime in the future :slight_smile:

It may well deserve its own post! That would give you space to clarify exactly what you think the difference would be, from a practical standpoint.

A language reference or an implementation cannot simply be provided as a formal specifiction.

A formal language specification is a mathematical object you can reason about with absolute certainty by mathematical proof. In case of a hypothetical formal safeRust subset, abstract automata (compiler+virtual machine) are to be stated, that can be shown (by mathematical proof) to never reach undefined behavior (which is also formally defined), given the source code is accepted by the compiler, i.e. fulfills the specification.

In physical reality we may never be sure about it, as the hardware might have a bug, might be damaged, or be disturbed by electromagnetic radiation.

That said, a formal specification is more like a circle or a straight line. They are defined exactly and have properties outside of the physical world.

To be honest, it would be best if the proofs would itself be formal in order to lower them to something like Metamath.

Interestingly, you can regard the source code itself as something like a theorem or proof and the compiler as a theorem prover or proof checker. If everything works right, the correctness of billions of lines of code is reduced to a very small set of axioms. Running the proof checker and compiler many times and on totally different hardware, it can be made almost certain that a given source code is correct.

In distant future, people will not be uncertain about software, but about hardware. You cannot look inside to convince yourself, there is no backdoor, vulnerability or damage.

The design of editions is such that these changes minimize burden on compiler authors. This shouldn’t pose any significant additional challenge.

1 Like

Indeed; the language team has been careful, or at least attempted to minimize technical debt for Rust 2018.

If/when we produce a formal semantics (type system, operational semantics, syntax, etc.) for Rust, then presumably that would be formally defined and proof mechanized in a theorem prover such as Agda or Coq.

However, as Simon Peyton Jones noted, producing a formal semantics is not without risks or drawbacks. There’s a risk that the language would become like Standard ML where they simply cannot evolve the language further because that would require re-doing some of the proof work at every stage.

PS: This topic is really out of scope for users.rust-lang.org and belongs on http://internals.rust-lang.org/.

1 Like