I've just finished the first draft of my thesis chapter introducing Rust. I'd appreciate any feedback you might have, especially regarding accuracy-- My advisor isn't familiar with Rust and so might not notice technical errors.
I've only read up until page 27 and then skimmed the few pages left.
This seems like a very good introduction to Rust. I think I could give the first few pages to some non-Rust-programmer friends and they'd get a good idea of what it is and what it does. Maybe it goes a bit too much, or not enough, into the actual syntax of Rust? That depends entirely on whether your advisor cares about that.
The primary goal is for readers to be able to understand the rest of the paper, even if they've not really encountered Rust before. As it turns out, there's surprisingly little actual Rust code in the paper. Instead, it's a description of several dozen types and traits with complex interactions.
There is some stuff in here that's not strictly necessary for understanding my project. I'll trim those back if I run out of page budget.
Edit: Perhaps "overview" is a better description of the goal here than "introduction"
What are the Rust concepts you need the reader to be familiar with in order to understand the rest of the thesis?
The big ones are:
- Traits and associated types; generic bounds
- Trait objects vs static dispatch
- Lifetimes and ownership
The rest is essentially tier-2 information that can probably be picked up from context if necessary. I should also touch on iterators briefly, but I plan to do that in an appendix summarizing the relevant stdlib traits I use.
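For readers who want something concrete for the three big items above, here is a minimal sketch. All names in it (`Container`, `Stack`, `Shape`, `Square`, `Rect`, and the functions) are made up for illustration and are not taken from the thesis.

```rust
/// A trait with an associated type: each implementor picks exactly one `Item`.
trait Container {
    type Item;
    fn first(&self) -> Option<&Self::Item>;
}

struct Stack<T> {
    items: Vec<T>,
}

impl<T> Container for Stack<T> {
    type Item = T;
    fn first(&self) -> Option<&T> {
        self.items.first()
    }
}

/// Generic bounds: `C` may be any `Container` whose `Item` can be cloned.
fn first_cloned<C>(c: &C) -> Option<C::Item>
where
    C: Container,
    C::Item: Clone,
{
    c.first().cloned()
}

trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Static dispatch: the compiler generates one copy per concrete `S`.
fn area_static<S: Shape>(s: &S) -> f64 {
    s.area()
}

/// Dynamic dispatch: one function; each call goes through a trait object.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let stack = Stack { items: vec![1, 2, 3] };
    println!("first = {:?}", first_cloned(&stack));
    println!("static: {}", area_static(&Square(2.0)));
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect(1.0, 3.0))];
    println!("dynamic total: {}", total_area(&shapes));
}
```

Lifetimes and ownership show up only implicitly here: `first` returns a reference that borrows from `self`, and `first_cloned` hands back an owned copy.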
Chapter 3 is a description of the domain-specific language I wrote for expressing complicated type bounds, primarily set algebra on HLists.
Chapter 4 describes a type-safe relational algebra interface for Rust collections, with compile-time query planning
Chapter 5 is a comparison study of performance & maintainability of my interface vs. coding directly against the stdlib collections
Chapters 6+7 are future directions and conclusion.
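For anyone following along who hasn't met HLists before: a common way to encode them in Rust is a cons-list at the type level. The sketch below is only that generic encoding, with hypothetical names (`HNil`, `HCons`, `HList`, `Row`); it is not the DSL from Chapter 3.

```rust
/// The empty heterogeneous list.
struct HNil;

/// A list cell: one element of type `H` followed by the rest of the list `T`.
struct HCons<H, T> {
    head: H,
    tail: T,
}

/// A property computed entirely from the type, at compile time.
trait HList {
    const LEN: usize;
}

impl HList for HNil {
    const LEN: usize = 0;
}

impl<H, T: HList> HList for HCons<H, T> {
    const LEN: usize = 1 + T::LEN;
}

/// A three-element list whose elements all have different types.
type Row = HCons<u32, HCons<&'static str, HCons<bool, HNil>>>;

fn main() {
    let row: Row = HCons {
        head: 1,
        tail: HCons {
            head: "id",
            tail: HCons { head: true, tail: HNil },
        },
    };
    println!("{} {} {}", row.head, row.tail.head, row.tail.tail.head);
    println!("length known at compile time: {}", <Row as HList>::LEN);
}
```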
This is in general a good high-level introduction to Rust and its core features. I think I struggled a bit with some of the "describing technical features in just English" aspects, but of course I'm coming from a place of understanding the technical bits. Maybe it could use another example or two.
I think you should solicit feedback from people unfamiliar with Rust as well, if you haven't already. A lot of my notes below are of the "well technically" variety, and can perhaps be ignored given the intention of this work. On the other hand, you did ask about technical errors, so I've left them in.
(The self-inconsistencies should be addressed regardless, naturally.)
Proof-reading notes I made as I read
Presented with a situation outside these guarantees, they will often behave as the programmer might expect. When operating outside these formal guarantees, however
Change the first 'outside' to 'inside'.
In general, all change rights are held by the library authors, who then determine the freedoms to allow their users, which necessarily limits the potential directions the library can evolve in.
I'm not sure what you're trying to convey with this sentence at the end of your paragraph explaining traits. If I have "all change rights", surely I can evolve in any direction I like?
All safe methods to construct a String, however, ensure that this vector contains valid UTF-8 encoded data
Given the high level of this chapter, it's probably too nuanced of a distinction to bother changing this example, but non-UTF8 in a String is not insta-UB.
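A small sketch of what "safe methods ensure valid UTF-8" looks like in practice. The helper name `try_string` is made up for this example; `String::from_utf8` and `String::from_utf8_lossy` are the standard library constructors.

```rust
/// Safe constructor: validates the bytes and refuses invalid UTF-8.
fn try_string(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}

/// Safe, lossy constructor: invalid sequences become U+FFFD.
fn lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    println!("{:?}", try_string(vec![0x52, 0x75, 0x73, 0x74]));
    println!("{:?}", try_string(vec![0xff, 0xfe]));
    println!("{:?}", lossy(b"a\xffb"));
    // `String::from_utf8_unchecked` also exists, but it is `unsafe`:
    // it skips validation, so upholding the invariant becomes the caller's job.
}
```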
The class of types with a particular memory layout is represented by a
Structures (struct) are explicitly-named types with a common memory layout.
Both structures and enumerations can be defined generically. The compiler monomorphizes these definitions: Each unique combination of type parameters used in the program produces a different type with its own memory layout.
I was going to suggest finding a less overloaded phrase than "memory layout" when I saw the first sentence, because it invokes things like size and alignment in my mind -- properties that differ for a generic type, for each concrete instantiation. Reading further along to the middle and last sentences, though, this needs a little more care overall as you are contradicting yourself.
You use the term some more later on talking about dyn Traits and DSTs which implies "memory layout" does include things like size, so you probably do need two different phrases.
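To make the size point concrete, here is a rough sketch. `Pair` and `Greet` are hypothetical names, and `#[repr(C)]` is used only so the sizes are well defined for the example. The fat-pointer size is what current compilers produce rather than something I'd rely on as a formal guarantee.

```rust
use std::mem::size_of;

/// One generic definition...
#[repr(C)]
struct Pair<T> {
    a: T,
    b: T,
}

trait Greet {
    fn greet(&self) -> String;
}

fn main() {
    // ...but each instantiation is a distinct type with its own size.
    println!("Pair<u8>:  {} bytes", size_of::<Pair<u8>>());
    println!("Pair<u64>: {} bytes", size_of::<Pair<u64>>());

    // A reference to a sized type is a single pointer; a reference to a
    // trait object carries a data pointer plus a vtable pointer.
    println!("&u8:        {} bytes", size_of::<&u8>());
    println!("&dyn Greet: {} bytes", size_of::<&dyn Greet>());
}
```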
Generating a borrow then produces one of two primitive types that array a special kind of generic parameter,
Typo around "array"?
You use this phrase a few times but never define it. I assume it's defined earlier or known contextually. Mentioning this here rather than as a footnote because...
It is important to understand that the seemingly concrete type &'a str, for example, is really a generic type with a free parameter 'a and therefore represents a typeclass. The value of this parameter is some concrete lifetime, attached to a particular activation record.
I guess this is true conceptually? It's not analyzed this way for a function parameter, for example; the particular activation record (binding declaration scope?) doesn't matter. I don't know, something about that sentence just seems off to me, but I'm having a hard time articulating why. I guess it is somewhat covered when you later say:
Each invocation of a function conceptually creates a new, distinct, set of lifetimes.
But I don't know that it paints a very clear picture overall.
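Maybe an example along these lines would help paint the picture. It's only a sketch, and `longest` and `first_word` are illustrative names: the function is generic over `'a`, and each call site picks its own concrete lifetime for it.

```rust
/// `'a` is a generic parameter: the function is checked once, for any `'a`.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() {
        x
    } else {
        y
    }
}

/// The returned slice borrows from `s`, so it may live as long as `s` does.
fn first_word<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let outer = String::from("activation record");
    {
        let inner = String::from("stack");
        // For this call, `'a` is a region that fits inside both borrows,
        // so the result cannot be used once `inner` is gone.
        let l = longest(&outer, &inner);
        println!("{l}");
    }
    // For this call, `'a` can be as long as the borrow of `outer`.
    let word = first_word(&outer);
    println!("{word}");
}
```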
There is only one named lifetime in Rust: 'static
"only one non-generic lifetime" perhaps?
A brief discussion of these APIs will both elucidate how the lifetime system works in practice and provide an introduction to some of Rust's common utilities. These facilities fall mostly into two different categories: shared ownership and interior mutability.
You don't discuss interior mutability after this.
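If you do decide to cover it, a short example might be enough. Here is a sketch using `Rc` for shared ownership and `RefCell` for interior mutability; the function names are made up.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared ownership (`Rc`) plus interior mutability (`RefCell`):
/// several handles to one value, each able to mutate it.
fn shared_counter_demo() -> (usize, i32) {
    let shared = Rc::new(RefCell::new(0));
    let a = Rc::clone(&shared);
    let b = Rc::clone(&shared);

    // Each mutable borrow is checked at run time and released at the
    // end of its statement.
    *a.borrow_mut() += 1;
    *b.borrow_mut() += 41;

    let value = *shared.borrow();
    (Rc::strong_count(&shared), value)
}

/// The borrow rules still hold; they are just enforced dynamically.
fn double_borrow_is_rejected() -> bool {
    let cell = RefCell::new(1);
    let _first = cell.borrow_mut();
    let rejected = cell.try_borrow_mut().is_err();
    rejected
}

fn main() {
    println!("{:?}", shared_counter_demo());
    println!("second mutable borrow rejected: {}", double_borrow_is_rejected());
}
```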
I could probably pick nits around the macro section, but there's no reason to at this high-level.
Thanks for the detailed notes; those are all definitely things that I should fix.
In any other venue, I'd probably just say "stack frame" instead, but "activation record" seems preferred around here. It refers to the area of memory where a function stores its local variables, even if that happens to be on the heap somewhere.
Oops; that's a big catch-- Now to decide if I write a few paragraphs about interior mutability or remove the mention...
I should probably rewrite or cut this entirely. The intention was to talk about stability guarantees: User code can only rely on explicitly declared properties, so the implementor is free to change anything else without breaking users' code. These properties can be either declared or withheld in a pretty fine-grained way, which lets the implementor fine-tune the design space they have to work in.
I need to clean this up, but "outside" is correct: The point I'm trying to make here is that violation of the guarantees often appears to work, but can suddenly break with seemingly-unrelated changes.
Here are my notes
Presented with a situation outside these guarantees, they will often behave as the programmer might expect. When operating outside these formal guarantees, however, small external changes may cause a program to produce incorrect results without warning.
The contrast seems misplaced; I believe you intend something like "...outside the spec the code can often behave as expected; however, not always. Small external changes..."
Generic programming in Rust relies on traits,
Is that true? Is Vec<T> not considered generic? In Haskell, type classes are “ad hoc” polymorphism; in other words, another source of polymorphism.
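To illustrate what I mean, here is a sketch with hypothetical names. The first two items are generic without mentioning any trait (strictly speaking there is still an implicit `Sized` bound), while the third needs trait bounds before it can do anything with `T`.

```rust
/// Generic with no explicit trait bound: works for every sized `T`.
struct Wrapper<T> {
    value: T,
}

impl<T> Wrapper<T> {
    fn new(value: T) -> Self {
        Wrapper { value }
    }

    fn into_inner(self) -> T {
        self.value
    }
}

/// Also unbounded: it only moves values around, never inspects them.
fn swap_pair<A, B>(pair: (A, B)) -> (B, A) {
    (pair.1, pair.0)
}

/// Comparing or copying a `T` requires trait bounds.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let mut best = iter.next()?;
    for item in iter {
        if item > best {
            best = item;
        }
    }
    Some(best)
}

fn main() {
    println!("{}", Wrapper::new(5).into_inner());
    println!("{:?}", swap_pair((1, "a")));
    println!("{:?}", largest(&[3, 9, 4]));
}
```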
in the last section: bounds for expressing relationships between types
In general, in this section, am I missing where you articulate type validation? What I mean is types such as MyType, String, u32, etc.; this type system is separate from traits. I read “bounds” as a trait-specific requirement for a given type. But it does not include the types themselves.
Great work, I think it's a very useful and clear summary of Rust.
There's a minor typo in the last paragraph of section 2.1.3: "Jung, et. al." should be "Jung, et al."
In the first paragraph of Section 2.2.4, I think you could already emphasize that the difference between GC and lifetimes is that lifetimes are purely compile-time. This is hinted at later, but I think it would be a useful and important addition to the few introductory sentences too.
After section 2.2.5, where you go into what 'static is and is not, you could elaborate on this problem a bit more, perhaps also stressing that there are two kinds of lifetime annotations, &'a T and T: 'a, and that those mean very different things. I think even the existence of these two – conceptually, if not technically – distinct kinds of lifetimes is not yet apparent from the preceding description of the lifetime system.
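A sketch of the distinction, with made-up names (`borrow_first`, `describe_owned`): `&'a T` is a reference that is valid for `'a`, while `T: 'static` is a bound saying that `T` contains no borrows shorter than `'static`. An owned `String` satisfies that bound even though the value itself is short-lived.

```rust
use std::fmt::Debug;

/// `&'a T` in a signature: the result borrows from `items` and is valid
/// for at most the lifetime `'a`.
fn borrow_first<'a, T>(items: &'a [T]) -> Option<&'a T> {
    items.first()
}

/// `T: 'static` as a bound: `T` must not contain any non-'static borrow.
/// It does NOT mean the value has to live for the whole program.
fn describe_owned<T: Debug + 'static>(value: T) -> String {
    format!("{value:?}")
}

fn main() {
    let numbers = vec![1, 2, 3];
    println!("{:?}", borrow_first(&numbers));

    // Owned data satisfies `T: 'static` and is dropped inside the call.
    println!("{}", describe_owned(String::from("hi")));
    println!("{}", describe_owned(42));
    // A string literal is a `&'static str`, so it qualifies as well.
    println!("{}", describe_owned("literal"));

    // By contrast, `describe_owned(&numbers)` would be rejected:
    // `&Vec<i32>` borrowed from a local is not `'static`.
}
```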
There are two – I think – superfluous capital letters in the last section: "[…] must be entirely self-contained: There is no mechanism […]" and "[…] exploited by the optimizer: Multiple reads from […]".
I think macros by example also go by the easier to conjugate name of "declarative macros"; you could consider using that if you ever find "macros by example" too long or elaborate to use in some contexts.
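In case a tiny example of a declarative macro is useful for that section, here is one sketch. `max_of!` is an invented macro, not something from the standard library.

```rust
/// A declarative macro (macro by example): each arm pairs a pattern with a
/// template, and the repetition `$(...),+` matches one or more expressions.
macro_rules! max_of {
    // Base case: a single expression is its own maximum.
    ($x:expr) => {
        $x
    };
    // Recursive case: compare the first expression with the max of the rest.
    ($x:expr, $($rest:expr),+) => {{
        let a = $x;
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

fn main() {
    println!("{}", max_of!(3, 7, 5));
    println!("{}", max_of!(1.5, 0.5));
}
```

The expansion happens entirely at compile time, which is why it fits a section about compile-time computation.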
That's a very good idea; I should probably be explicit about the different forms of bound and what they mean:
Hopefully I can do that without diving into the rabbit hole that are HRTBs.
The style guide I'm following treats ":" as an end-of-sentence mark, equivalent to
I allude to this a few times, but should probably say it explicitly somewhere. The last section is intended to be a discussion of the various ways to perform arbitrary computation at compile time, so I don't know if that's the right place. On the other hand, I talk about the optimizer there, so maybe the focus has shifted a bit since I started writing it.
I would just comment generally on the lack of references; I'm not sure what level you're writing at. A lot of the comments I have are related to unsourced general statements you are making.
For example I find it questionable that it's entirely accurate to describe Rust as a "relative newcomer" without being more specific as to why that's important in the absence of a supporting reference. Etcetera, etcetera throughout the document.
The lack of references could certainly be an issue, but my advisor doesn't seem to be too concerned about it. Probably because this chapter is general background information, and not really part of the thesis' core argument. I'll spend some time hunting them down over the next few weeks, after the rest of the first draft is complete.
I've submitted early research proposals in which every line was referenced so I may be on the far end of the scale.
The "for further background on this subject, please see my other writing" is definitely an evergreen.