I wrote Sizedness in Rust because... well, to borrow the intro from the article itself:
Sizedness is lowkey one of the most important concepts to understand in Rust. It intersects a bunch of other language features in often subtle ways and only rears its ugly head in the form of "x doesn't have size known at compile time" error messages which every Rustacean is all too familiar with. In this article we'll explore all flavors of sizedness from sized types, to unsized types, to zero-sized types while examining their use-cases, benefits, pain points, and workarounds.
Please let me know if you find anything confusing, unclear, or inaccurate! Your feedback is very important and helps me improve the article. Thanks!
Nice writeup. Here are three areas that you may wish to revise:
a) One statement early in your writeup could use some elaboration, or at least an asterisk to indicate that it is not completely correct. The following statement does not take into account field alignment constraints, which may trigger inter-field padding and thus increase the size of the struct beyond the sum of the sizes of the struct's constituent elements.
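To make the padding point concrete, here is a minimal sketch. The struct names `Padded` and `Reordered` are hypothetical and do not come from the article. `#[repr(C)]` is used on the first one only so that the fields stay in declaration order; with the default representation the compiler is free to reorder fields, so the second size is not guaranteed.

```rust
use std::mem::size_of;

// Hypothetical struct for illustration. repr(C) keeps the fields in
// declaration order, so padding is inserted before `b` and after `c`.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    a: u8,
    b: u32,
    c: u8,
}

// Same fields with the default representation. The compiler may reorder
// them to reduce padding, so its exact size is unspecified.
#[allow(dead_code)]
struct Reordered {
    a: u8,
    b: u32,
    c: u8,
}

// Naive "sum of the fields" size: 1 + 4 + 1 bytes.
fn field_size_sum() -> usize {
    size_of::<u8>() + size_of::<u32>() + size_of::<u8>()
}

// Actual size of the repr(C) struct, padding included.
fn padded_size() -> usize {
    size_of::<Padded>()
}

// Actual size of the default-layout struct.
fn reordered_size() -> usize {
    size_of::<Reordered>()
}
```

On targets where `u32` is 4-byte aligned, `padded_size()` should come out larger than `field_size_sum()`, which is the gap the asterisk would need to cover.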
b) The following rationale misses the critical implementation constraint that drives the specification:
The reason that the unsized field must be the last field is that it is the only location in the struct that permits the compiler to determine the starting offset of each field at compile time. That same rationale precludes having two unsized fields in the struct, as one of the two fields would not be "last".
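A small sketch of that rule, using a hypothetical `Packet` struct that is not taken from the article: the sized fields come first at offsets known at compile time, and the single possibly-unsized field comes last.

```rust
// Hypothetical struct for illustration. `id` and `len_hint` sit at
// offsets the compiler can compute without knowing how long `payload` is.
#[allow(dead_code)]
struct Packet<T: ?Sized> {
    id: u32,
    len_hint: u16,
    payload: T, // the only field allowed to be unsized, so it goes last
}

// Works on the unsized form; the length travels in the fat reference.
fn payload_len(packet: &Packet<[u8]>) -> usize {
    packet.payload.len()
}

fn demo_len() -> usize {
    // Build a sized value first ...
    let sized = Packet { id: 7, len_hint: 4, payload: [1u8, 2, 3, 4] };
    // ... then let the compiler coerce the reference to the unsized form.
    let unsized_ref: &Packet<[u8]> = &sized;
    payload_len(unsized_ref)
}
```

Moving `payload` above `id` in the declaration is rejected by the compiler, because the offset of `id` would then depend on a length that is only known at run time.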
c) This statement is probably inaccurate:
It is almost certain that the compiler optimizes the instances away completely, resulting in no generated code at all rather than any no-op instructions, each of which would occupy at least 1 byte of code space.
Thanks for the article. It's good to spell all this out as it's a confusing area of Rust, in that what's going on behind the scenes is not that obvious.
If you're talking about difficulties of unsized types, it may be worth mentioning the problem with Option. For example, this doesn't compile: struct A<T: ?Sized>(Option<T>); yet this does: struct A<T: ?Sized>((bool, T));. There's no workaround in safe Rust that I'm aware of. In unsafe Rust I think you can use ManuallyDrop to implement your own Option.
Another tip for DST handling is working around the lack of stable CoerceUnsized. The trick is to get it into the form of a Box<something>, coerce that into a Box<dyn something> or Box<unsized-something>, and then turn that into a raw pointer to do what you want with. To drop it, turn it back from a raw pointer into a Box. That's handy for implementing your own Rc. See a minimal Rc implementation here. Probably I'm not describing this using all the right terms. Slowly I'm building an intuition about how to get the compiler to do what I want with DSTs, but it's so easy to get into a muddle about it all.
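The Box round trip described above can be sketched roughly as follows. This is not the linked Rc implementation; the function names `into_raw_dyn` and `describe_and_free` are made up for illustration, and `Debug` simply stands in for whatever trait you need.

```rust
use std::fmt::Debug;

// Step 1 and 2: box the sized value, let the compiler coerce the Box to
// the unsized form, then give up ownership as a raw fat pointer.
fn into_raw_dyn<T: Debug + 'static>(value: T) -> *mut dyn Debug {
    let boxed: Box<T> = Box::new(value);
    let unsized_box: Box<dyn Debug> = boxed; // unsizing coercion happens here
    Box::into_raw(unsized_box)
}

// Step 3: turn the raw pointer back into a Box so the value is dropped
// and the allocation is freed when the Box goes out of scope.
//
// Safety: `ptr` must come from `into_raw_dyn` and must not have been
// freed or converted back already.
unsafe fn describe_and_free(ptr: *mut dyn Debug) -> String {
    let boxed: Box<dyn Debug> = unsafe { Box::from_raw(ptr) };
    format!("{:?}", boxed)
}
```

A call such as `unsafe { describe_and_free(into_raw_dyn(vec![1, 2, 3])) }` formats the value through the vtable and then frees it. Between the two calls the raw pointer can be stored wherever a hand-written smart pointer needs it.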
I try to target my posts toward Rust beginners working within safe stable Rust, so while writing a custom Option DST in user code is a very interesting exercise, it's also a very advanced one that falls outside the scope of the article.
Okay, fine. I found your article interesting all the same. Reviewing the first principles is helpful. Unfortunately the most interesting uses of DSTs require unsafe at the moment. Those kinds of uses of DSTs can become very complex very quickly, and get into corners of the language that are not so well developed, so I guess you have to choose where to draw a line.
You write that a slice is a "double-width pointer to a dynamically sized view into some array". This is not quite correct: A slice is a type [T], whereas &[T] is a reference to a slice. Slices aren't pointers. See this SO question and the reference for more details.
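The distinction shows up when you measure things. In this sketch (the helper function names are just for illustration), the references and boxes to slices are the double-width pointers, while the bare slice itself is only the elements.

```rust
use std::mem::{size_of, size_of_val};

// A reference to a sized type is a single pointer.
fn thin_ref_size() -> usize {
    size_of::<&u8>()
}

// A reference to a slice carries a pointer plus a length.
fn slice_ref_size() -> usize {
    size_of::<&[u8]>()
}

// A boxed slice is also pointer plus length.
fn boxed_slice_size() -> usize {
    size_of::<Box<[u8]>>()
}

// The bare slice `[u32]` behind the reference is just its elements;
// its size is only known at run time, via the reference's length.
fn bare_slice_bytes(slice: &[u32]) -> usize {
    size_of_val(slice)
}
```

Note that `size_of::<[u8]>()` does not compile at all, which is the "doesn't have a size known at compile time" error from the article.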
Personally I agree with you that it should be that way, but that's not how TRPL uses the term:
A string slice is a reference to part of a String
This slice has the type &[i32].
I usually try to say "bare slice" when talking about stuff like [T] and str, and "slice reference" / "boxed slice" / etc. otherwise, to avoid confusion. But there is plenty of precedent for using "slice" to mean a reference.
My view is that, because Rust slices exist only as references, [T] is really a conceptual extent (subset) of an array object. One cannot create or manipulate a bare [T], but only describe/delimit it via some form of slice reference such as &[T], &mut [T], or Box<[T]>.
For me it's not that one "cannot directly manipulate a DST value", but more fundamentally that one cannot describe a DST value without a fat pointer that specifies both starting address and extent. Thus any [T] must be derived from such a DST by adjusting the starting address or extent (or both) of such a fat pointer to delimit a subset of the allocation from which it is derived.
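As a rough illustration of "adjusting the starting address or extent", the sketch below takes a sub-slice and reports how far its start moved and how long it is. The helper names `offset_and_len` and `demo_subslice` are hypothetical.

```rust
use std::mem::size_of;

// Compare the two fat pointers: how many elements the start address
// moved, and what length the sub-slice carries.
// Assumes `sub` really was derived from `whole`.
fn offset_and_len(whole: &[u32], sub: &[u32]) -> (usize, usize) {
    let byte_offset = sub.as_ptr() as usize - whole.as_ptr() as usize;
    (byte_offset / size_of::<u32>(), sub.len())
}

fn demo_subslice() -> (usize, usize) {
    let array = [10u32, 20, 30, 40, 50];
    let whole: &[u32] = &array; // start of the array, length 5
    let sub = &whole[1..4]; // start moved by one element, length 3
    offset_and_len(whole, sub)
}
```

Nothing is copied here: both references describe parts of the same five-element array, and only the pointer and length fields differ.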
Ah, right! My origins are from C++, where the vtbl pointer(s) are carried in the struct layout itself. This is, of course, a fundamental difference between C++ and Rust -- C++ imbues structs with dynamic properties (virtual methods, etc.) if you "bless" them with a virtual method; Rust keeps the struct and its dynamic behaviors nicely distinct.
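A quick way to see that difference from the Rust side, using a hypothetical `Shape` trait and `Rect` struct: implementing the trait does not add anything to the struct, and the vtable pointer only appears in the trait-object reference.

```rust
use std::mem::size_of;

// Hypothetical trait and struct for illustration.
trait Shape {
    fn area(&self) -> u32;
}

struct Rect {
    w: u32,
    h: u32,
}

impl Shape for Rect {
    fn area(&self) -> u32 {
        self.w * self.h
    }
}

// Just the two u32 fields; no vtable pointer is stored in the struct.
fn rect_size() -> usize {
    size_of::<Rect>()
}

// A plain reference to the concrete type is a single pointer.
fn thin_ref_size() -> usize {
    size_of::<&Rect>()
}

// A trait-object reference is a data pointer plus a vtable pointer.
fn dyn_ref_size() -> usize {
    size_of::<&dyn Shape>()
}

// Dynamic dispatch goes through the vtable carried by the reference.
fn area_via_dyn(shape: &dyn Shape) -> u32 {
    shape.area()
}
```

Passing `&Rect { w: 3, h: 4 }` to `area_via_dyn` coerces the thin reference into a fat one at the call site, so the same `Rect` value can be used with or without dynamic dispatch.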