The Confessional Thread: Parts of Rust that I still don't get after all this time

I don't touch Higher-Ranked Trait Bounds, otherwise known as for<'a> syntax. I'm not sure I'll ever quite get that.
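
For anyone else in the same boat, here is a minimal sketch of where `for<'a>` earns its keep. The names `apply_to_local` and `trim_it` are made up for illustration. The bound says the callback must work for every lifetime, including that of a value the caller never sees.

```rust
// Hypothetical helper: `f` must accept a `&str` of *any* lifetime 'a and
// return a `&str` tied to that same lifetime. That is what `for<'a>` states.
fn apply_to_local<F>(f: F) -> usize
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    // `local` only lives inside this function, so the caller has no way to
    // name its lifetime. The higher-ranked bound is what lets `f` borrow it.
    let local = String::from("  hello  ");
    f(&local).len()
}

// An ordinary function with elided lifetimes already satisfies the bound.
fn trim_it(s: &str) -> &str {
    s.trim()
}
```

Calling `apply_to_local(trim_it)` measures the trimmed local string. Writing the bound as `F: Fn(&str) -> &str` is shorthand for the same higher-ranked bound, which is probably why the explicit syntax rarely shows up in everyday code.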

11 Likes

I have been using Rust for a couple of years. I have managed to build a simulator for our future product in somewhat less than 10K LOC without learning macros and almost never needing explicit lifetimes.

Macros look like they are written in an entirely different language. I like learning new languages, but people around here expect me to get actual work done. Fortunately, I haven't found a place where writing a macro would be a big improvement.

It seems to me that putting a reference into a struct is more contagious than COVID-19. Everything that struct touches needs explicit lifetimes until I've got more ' marks in my code than any other character. Not putting references in structs requires more clone operations than I would like, but that hasn't been a problem yet.
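
A small sketch of that contagion, with made-up types (`Config`, `Simulator` and their owned twins are assumptions, not anything from my real code): one borrowed field forces a lifetime parameter onto every type that contains it, while the owned version trades that for an allocation.

```rust
// Borrowing version: the reference in `Config` needs a lifetime parameter...
struct Config<'a> {
    name: &'a str,
}

// ...and so does every struct that holds a `Config`.
struct Simulator<'a> {
    config: Config<'a>,
}

impl<'a> Simulator<'a> {
    fn name(&self) -> &'a str {
        self.config.name
    }
}

// Owning version: no lifetime parameters anywhere, at the cost of a clone
// (or a move) when the value is created.
struct OwnedConfig {
    name: String,
}

struct OwnedSimulator {
    config: OwnedConfig,
}

impl OwnedSimulator {
    fn name(&self) -> &str {
        &self.config.name
    }
}
```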

3 Likes

I am still incredibly confused by the following topics:

Undefined behavior

This is really the meat of my rant. Undefined behavior defies explanation, I think, by its very nature. At least the concept defies any attempt I make to internalize it. Being human, I want to believe that UB is fine as long as some-arbitrary-condition is met. I know this is untrue, but I still cannot come to terms with it.

Along with UB is the concept of unsoundness, which is (as far as I know) mostly unrelated but just as serious. Here's a cool cheatsheet!

Which segues into ...

unsafe

Because unsafe allows innumerable ways to invoke undefined behavior, I too have learned to avoid it like the plague (very much the same as mentioned earlier by notriddle). I have even gone as far as making all of my crates #[forbid(unsafe_code)].

When I read articles like you can't turn off the borrow checker, the arguments made for unsafe do give me that warm-and-fuzzy feeling for a fleeting moment. Then I remind myself just how awful it is to write anything at all in C, and it scares me back to reality.

I don't think this is necessarily a problem, but I'm sure it will cause me to miss out on something important some day...

Lifetimes

Reading about lifetimes makes my head hurt.

Pin

This is another tricky subject that is deceptively simple! The devil is in the details. I can't even follow along with most conversations involving Pin beyond the very basics. Topics like Pin projection and unsoundness when combined with DerefMut blew my mind. There's just no other way to put it.

All the parts of Rust that I haven't found a need for in my day-to-day

5 Likes

I'm lucky enough that I understand most of the tricky parts of Rust other people stumble on. I guess it comes from an academic background and a decent intuition.

But the type-level magic that frunk accomplishes, I don't think I'll ever fully understand. There are some great blog posts about type-level recursion and the black arts powering frunk. I love reading these; it's a fun walk through (ab)using the type system to do awesome things.

But I don't think I'll ever be writing something as involved as frunk, diesel, or libp2p. I've written some absurdly generic code on a smaller scale, but the scale at which the above crates operate is an order of magnitude beyond where I can struggle through compiler errors under my own power.

Sure, I can probably wrap my head around it after the fact, but writing it is another story. (Let alone extending it in useful ways for writing generic APIs.)

I'll be sticking to my needlessly involved (proc) macros over needlessly involved trait impls, for the time being.

I thought up a clever qotw bait one liner to stick in here that prompted me to actually write it then forgot it while writing the post in favor of being genuine... whoops

12 Likes

Most of the stuff cited so far I "get" at least on a basic intuition level, but a lot of that is because I don't use Rust in my day job and encounter all of these concepts only via posts/articles designed to explain them in detail, rather than from the "this is why your code won't compile" direction that's typical on this forum. Plus, quite a few of them are familiar from other languages (e.g. I know how "variadic templates" work in C++, so "variadic generics" explains itself).

But there's one exception, which nobody else mentioned yet:

Hygiene.

For those who've never even heard that term: I know it has something to do with whether, for example, in let x = 2; magic!(x); the macro is allowed to generate code messing with the same x variable, or whether it'll end up operating on some other name. I suppose it's like the macro-expansion-time equivalent of variable/name scope? Ish.
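
As far as I understand it, for `macro_rules!` and local variables it plays out roughly like the sketch below. The macro names are invented for illustration. A variable the macro declares itself is distinct from a same-named variable at the call site, but a name the caller passes in refers to the caller's variable.

```rust
macro_rules! shadow_inside {
    ($e:expr) => {{
        // This `x` is introduced by the macro, so it lives in the macro's
        // own syntax context and does not capture the caller's `x`.
        let x = 100;
        // `$e` was written at the call site and still sees the caller's `x`.
        $e + x
    }};
}

macro_rules! double_in_place {
    // The identifier comes from the caller, so it names the caller's variable.
    ($name:ident) => {
        $name *= 2;
    };
}

fn hygiene_demo() -> i32 {
    let x = 2;
    shadow_inside!(x * 10) // (2 * 10) + 100
}

fn passing_the_name() -> i32 {
    let mut x = 2;
    double_in_place!(x);
    x
}
```

Here `hygiene_demo()` returns 120 and `passing_the_name()` returns 4.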

4 Likes

Implementing future Primitives

I've been watching the whole async-await thing for a while now, but still don't know how you'd go about creating the fundamental building blocks.

For example, something that'd be really cool is to write embedded code which can await until a certain input reaches a desired state. That'd massively improve the readability for code on microcontrollers, but I have no idea where I'd even begin with implementing it...
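
A rough sketch of what such a building block might look like, using only the standard library. Everything here is an assumption for illustration: `InputPin` is a stand-in for a real GPIO register, and `block_on` is a toy busy-polling executor, not what an embedded runtime would use. A real implementation would register the waker with an interrupt instead of asking to be polled again immediately.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for a hardware input line.
struct InputPin {
    level: AtomicBool,
}

/// A hand-written future that completes once the input reads high.
struct WaitForHigh<'a> {
    input: &'a InputPin,
}

impl Future for WaitForHigh<'_> {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.input.level.load(Ordering::Acquire) {
            Poll::Ready(())
        } else {
            // Nothing will wake us in this sketch, so request another poll.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// The readable part: code that simply awaits the desired state.
async fn wait_then_report(input: &InputPin) -> &'static str {
    WaitForHigh { input }.await;
    "input is high"
}

/// Waker that does nothing; the toy executor below polls in a loop anyway.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls the future until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::hint::spin_loop();
    }
}
```

The fundamental pieces are just the `Future::poll` method, which reports `Ready` or `Pending`, and a `Waker`, which tells the executor when polling again is worthwhile.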

Pin

Couldn't have put it better myself.

I was just bitten by this when I thought Pin<Box<str>> would magically ensure my Box<str> doesn't change. However because of impl<T> Unpin for Box<T>, you can swap out the Box<str> and leave some unsafe code with dangling pointers.
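
A minimal sketch of the surprise, with a made-up function name. `Pin` only restricts access when the pointee type is not `Unpin`, and `str` is `Unpin`, so safe code can both replace the pinned box and unwrap it again.

```rust
use std::pin::Pin;

// Hypothetical demo: returns the old contents and the new contents.
fn replace_pinned_str() -> (String, String) {
    // `Pin::new` is only available because `str: Unpin`.
    let mut pinned: Pin<Box<str>> = Pin::new(Box::<str>::from("first"));

    // Safe code with `&mut Pin<Box<str>>` can put a different box in its place.
    let old: Pin<Box<str>> =
        std::mem::replace(&mut pinned, Pin::new(Box::<str>::from("second")));

    // And because `str: Unpin`, the pin can be removed entirely.
    let unpinned: Box<str> = Pin::into_inner(old);

    // Any raw pointer that unsafe code saved to the first allocation would
    // dangle once `unpinned` is dropped.
    (unpinned.into_string(), String::from(&*pinned))
}
```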

Advanced Type-Level Shenanigans

I know Rust's type system is Turing complete, but seeing crates like typenum or the previously mentioned frunk still blows my mind.

I remember reading through an article where the author implemented brainfuck at compile time and things like using traits to iterate over type-level zipper lists blew my mind.
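
For a taste of the basic trick, here is a tiny sketch of type-level Peano arithmetic. It is far simpler than what typenum or frunk actually do, and all the names (`Zero`, `Succ`, `Nat`, `TypeAdd`) are my own for illustration. The addition happens entirely during type checking by recursing through trait impls.

```rust
use std::marker::PhantomData;

// Natural numbers encoded as types: Zero, Succ<Zero>, Succ<Succ<Zero>>, ...
struct Zero;
struct Succ<N>(PhantomData<N>);

// Reads a type-level number back as an ordinary constant.
trait Nat {
    const VALUE: usize;
}

impl Nat for Zero {
    const VALUE: usize = 0;
}

impl<N: Nat> Nat for Succ<N> {
    const VALUE: usize = N::VALUE + 1;
}

// Type-level addition, defined by recursion on the left operand.
trait TypeAdd<Rhs> {
    type Sum;
}

// 0 + Rhs = Rhs
impl<Rhs> TypeAdd<Rhs> for Zero {
    type Sum = Rhs;
}

// (N + 1) + Rhs = (N + Rhs) + 1
impl<N: TypeAdd<Rhs>, Rhs> TypeAdd<Rhs> for Succ<N> {
    type Sum = Succ<<N as TypeAdd<Rhs>>::Sum>;
}

type One = Succ<Zero>;
type Two = Succ<One>;
type Three = Succ<Two>;
type Five = <Two as TypeAdd<Three>>::Sum;
```

`<Five as Nat>::VALUE` evaluates to 5 without any addition running at run time.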

Concurrent Data Structures

Writing concurrent data structures feels like black magic. How can you possibly reason about something when another bit of code might change pointers out from under you at any time?
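
One common answer, as far as I can tell, is that lock-free code never assumes a value stayed put. It reads, decides, and then publishes with compare-and-swap, which only succeeds if nobody changed the value in between. A minimal sketch on a plain integer rather than a pointer (the function name `store_max` is made up):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Raises `cell` to `candidate` if `candidate` is larger, even while other
/// threads are doing the same thing concurrently.
fn store_max(cell: &AtomicUsize, candidate: usize) {
    let mut seen = cell.load(Ordering::Relaxed);
    while candidate > seen {
        // Succeeds only if `cell` still holds the value we last saw.
        match cell.compare_exchange_weak(seen, candidate, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(_) => break,
            // Someone else changed it: retry against the fresh value.
            Err(actual) => seen = actual,
        }
    }
}
```

Real concurrent data structures apply the same retry loop to pointers, and then have to deal with memory reclamation on top, which is where it gets much harder.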

1 Like

Stuff I completely don't understand despite multiple tries:

  • Higher ranked trait bounds (and kinds in general), GATs
  • Executors, tasks, and how they relate to futures
  • Object safety: when can a trait not be used for trait objects?
  • Pin (specifically how it actually works)
  • Default match bindings: I still have no idea why the compiler chooses to borrow sometimes and move other times.
  • The structure of Cargo.toml (apart from dependencies)
  • Drop order: when do things actually get dropped... so many deadlocks caused by messing this up. Usually I end up adding extra curly braces everywhere to make sure they are dropped.
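
On the drop order point, a small sketch that records when each value is dropped. The `Noisy` type is invented for illustration. Locals are dropped in reverse declaration order at the end of their block, and `let _ = ...` does not bind the value at all, so it is dropped at the end of that statement.

```rust
use std::cell::RefCell;

// Hypothetical helper that logs its name when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Noisy { name: "a", log: &log };
        let _b = Noisy { name: "b", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
        } // `_inner` is dropped here, at the end of the extra braces.

        // `_` is not a binding, so this value is dropped right away.
        let _ = Noisy { name: "underscore", log: &log };

        let c = Noisy { name: "c", log: &log };
        drop(c); // Explicit drop, an alternative to extra braces.
    } // Remaining locals are dropped in reverse order: `_b`, then `_a`.
    log.into_inner()
}
```

The same rules apply to lock guards: `let _guard = mutex.lock()` holds the lock until the end of the block, while `let _ = mutex.lock()` releases it immediately.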

Stuff I understand conceptually (I think), but never seem to work as expected when I use them:

  • Trait objects
  • Many of the iterator combinators (I can't seem to combine them properly except for map and filter)
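
For the combinators, a short sketch of a chain that goes one step beyond map and filter. The function `summarize` is made up. It keeps the index and value of every line that parses as a number, then sums the values.

```rust
fn summarize(lines: &[&str]) -> (Vec<(usize, i32)>, i32) {
    let parsed: Vec<(usize, i32)> = lines
        .iter()
        // Pair every line with its position.
        .enumerate()
        // `filter_map` is filter and map in one step: lines that fail to
        // parse become `None` and are skipped.
        .filter_map(|(i, line)| line.trim().parse::<i32>().ok().map(|n| (i, n)))
        .collect();

    // A second pass over the collected pairs, keeping only the numbers.
    let total: i32 = parsed.iter().map(|&(_, n)| n).sum();
    (parsed, total)
}
```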

Stuff I've never really tried:

  • Proc macros
  • async/await
4 Likes
  • Cow
  • Pin and its mystical-magical macro wrappers (all 8 of those crates).
1 Like

Each time I see someone ask about async and state machines, I'm always all like "omg!, omg!, omg!, I can explain this so clearly, because I've been working so much with libevent. See, you know those complicated state-machines you always have to implement in read and write callbacks? Well, with async you don't need to do that any more!".

... and that's when I realize most people haven't actually used libevent, so then I realize I have to explain that as well and then I realize I'm about to hold a seminar about libevent just to explain why async is so fantastic.

Hidden for readability

A few weeks ago I started writing a blog post titled something akin to "non-blocking I/O with libevent, state-machines and the async future". The whole point is to explain why one ends up writing (annoying) state machines with non-blocking I/O, and how async helps immensely. When time permits I'll finish that, but until then...

Say you have a reader callback function that is called each time you've got new data from the network. Your protocol is initially line-based, and the first line is a special command, and the following lines are parameters related to that command. And the command is terminated by an empty line (sort-of like HTTP). Your reader will need to read lines as it gets new data, return to the dispatcher if it needs more data and when it enters next time keep track of which line it is on (is it the first command line, or a parameter line?). Also, another fun thing one sometimes forgets is that the reader can be fed multiple lines at once, so one can't simply process a line and then return to the dispatcher (unless it's a level-triggered callback), so one needs to keep iterating over complete lines within the read callback. I've written plenty of these handlers over the years, and I still managed to make such a mistake just the other day.

And now let's say that after the line-based protocol you can get binary chunks, and each chunk can determine what type of binary chunk comes next -- or if it should return to the line based protocol again. You may even end up needing to keep a stack of states (for nested protocols).

State machines are beautiful tools for these kinds of things, but once you've done a few of them you realize how error-prone they can be when you make them manually and new people fiddle around in them without properly understanding all the state transitions. (Somewhat helps to implement states as functions and function pointers rather than enums.. But still not perfect).

Aaaaanyway, async helps in the sense that it builds those state machines you'll inevitably end up implementing anyway, but does so implicitly. While that's a big part of it, I think what people don't realize is that it also helps protect against common traps like the one I mentioned above, where people return too soon from a callback. That, to me at least, is almost as important, because those types of problems can stay hidden for a while.
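
To make the manual version concrete, here is a rough sketch of the kind of hand-written state machine described above, for the line-based part only. It does not use libevent, and `LineReader`, `State`, and `Request` are names invented for this example. Note the `while let` loop, which keeps consuming complete lines instead of returning after the first one.

```rust
#[derive(Debug, PartialEq)]
enum State {
    Command,
    Params,
}

#[derive(Debug, PartialEq)]
struct Request {
    command: String,
    params: Vec<String>,
}

struct LineReader {
    state: State,
    buf: Vec<u8>,
    command: String,
    params: Vec<String>,
}

impl LineReader {
    fn new() -> Self {
        LineReader {
            state: State::Command,
            buf: Vec::new(),
            command: String::new(),
            params: Vec::new(),
        }
    }

    /// Called from a (hypothetical) read callback with whatever bytes arrived.
    /// Returns every request completed by this chunk of data.
    fn feed(&mut self, data: &[u8]) -> Vec<Request> {
        self.buf.extend_from_slice(data);
        let mut done = Vec::new();
        // One callback may deliver several lines, so loop until only a
        // partial line (or nothing) is left in the buffer.
        while let Some(pos) = self.buf.iter().position(|&b| b == b'\n') {
            let line_bytes: Vec<u8> = self.buf.drain(..=pos).collect();
            let line = String::from_utf8_lossy(&line_bytes).trim_end().to_string();
            match self.state {
                State::Command => {
                    self.command = line;
                    self.state = State::Params;
                }
                // An empty line terminates the command.
                State::Params if line.is_empty() => {
                    done.push(Request {
                        command: std::mem::take(&mut self.command),
                        params: std::mem::take(&mut self.params),
                    });
                    self.state = State::Command;
                }
                State::Params => self.params.push(line),
            }
        }
        done
    }
}
```

All the fields exist only to remember where the parser was when it last ran out of data. In an async version that bookkeeping would live in the compiler-generated future instead.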

Welp, sans examples that's essentially what I wanted to say in that blog post.

3 Likes

Never before have I read so much about something and literally not understood a single word of it. At this point I'm convinced that there's a protocol incompatibility between the learning center of my brain and the fundamental concept of GATs.

2 Likes

Since it seems multiple people have a hard time understanding GATs, I'll post a link to what helped me get them to click.

https://github.com/rust-lang/rfcs/blob/master/text/1598-generic_associated_types.md#background-what-is-kindedness

The RFC goes into the specifics about how it'd be designed, but the linked section explains what it is.
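
For a concrete use, the example that usually gets cited is a "lending iterator", where each item borrows from the iterator itself. The sketch below assumes a compiler with stable GATs (Rust 1.65 or later). The trait and the `WindowsMut` type are written out here for illustration, not taken from the RFC text.

```rust
// The associated type takes its own lifetime parameter: that is the GAT.
trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

/// Overlapping mutable windows over a slice. A normal `Iterator` cannot
/// express this, because two items would alias the same elements.
struct WindowsMut<'s, T> {
    slice: &'s mut [T],
    start: usize,
    size: usize,
}

impl<'s, T> LendingIterator for WindowsMut<'s, T> {
    // Each item borrows from `self` for exactly as long as 'a.
    type Item<'a> = &'a mut [T] where Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        let end = self.start.checked_add(self.size)?;
        let window = self.slice.get_mut(self.start..end)?;
        self.start += 1;
        Some(window)
    }
}

/// Hypothetical usage: add 1 to every element of every window.
fn bump_each_window(data: &mut [i32], size: usize) {
    let mut windows = WindowsMut { slice: data, start: 0, size };
    while let Some(window) = windows.next() {
        for x in window.iter_mut() {
            *x += 1;
        }
    }
}
```

Each window must be released before `next` can be called again, which the borrow checker enforces through the `'a` on `Item<'a>`.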

I personally find them fascinating though.

8 Likes

That is all far too "meta meta" for me. Hope I never have to read any code that uses such things.

I confess: the baggage of knowledge from other languages can be counterproductive. I didn't know about "variable name shadowing". In every other language I have learned, this would be a clear amateur typing error. After months of Rust coding I learned to love shadowing, but I can't shake the feeling that it can produce nasty bugs. I still don't get why we don't have a special syntax for shadowing, something different from the initializer "let". Maybe "lets" could be short for "let shadowing". It would calm my soul to be able to express my intent to shadow explicitly. No nasty bug possible that way. Thanks for listening to my confession.
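
To illustrate both sides, a small sketch with made-up function names. The first function shows the use that grows on people: reusing one name as a value is refined. The second shows the kind of slip that causes the worry: a `let` inside a block creates a new variable instead of assigning to the outer one.

```rust
fn parse_port(input: &str) -> Option<u16> {
    // Same name, new binding: the untrimmed string is no longer reachable.
    let input = input.trim();
    // Same name again, now with a different type.
    let input: u16 = input.parse().ok()?;
    Some(input)
}

fn shadow_in_block() -> i32 {
    let total = 1;
    {
        // This is a new `total`, not an assignment to the outer one.
        let total = total + 10;
        let _ = total;
    }
    // The outer `total` was never changed.
    total
}
```

`shadow_in_block()` returns 1, not 11. In practice the compiler's unused-variable warning tends to flag the inner `total` when it is never read, which catches many cases of this mistake.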

7 Likes

That's actually sort of interesting. Shadowing seems to be the simplest feature of Rust that is plausibly easy to misinterpret. I wonder if someone who has been around longer knows why a different keyword wasn't chosen for shadowing, or if there's a thread somewhere to read more about that.

2 Likes

This has of course been discussed at length over the years:

Personally I like shadowing in Rust. It has proved useful. It's unlikely to cause bugs that don't show up almost immediately, at compile time or first testing.

I loathe the idea of adding more keywords and language complexity for such non-issues.

5 Likes
1 Like

Hmm, I think I have stumbled upon something I thought I understood but it turns out I don't (cc @ExpHP: we discussed this a long time ago): the meaning (and thus variance) of + 'lifetime in trait objects.

That is, the following program does not compile (it would be unsound if it did!):

  • EDIT: I am not so sure anymore that it would be unsound :sweat_smile:, quite the opposite actually
trait Trait {}
const _ : () = {
    fn check<'short, 'long : 'short> (
        it: &'short mut Box<dyn Trait + 'long>,
    )
    {
        let _: &'short mut Box<dyn Trait + 'short> = it;
    }
    let _ = check;
};

i.e., type T1<'x> = &'fixed mut Box<dyn Trait + 'x> is not covariant;

but

trait Trait {}
const _ : () = {
    fn check<'short, 'long : 'short> (
        it: &'short mut (dyn Trait + 'long),
    )
    {
        let _: &'short mut (dyn Trait + 'short) = it;
    }
    let _ = check;
};

does compile fine, meaning that

type T2<'x> = &'fixed mut (dyn Trait + 'x) "is covariant" ??

  • Despite trying to, I haven't been able to exploit that unexpected (co)variance into something unsound;

I'd never have expected Box / &_ (covariant) indirection to change the variance of a type, so there is clearly something I'm missing w.r.t indirection and trait objects' attached / inherent lifetime bound.


If somebody is also interested in or intrigued by this, feel free to start a thread where we can discuss it.

7 Likes

Huh, I'm not sure if I recall that conversation, but that's positively bizarre. I would definitely expect both to be invariant.

Yeah, if we can open a thread for that I'd like to take a stab at exploiting unsoundness with it...

2 Likes

I love this thread. :slight_smile: In some ways it seems kind of cool how Rust itself is like a living science experiment that tons of people can contribute to, ranging from completely clueless newbies, to experts in one thing who are clueless in others, to a complete expert at everything (if there's such a person, but I doubt it!).

Rust is safe enough, but not too safe at the same time. I don't know that I've ever seen a better balance in computing.

Anyway, let's see, what I don't get in Rust yet... well, I'd have to say FFI is a big one. That's kind of a direct result of not knowing C or C++ at all and not having a great understanding of all the rules you must not break to keep unsafe code sound.

I think I'm perfectly capable of grasping that stuff, I just think it would take some time to get used to the dark arts necessary to soundly compose unsafe code. One day it will probably be worth learning, but it hasn't been yet.

I essentially get all the most major points of Rust including traits and trait bounds, generics, lifetimes, async, etc., but not super deep in. Once it gets to a certain point for each subject, it goes into the black magic realm for me.

Any time I need to go a little deeper to make something work, I open up a forum topic, and usually some guru who gets the topic helps out until I've gone a little outside my initial comfort zone and leave knowing more than when I started.