Reference exceptions questions

I have a set of rules that surround references and I would like to make sure that I'm correct about all of them:

  1. Standard references are the same as raw pointers; they just offer static analysis through their lifetimes, and therefore satisfy the following, depending on the system's architecture
assert_eq!(std::mem::size_of::<&T>(), std::mem::size_of::<usize>());
// where T: Sized
  2. A reference to a trait, also known as dynamic dispatch, is twice as large because it holds both the data pointer and the vtable pointer
  3. A reference to a slice is also an exception: it holds the data pointer and the length of the data it points to. The slice itself isn't usually a struct in and of itself; it is more like a smart buffer pointer (so an &[T] is kind of like a &&[T; size], but size is unknown, so [T]: !Sized). It doesn't actually represent the data itself, but rather the pointer and length of said data
  4. str and &str either follow the same rules as [T] and &[T] or they're magic because of unicode

I'm sorry for the verbosity of these questions, but I just wanted to be reassured that I was correct in how the memory internals for rust and its references work. Also, are there any more exceptions in rust pointers/references which don't follow rule #1?

This isn't strictly true, since references have additional guarantees given to LLVM when generating assembly. The compiler makes the assumption that two mutable references to the same data cannot exist, and will compile and optimize code with this assumption in mind - similarly, it assumes no immutable references to data can exist while a mutable reference exists.

The memory representation of references is always the same as raw pointers, but it isn't just static analysis. The difference does affect code generation.
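To make the code generation point concrete, here is a minimal sketch (the function names are made up for illustration). With references, the compiler is allowed to assume the two arguments never overlap; with raw pointers it is not. Whether the optimizer actually exploits this depends on the compiler version.

```rust
/// Adds `*src` to `*dst` twice.
/// `dst` is `&mut` and `src` is `&`, so safe Rust guarantees they do not
/// overlap; the optimizer is allowed to read `*src` once and reuse the value.
fn add_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    *dst += *src;
}

/// Same logic with raw pointers. `dst` and `src` may point to the same
/// location, so the second read of `*src` has to observe the first write.
///
/// # Safety
/// Both pointers must be valid and aligned for reads and writes of `i32`.
unsafe fn add_twice_raw(dst: *mut i32, src: *const i32) {
    unsafe {
        *dst += *src;
        *dst += *src;
    }
}

fn main() {
    let mut a = 1;
    let b = 10;
    add_twice(&mut a, &b);
    assert_eq!(a, 21);

    let mut x = 1;
    let p: *mut i32 = &mut x;
    // Raw pointers may alias: both arguments point at `x`.
    unsafe { add_twice_raw(p, p) };
    assert_eq!(x, 4); // 1 + 1 = 2, then 2 + 2 = 4
}
```

Calling `add_twice(&mut x, &x)` is rejected by the borrow checker, which is exactly the guarantee the optimizer relies on.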

If T is Sized (it could exist on the stack) then &T, &mut T, *const T and *mut T have the same size as usize. If it is not, then &T, &mut T, *const T and *mut T all have the size of (usize, usize).
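These sizes can be checked directly. The following is a small sketch (`ref_size` is a helper name invented here) and reflects what current compilers do on common targets:

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Size in bytes of a shared reference to `T`, where `T` may be unsized.
fn ref_size<T: ?Sized>() -> usize {
    size_of::<&T>()
}

fn main() {
    let word = size_of::<usize>();

    // Sized pointee: a thin pointer, one word.
    assert_eq!(ref_size::<u32>(), word);
    assert_eq!(size_of::<&mut u32>(), word);
    assert_eq!(size_of::<*const u32>(), word);
    assert_eq!(size_of::<*mut u32>(), word);

    // Unsized pointee: data pointer plus a length or a vtable pointer.
    assert_eq!(ref_size::<[u32]>(), 2 * word);
    assert_eq!(ref_size::<str>(), 2 * word);
    assert_eq!(ref_size::<dyn Debug>(), 2 * word);
    assert_eq!(size_of::<*const [u32]>(), 2 * word);
    assert_eq!(size_of::<*mut dyn Debug>(), 2 * word);
}
```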

  1. A reference to a trait, also known as dynamic dispatch is twice as large due to it having both the data and the vtable

Correct, but I believe "dynamic dispatch" usually refers only to the action of calling a method on a trait object, it's not the name of the trait object itself.
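A short sketch of the distinction, using a made-up `Greet` trait: the trait object is the thing behind the reference, while dynamic dispatch is what happens when a method is called through it.

```rust
trait Greet {
    fn greet(&self) -> String;
}

struct English;
struct French;

impl Greet for English {
    fn greet(&self) -> String {
        "hello".to_string()
    }
}

impl Greet for French {
    fn greet(&self) -> String {
        "bonjour".to_string()
    }
}

/// Static dispatch: monomorphized per `T`, so the call target is known
/// at compile time and `g` is an ordinary thin reference.
fn greet_static<T: Greet>(g: &T) -> String {
    g.greet()
}

/// Dynamic dispatch: `g` is a reference to a trait object, and the method
/// is looked up through the vtable pointer stored next to the data pointer.
fn greet_dyn(g: &dyn Greet) -> String {
    g.greet()
}

fn main() {
    assert_eq!(greet_static(&English), "hello");
    assert_eq!(greet_dyn(&French), "bonjour");
}
```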

  1. str and &str either follow the same rules as [T] and &[T] or they’re magic because of unicode

They follow the same rules, but provide additional guarantees. str is a wrapper of [u8] with the additional guarantee that the data inside is valid UTF-8. This is special because unsafe code is allowed to assume that all strs contain valid UTF-8, and may cause undefined behavior if they don't. Technically the compiler can also make this assumption, but I don't know any place that it does.
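Here is a small sketch of how that guarantee shows up in the standard library API (`checked_str` is a helper name invented for this example): the checked conversion validates the bytes, and the unchecked one is `unsafe` because the caller has to uphold the invariant.

```rust
use std::mem::size_of;
use std::str;

/// Returns the bytes viewed as a `&str` if they are valid UTF-8.
fn checked_str(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

fn main() {
    // `&str` and `&[u8]` are both a data pointer plus a length.
    assert_eq!(size_of::<&str>(), size_of::<&[u8]>());

    // Valid UTF-8 converts without copying; the length is in bytes, not chars.
    let bytes = "héllo".as_bytes();
    let s = checked_str(bytes).expect("valid UTF-8");
    assert_eq!(s.len(), bytes.len());
    assert_eq!(s.chars().count(), 5);

    // The byte 0xFF never appears in UTF-8, so the checked conversion fails.
    assert!(checked_str(&[0xFF, 0xFE]).is_none());

    // The unchecked variant skips validation, which is why it is `unsafe`.
    let t = unsafe { str::from_utf8_unchecked(b"ok") };
    assert_eq!(t, "ok");
}
```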

Also, are there any more exceptions in rust pointers/references which don’t follow rule #1?

The only exceptions I know of are unsized types. One category of those is trait objects - &dyn Trait (written &Trait in older code). The second is, as you mentioned, [T]. The third is any structure whose last field is one of the above.

For instance, take this:

struct X {
    a: u32,
    b: u32,
    c: [u32],
}

Structures like X are very hard to instantiate in Rust, but if you manage to, a pointer to one will also be double-sized.
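One way to get such a value in safe code is an unsizing coercion. The sketch below uses a generic variant of X (that change is an assumption made for illustration, not the exact struct above) so that a sized instance can be built first and then viewed through an unsized reference:

```rust
use std::mem::{size_of, size_of_val};

// Generic variant of `X` (assumed for this example): the last field may be unsized.
struct X<T: ?Sized> {
    a: u32,
    b: u32,
    c: T,
}

/// Sums every field through a reference to the unsized form.
fn sum_all(x: &X<[u32]>) -> u32 {
    x.a + x.b + x.c.iter().sum::<u32>()
}

fn main() {
    let sized = X { a: 1, b: 2, c: [3, 4, 5] };
    // Unsizing coercion: &X<[u32; 3]> becomes &X<[u32]>, and the length 3
    // is stored in the reference itself.
    let unsized_ref: &X<[u32]> = &sized;

    assert_eq!(size_of::<&X<[u32; 3]>>(), size_of::<usize>());
    assert_eq!(size_of::<&X<[u32]>>(), 2 * size_of::<usize>());
    assert_eq!(unsized_ref.c.len(), 3);
    assert_eq!(sum_all(unsized_ref), 15);
    // Two u32 fields plus three u32 elements.
    assert_eq!(size_of_val(unsized_ref), 5 * size_of::<u32>());
}
```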

A more common example would be RefCell surrounding an unsized type. &RefCell<[u32]> can exist and will be two-pointers wide, like &[u32]. str can also be thought of as a wrapper for [u8] as I mentioned. It isn't defined as one since it's a compiler primitive, but you could define your own:

#[repr(transparent)]
struct MyStr {
    inner: [u8],
}

and after writing enough methods it could behave just like str.
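As a rough sketch of what those methods might look like (the method names are made up, and this is only one possible design), a checked constructor can validate the bytes and then cast the slice pointer, relying on `repr(transparent)`:

```rust
use std::str;

#[repr(transparent)]
struct MyStr {
    inner: [u8],
}

impl MyStr {
    /// Checked constructor (hypothetical name): rejects invalid UTF-8.
    fn from_bytes(bytes: &[u8]) -> Option<&MyStr> {
        str::from_utf8(bytes).ok()?;
        // SAFETY: `MyStr` is `repr(transparent)` over `[u8]`, so both wide
        // pointers have the same layout and the same metadata (the length).
        Some(unsafe { &*(bytes as *const [u8] as *const MyStr) })
    }

    /// Views the contents as a standard `&str`.
    fn as_str(&self) -> &str {
        // SAFETY: `from_bytes` is the only constructor and it validated the bytes.
        unsafe { str::from_utf8_unchecked(&self.inner) }
    }

    /// Length in bytes, like `str::len`.
    fn len(&self) -> usize {
        self.inner.len()
    }
}

fn main() {
    let s = MyStr::from_bytes(b"hello").expect("valid UTF-8");
    assert_eq!(s.as_str(), "hello");
    assert_eq!(s.len(), 5);
    assert!(MyStr::from_bytes(&[0xFF]).is_none());
    // Same two-word representation as `&str`.
    assert_eq!(std::mem::size_of::<&MyStr>(), std::mem::size_of::<&str>());
}
```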


Also: glad to answer. Feel free to reply with further things!


As an aside, I believe (if I'm not mistaken) Rust 1.29.2 (and onwards up to today) had to turn off noalias LLVM attribute emission again due to, also again, an LLVM codegen issue related to it. It seems this is having a hard time sticking around.
