Summarizing built-in types

I learn best by summarizing what I learn.
Here is a diagram I created to organize the built-in types.
I'd love to get feedback on this.
Is there anything wrong with the way I categorized the types?
I know I didn't include all the built-in types, but did I omit any commonly used types?
https://mvolkmann.github.io/blog/assets/rust-types.png


Very nice!

One correction: I would not put Cell and RefCell in the “smart pointers” category. They don’t use indirection and they can’t be dereferenced. I would put them in a separate “shared mutation” category along with Mutex and RwLock.
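
A minimal sketch of that point, using only `std::cell`. The helper names `bump` and `push_shared` are made up for illustration. It shows that a `Cell` stores its value inline (same size as the wrapped type) and that both cell types allow mutation through a shared reference rather than through dereferencing.

```rust
use std::cell::{Cell, RefCell};
use std::mem::size_of;

/// Hypothetical helper: bumps a counter through a shared (`&`) reference.
fn bump(counter: &Cell<u32>) -> u32 {
    counter.set(counter.get() + 1);
    counter.get()
}

/// Hypothetical helper: appends through a shared reference.
/// The borrow rules are checked at run time by RefCell.
fn push_shared(log: &RefCell<Vec<u32>>, value: u32) -> usize {
    log.borrow_mut().push(value);
    log.borrow().len()
}

fn main() {
    // Cell<T> holds the value inline: no heap allocation, no pointer.
    assert_eq!(size_of::<Cell<u32>>(), size_of::<u32>());

    let counter = Cell::new(0);
    println!("{}", bump(&counter));

    let log = RefCell::new(Vec::new());
    println!("{}", push_shared(&log, 7));
}
```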

One possible addition: Vec dereferences to a slice, so it could also be included in the “smart pointer” category (similar to String).
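
As a rough illustration of the `Vec`/`String` parallel (the functions `sum` and `shout` are invented for this sketch), both owning types can be passed where only the borrowed, unsized form is expected:

```rust
/// Hypothetical helper: accepts any slice and knows nothing about Vec.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Hypothetical helper: accepts any string slice and knows nothing about String.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    let numbers: Vec<i32> = vec![1, 2, 3];
    let name: String = String::from("rust");

    // Deref coercion: &Vec<i32> becomes &[i32], and &String becomes &str.
    println!("{}", sum(&numbers));
    println!("{}", shout(&name));

    // An explicit dereference reaches the slice behind the Vec.
    let slice: &[i32] = &*numbers;
    println!("{}", slice.len());
}
```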

Some types that were omitted: References, raw pointers.

Some notes on the uncategorized types:

  • Along with struct you could have enum and union. Maybe these all belong in the “compound” category?
  • In the type system, methods are really just functions. Though there are some differences at the language level.
  • A trait isn't exactly a type, so it is a little bit of an outlier here. But for object-safe traits, dyn Trait is a type, so there is definitely a place for traits on this list.
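
A small sketch of the last two notes, with a made-up `Shape` trait and `Square` struct. It calls a method as a plain function, stores it as a function pointer, and uses `dyn Shape` as a type behind a reference.

```rust
// Hypothetical trait and type, used only for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// `dyn Shape` is a type; being unsized, it is used behind a reference here.
fn describe(shape: &dyn Shape) -> f64 {
    shape.area()
}

fn main() {
    let sq = Square(2.0);

    // Method-call syntax and function-call syntax reach the same function.
    assert_eq!(sq.area(), Shape::area(&sq));

    // The method can be stored as an ordinary function pointer.
    let f: fn(&Square) -> f64 = <Square as Shape>::area;
    println!("{}", f(&sq));

    println!("{}", describe(&sq));
}
```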

Thanks so much for the feedback!

I assumed Cell and RefCell were smart pointers because RefCell is discussed in the "Smart Pointers" chapter of "The Rust Programming Language" book, but clearly that doesn't mean they are. :wink:

Is Vec the only built-in collection that can be considered to be a smart pointer?

I got the impression from the book that the term "compound type" was very specific to only include array and tuple, but maybe that's not correct. Do you think it includes struct, enum, and union?

I included trait as a type because when it appears as a trait bound, it kind of seems like a type. But maybe that's misleading.


TBH, it doesn't make any sense to classify based on terminology without any context of what you intend to use it for. There's no point in saying something is a 'compound type' if you don't make it clear why it matters or what it means. And once you properly define it, it should be obvious what it applies to. Same with 'smart pointer'.


It discusses Cell and RefCell in that chapter because the cell types are very often used together with the smart pointer Rc.
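
For readers who haven't seen the combination, here is a minimal sketch (the `record` function is invented for this example): `Rc` provides shared ownership, and `RefCell` provides the mutation that `Rc` alone does not allow.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical helper: pushes a value through one of several owners.
fn record(shared: &Rc<RefCell<Vec<i32>>>, value: i32) {
    shared.borrow_mut().push(value);
}

fn main() {
    let first = Rc::new(RefCell::new(Vec::new()));
    let second = Rc::clone(&first); // a second owner of the same allocation

    record(&first, 1);
    record(&second, 2);

    // Both handles see the same vector.
    println!("{:?}", first.borrow());
    println!("{}", Rc::strong_count(&first));
}
```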


I'm just using the same terminology that is used in the book "The Rust Programming Language". I clarified that in a new version I pushed out. See https://mvolkmann.github.io/blog/assets/rust-types.png?v=1.0.0. I'd love to get more feedback on this. Are any of my classifications wrong? Did I omit any important types?

Perhaps it's getting too much into the weeds, but I would create a distinction between threaded and non-threaded shared mutations. Also you could add Fn, FnMut and FnOnce as children to the closure node.

Kind of wish I'd had this graph when I started using Rust. :slight_smile:

Edit: Ignore that first part -- distinguishing between threaded/non-threaded would be a pain.

I think, since you are (as does the book) counting RefMut as a smart pointer, you should include Ref under “smart pointers” as well. And maybe calling them cell::RefMut and cell::Ref can avoid potential confusion, since you won’t see these types very often used explicitly and unqualified, though this might be a matter of personal taste. Similarly, you could/should then also include MutexGuard, RwLockWriteGuard, and RwLockReadGuard under “smart pointers”.


Fn, FnMut, and FnOnce are traits. Do you think I should include all the built-in traits in the diagram? It might get too crowded, more than it already is, if I do that.
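
To make the relationship concrete, here is a short sketch with three invented functions (`call_twice`, `call_n`, `consume`). Each closure has its own anonymous type; the three traits describe how that closure may be called.

```rust
/// Hypothetical helper: needs only shared access to the captured state.
fn call_twice(f: impl Fn(i32) -> i32, x: i32) -> i32 {
    f(f(x))
}

/// Hypothetical helper: the closure may mutate captured state on each call.
fn call_n(mut f: impl FnMut(), n: usize) {
    for _ in 0..n {
        f();
    }
}

/// Hypothetical helper: the closure may consume captured state,
/// so it can be called at most once.
fn consume(f: impl FnOnce() -> String) -> String {
    f()
}

fn main() {
    let offset = 10;
    println!("{}", call_twice(|x| x + offset, 1));

    let mut hits = 0;
    call_n(|| hits += 1, 3);
    println!("{}", hits);

    let owned = String::from("moved");
    println!("{}", consume(move || owned));
}
```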

There's something in common for all of these though. I'm not sure what to call it, something like scope-limited smart pointers or "smart references". In most other cases, when we say "smart pointer" we usually mean a (shared) owning smart pointer, I think.

Maybe just a “guards” sub-category of “smart pointers”. The book certainly calls them “smart pointers”, e.g.

We’ll cover the most common smart pointers in the standard library:

  • Box<T> for allocating values on the heap
  • Rc<T> , a reference counting type that enables multiple ownership
  • Ref<T> and RefMut<T> , accessed through RefCell<T> , a type that enforces the borrowing rules at runtime instead of compile time

Are you suggesting this hierarchy?

  • smart pointer
    • String
    • Vector
    • guards
      • Arc
      • Box
      • Rc
      • Ref
      • RefMut

Arc, Box and Rc are not guards. The guards are Ref, RefMut as well as MutexGuard, RwLockReadGuard and RwLockWriteGuard
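
A hedged sketch of what sets the guards apart (the helpers `increment` and `second_borrow_refused` are made up here): a guard dereferences to the protected value like any smart pointer, but its main job is to release a lock or borrow when it goes out of scope.

```rust
use std::cell::RefCell;
use std::sync::{Mutex, RwLock};

/// Hypothetical helper: the MutexGuard unlocks the mutex when dropped.
fn increment(counter: &Mutex<i32>) -> i32 {
    let mut guard = counter.lock().unwrap(); // MutexGuard<i32>
    *guard += 1;
    *guard
}

/// Hypothetical helper: returns true if a second mutable borrow is
/// refused while a cell::RefMut is still alive.
fn second_borrow_refused(cell: &RefCell<i32>) -> bool {
    let _first = cell.borrow_mut(); // cell::RefMut<i32>
    cell.try_borrow_mut().is_err()
}

fn main() {
    let counter = Mutex::new(0);
    println!("{}", increment(&counter));

    let cell = RefCell::new(5);
    println!("{}", second_borrow_refused(&cell));
    // The RefMut was dropped when the function returned, so this succeeds.
    *cell.borrow_mut() += 1;

    let lock = RwLock::new(String::from("a"));
    lock.write().unwrap().push('b'); // RwLockWriteGuard<String>, dropped at the `;`
    println!("{}", lock.read().unwrap()); // RwLockReadGuard<String>
}
```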


No, I was suggesting

smart pointer
├── String
├── Arc
├── Box
├── Rc
├── guard
│   ├── cell::Ref
│   ├── cell::RefMut
│   ├── MutexGuard
│   ├── RwLockWriteGuard
│   └── RwLockReadGuard
└── Vec

I was referring to the fact that the book calls Ref and RefMut “smart pointers”, too.


A new version of the diagram is up at https://mvolkmann.github.io/blog/assets/rust-types.png?v=1.0.3


My point is that the classifications aren't really useful, so they can't be right or wrong. Imagine if you were trying to teach a child to read, and they are asking for clarification on whether 'P' is classified as a round or pointy letter. That kind of distinction may have been useful for a day or two to teach them how to draw letters, but once they start trying to resolve subtle ambiguities, they're far past the point where those classifications are going to carry their weight. You don't need to correct the introductory model, you need to move past it and use something else instead.

A better set of classifications would be these as you can actually use them to write code.


As someone who just started learning Rust a month ago, I find the categories very useful. I suspect that others new to Rust would as well. For example, someone coming from another language might know they want a number variable, but they don't know what number types Rust supports. Also, maybe they want to put data into some kind of set or map. Looking at this diagram they would see the options of HashSet, BTreeSet, HashMap, and BTreeMap. I was not familiar with the concept of smart pointer guards until others suggested I add them to my diagram. Now I'm aware of those options.
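
For a newcomer choosing between those four, a small sketch may help (the functions `word_counts` and `unique_sorted` are invented for illustration). The BTree variants iterate in sorted key order, while the Hash variants make no ordering promise.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};

/// Hypothetical helper: counts word occurrences. BTreeMap keeps keys sorted.
fn word_counts(text: &str) -> BTreeMap<&str, usize> {
    let mut counts = BTreeMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

/// Hypothetical helper: removes duplicates. BTreeSet iterates in ascending order.
fn unique_sorted(values: &[i32]) -> Vec<i32> {
    values
        .iter()
        .copied()
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect()
}

fn main() {
    println!("{:?}", word_counts("b a b"));
    println!("{:?}", unique_sorted(&[3, 1, 3, 2]));

    // The hash-based variants offer the same operations without ordering guarantees.
    let seen: HashSet<i32> = [1, 2, 2].iter().copied().collect();
    let mut ages: HashMap<&str, u32> = HashMap::new();
    ages.insert("ann", 30);
    println!("{} {:?}", seen.len(), ages.get("ann"));
}
```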


Another interesting classification arises from how type constructors behave with respect to dynamically sized types. Ordinary pointer type constructors turn dynamically sized types into sized ones. In formal terms,

type Pointer<T> = Box<T>; // Rc<T>, Arc<T>, *const T, *mut T

type Pointer<'a,T> = &'a T; // &'a mut T

Pointer<T> where T: ?Sized

Pointer<T>: Sized

The size of a type is determined by size_of. Therefore, this does not compile:

fn main() {
    println!("{}", std::mem::size_of::<[u8]>());
}

But this does:

fn main() {
    println!("{}", std::mem::size_of::<Pointer<[u8]>>());
}

So, if one insists on considering Vec<T> as a pointer(-like) type, it must at least be a different kind of pointer, because

type Pointer<T> = Vec<T>;

won't work.
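
A rough way to observe this for the pointer constructors listed above, without the `Pointer` alias (the functions `thin` and `wide` are names made up for this sketch). On current compilers, a pointer to a slice is two machine words (address plus length), while a pointer to a sized value is one.

```rust
use std::mem::{size_of, size_of_val};

/// Hypothetical helper: size in bytes of a pointer to a sized value.
fn thin() -> usize {
    size_of::<Box<u8>>()
}

/// Hypothetical helper: size in bytes of a pointer to a slice,
/// which also carries the slice length.
fn wide() -> usize {
    size_of::<Box<[u8]>>()
}

fn main() {
    println!("thin: {}, wide: {}", thin(), wide());

    // &[u8] and &str are sized too, even though [u8] and str are not.
    println!("{} {}", size_of::<&[u8]>(), size_of::<&str>());

    // The size of the unsized value itself is only known at run time.
    let boxed: Box<[u8]> = vec![1, 2, 3].into_boxed_slice();
    println!("{}", size_of_val(&*boxed));
}
```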

@Finn A Vec<T> is a smart pointer to an [T], which is indeed unsized. The reason T must be sized is that an [T] requires that T is sized.

This is the first version I looked at. Some quick thoughts:

  • Not sure what your primitive/non-primitive distinction is
  • Enums and unions are related but far apart
    • Not that I'd suggest a beginner use unions unless they absolutely had to...
  • OsStr, Path, and friends are useful if you interact with the filesystem, env, etc.
  • The Deref connection between references and various types isn't drawn out (though it might be messy)
    • str/String, slice/array/Vec
  • Maybe throw dyn in there somewhere
  • Maybe throw function pointers in there somewhere
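
A brief sketch of two of the suggested additions, using invented helpers (`extension`, `apply`, `double`). `PathBuf`/`Path` mirror the `String`/`str` pairing, and `fn(i32) -> i32` is a function pointer type, distinct from closure types.

```rust
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Hypothetical helper: the extension of a path, if present and valid UTF-8.
fn extension(path: &Path) -> Option<&str> {
    path.extension().and_then(OsStr::to_str)
}

/// Hypothetical helper: applies a plain function pointer to a value.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    // PathBuf owns its data and dereferences to Path, like String to str.
    let owned: PathBuf = PathBuf::from("notes/todo.txt");
    println!("{:?}", extension(&owned));

    // A named function coerces to a function pointer.
    println!("{}", apply(double, 21));
}
```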