insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
How does Rust tell whether a type Type<TypeParam> is an indirection to the type TypeParam? Does it have something to do with the trait Sized?
To add to the context,
// this does not compile
struct A(B<A>);
struct B<T>(T);
// this compiles
struct C(Box<C>);
In my mental model, at this stage, rust does not really look inside the actual definition of the depended type (B and Box) except for the trait that they implement, much like a type versus a term when type checking. I would assume that there should be some auto-trait-like mechanism for the compiler to achieve the distinction between B and Box.
What is a pointer for the compiler? Maybe a better way to ask is, how to make a pointer type myself without relying on Box, Rc or &? Or these types are hard-coded to be recognized as pointer-like?
And if you build something that is somehow an indirection without using a pointer (e.g. an index/key to look up the value a global table) then that'll also end up having a constant size. It's not that pointers have special recursion-enabling powers, it's that pointer types like &T or Box<T> are the most common kind of type that can mention some type Twithout having a T stored in-line.
(And the second most common kind of such types are structs that contain pointers.)
Thank you. I would guess this is the case, but I fail to find the place that says something like this list are the types based on which you can construct your indirection types and not anything else.
Thank you. I think I understand semantically why the requirement makes sense, but I can't see how compiler tells if a type is a type without having itself stored in-line.
I try to understand your answer better. Do you mean the compiler will attempt to compute the size and find that the size of type A requires adding itself, and by detecting such cycle it decides to raise the error? And since pointer types are special when computing their size, the compiler will not complain in those cases. Is my understanding correct?
It naturally falls out of computing the size (and layout) of a type. The most basic rule about sizes is: the size of a struct is the sum of the sizes of its fields (plus padding, but we can ignore that for this purpose). For example, if you write
struct Foo {
bar: Option<Bar>,
}
then Foo must be at least as big as Option<Bar>, and Option<Bar> must be at least as big as Bar, because it (sometimes) stores a Bar. Therefore, if you write
struct Foo {
foo: Option<Foo>,
}
then trying to compute the size goes “size of Foo = size of Foo + size of Option discriminant + padding”, which is an equation with no solution but “infinity”, and so the program won't compile. But if you write
struct Foo {
foo: Option<Box<Foo>>,
}
then since the size of Box<Foo> is the size of a pointer — the size of an address in memory, which is just a number. So, computing the size of Foo does not grow infinitely large through recursion — it's just “size of Foo = size of a pointer + size of Option discriminant + padding”.
That's all there is to it. “Consider introducing indirection” is just the compiler explaining a possible solution after it detects infinite-size recursion — it's not that “has indirection” is a primitive concept in the language (for this purpose).
The fact that it's not hardcoded, and that you can write your own indirection-containing types, is why there is no such list. Vec<_> and Rc<_> are (std) library types for example, not language-level types. And anything that contains those would also feature indirection.
The (stable) primitive pointer types (so far) are *const T, *mut T, &T, and &mut T. And arguably Box<T> since it has language-level magic behavior (so far). If you dig down deep enough into an indirection type, you will typically find one of these. But you could technically find something else like a usize or a [u8; std::mem::size_of::<*const T>] or something else that could be used to reconstruct an address.
And there are also other "infinite size" avoiding techniques besides reconstructing addresses, like the "store an index into a global" possibilities @kpreid mentioned.
I think it's true that *const T, *mut T, &T, &mut T, and PhantomData<T> are special in that they're the only constructs where the generic's size isn't part of their size[1]. As a result, they're the only things that can hold type recursion or unsized types. Since you can't have unused generics, the only way a generic type can avoid including the generic's size is if it has one of these indirections. And if a generic has the ?Sized bound, you know it must be behind indirection (but lack of ?Sized doesn't necessarily mean lack of indirection).
I also tried A([A; 0]) and A(A), which shouldn't have any technical limitations, but Rust didn't allow them, probably because they're not useful.
It's probably intended that indirect and direct types can have the same exact public interface, since it lets you swap between them without breaking API[2], but it might be useful for something like rust-analyzer to be able to detect indirection.
After digging into the compiler, this is what I found. What makes Box<T> (i.e., Unique<T>) an indirection by not being considered as a putative part of a cycle is the way they are computing the rustc_middle::ty::adt::Representability via the function rustc_ty_utils::representability::representability.
Long story short, for a generic type to be used as an indirection of its type parameter, we consider types into cases defined by rustc_type_ir::ty_kind::TyKind:
Slice of form [T], which is not Sized so not an answer to my question
RawPtr of forms *mut T and *const T
Ref of forms &'a mut T and &'a T
several anonymous kinds, i.e., the type of a declared function, closure, coroutine, coroutine closure which are unavailable to users so not an answer
FnPtr of form fn(T) -> T
Dynamic of form dyn Trait, which could be an answer as we may put T in the generic type parameters or specified associated item
and all others (struct, enum, union, tuple), that only have as its fields or variants indirection to the parameter T.