Why is the size of *mut [T] double that of *mut T?

On my 64-bit computer:

std::mem::size_of::<*mut [i64]>() == 16
std::mem::size_of::<*mut i64>() == 8

This confuses me, because a pointer to an array is still just a pointer.
I would expect the length of the array to be stored in the allocated space, not in the pointer.

Indeed, for unsized types (types for which T: Sized is not satisfied, e.g. slices, dyn Trait, or compositions of these), pointers are "fat" and the additional information (the length of a slice, the vtable pointer of a dyn Trait, etc.) is stored along with the pointer.

Storing the length in the allocation is impossible, because slices do not own their contents. For instance, if you have a subslice pointing inside another slice, there's no place to put the length "in the allocation".
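
The thin versus fat pointer sizes can be checked directly. This is only a minimal sketch; the helper name pointer_sizes is made up for illustration, and the exact byte counts depend on the target:

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Hypothetical helper: returns the sizes in bytes of
/// (thin pointer, slice pointer, trait-object pointer).
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<*mut i64>(),       // address only
        size_of::<*mut [i64]>(),     // address + length
        size_of::<*mut dyn Debug>(), // address + vtable pointer
    )
}

fn main() {
    let (thin, slice, dyn_ptr) = pointer_sizes();
    println!("thin: {thin}, slice: {slice}, dyn: {dyn_ptr}");
}
```

On current compilers both fat pointers should come out at twice the size of the thin one.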

A *mut [T] (and &mut [T] and friends) has a representation equivalent to this:

#[repr(C)]
pub(crate) struct FatPtr<T> {
    data: *const T,
    pub(crate) len: usize,
}

(code for manipulating slice fat pointers in the standard library)
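
From user code you can see the same two parts without touching the internal struct. A rough sketch, assuming only the stable std API (as_mut_ptr, len, and std::ptr::slice_from_raw_parts_mut); the function name round_trip is made up for illustration:

```rust
use std::ptr;

/// Hypothetical helper: splits a slice into its (data pointer, length)
/// parts and rebuilds a raw slice pointer from those two parts.
fn round_trip(s: &mut [i64]) -> *mut [i64] {
    let data: *mut i64 = s.as_mut_ptr(); // the "thin" half
    let len: usize = s.len();            // the metadata half
    ptr::slice_from_raw_parts_mut(data, len)
}

fn main() {
    let mut values = [1_i64, 2, 3, 4];
    let raw = round_trip(&mut values);
    // SAFETY: `raw` was built from a live, exclusively borrowed array,
    // and no other reference to `values` is in use while `rebuilt` lives.
    let rebuilt: &mut [i64] = unsafe { &mut *raw };
    rebuilt[0] = 100;
    println!("{:?}", values);
}
```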

The length can't be stored alongside the slice's data because sub-slicing (e.g. &some_vec[2..5]) has to work. You couldn't store the length just before or after the data being pointed to, because that memory is already in use (by the elements at indices 1 and 5, in that example).
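
Here is a small sketch of that sub-slicing case. The helper offset_and_len is hypothetical and assumes its second argument really is a sub-slice of the first:

```rust
use std::mem::size_of;

/// Hypothetical helper: returns (element offset of `sub` from the start
/// of `whole`, length of `sub`). Assumes `sub` lies inside `whole`.
fn offset_and_len(whole: &[i32], sub: &[i32]) -> (usize, usize) {
    let byte_offset = sub.as_ptr() as usize - whole.as_ptr() as usize;
    (byte_offset / size_of::<i32>(), sub.len())
}

fn main() {
    let some_vec = vec![10, 20, 30, 40, 50, 60];
    let whole: &[i32] = &some_vec;
    let sub: &[i32] = &some_vec[2..5];
    // Both slices point into the same allocation; only the start address
    // and the length carried in each fat pointer differ.
    let (offset, len) = offset_and_len(whole, sub);
    println!("whole.len() = {}, sub starts at index {offset} with len {len}", whole.len());
}
```

The two slices share one allocation, so a single length stored "in the allocation" could not describe both of them.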

This section from The Nomicon may help explain what is going on:

This means that Box<[T]> and Rc<[T]> are also "fat", even though they own the data.
For example, multiple Rc<[T]> pointers refer to the same allocation, but each of those fat pointers stores the length.
Will that cause bugs?

Great. If I expose a *mut [T] to extern "C", will the program crash because the C side treats it as an 8-byte pointer? Is that true?

The exact ABI of *mut [T] is unstable. As such, there is a lint that will warn you when using *mut [T] as an argument to an extern "C" function. In practice, it is currently passed as two arguments: first the pointer, then the length. This can change at any time.
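
The usual way around this is to pass the pointer and the length as two explicit arguments, which is FFI-safe. A minimal sketch; the function name sum_i64 and the C prototype in the comment are made up for illustration, and a real export would also need a no_mangle attribute:

```rust
/// Hypothetical C-callable function. The matching C prototype would be
/// something like: int64_t sum_i64(const int64_t *data, size_t len);
pub extern "C" fn sum_i64(data: *const i64, len: usize) -> i64 {
    if data.is_null() {
        return 0;
    }
    // SAFETY: the caller promises that `data` points to `len` readable,
    // properly aligned i64 values.
    let slice = unsafe { std::slice::from_raw_parts(data, len) };
    slice.iter().sum()
}

fn main() {
    let values = [1_i64, 2, 3, 4];
    let slice: &[i64] = &values;
    // Split the fat pointer by hand instead of relying on its ABI.
    let total = sum_i64(slice.as_ptr(), slice.len());
    println!("{total}");
}
```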

It won't cause bugs, as slices can't be resized. It is slightly larger than it could be, sure, but that isn't a big deal and is worth it for its simplicity. It also avoids having to deref just to get the length.
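
To illustrate with Rc<[T]>, here is a quick sketch; the helper shares_data_and_len is a made-up name. Every clone carries its own copy of the length, and since the slice can never change length, the copies always agree:

```rust
use std::rc::Rc;

/// Hypothetical helper: true if both handles point at the same
/// allocation and report the same length.
fn shares_data_and_len(a: &Rc<[i32]>, b: &Rc<[i32]>) -> bool {
    Rc::ptr_eq(a, b) && a.len() == b.len()
}

fn main() {
    let first: Rc<[i32]> = Rc::from(vec![1, 2, 3]);
    let second = Rc::clone(&first);
    println!("same data and len: {}", shares_data_and_len(&first, &second));
    println!("size of Rc<[i32]>: {}", std::mem::size_of::<Rc<[i32]>>());
}
```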

Much of the above discussion appears to me to be ignoring the realities of modern hardware. Storing the slice size information with the starting address of the pointed-to slice means that both will be in the same cache line; storing the size information anywhere else will necessitate a usually-costly access to a slower cache or main memory line to obtain that size information.

I guess storing the size with the pointer is good enough, because the CPU has to load the pointer before it can access the location it points to, which means the size is probably in the cache along with the pointer.

If the CPU has a cache, as all modern CPUs do, the fat pointer's alignment requirement causes the size always to be in the same cache line as the pointer. That will be the case even if rustc reorders the two fields of the fat pointer (which I don't expect, because some CPU architectures favor having the pointer field be at offset 0 in the fat pointer).

fn main() {
    println!("{}", std::mem::align_of::<&[u8]>());
}

says that the alignment is 8, not 16.

On which architecture? Intel's architectures are more permissive on alignment than most others; 8 is the alignment requirement for most Intel CPUs.

This is on the playground, but I think rustc always lays out &[T] the same as (*const T, usize), which means that the alignment will always be that of a pointer.
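
That guess can be checked by comparing the two types directly. A small sketch with made-up helper names; it only shows what the current compiler does on the target it runs on, not a documented guarantee:

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: (size, alignment) of a slice reference.
fn slice_ref_layout() -> (usize, usize) {
    (size_of::<&[u8]>(), align_of::<&[u8]>())
}

/// Hypothetical helper: (size, alignment) of a (pointer, usize) tuple.
fn pair_layout() -> (usize, usize) {
    (
        size_of::<(*const u8, usize)>(),
        align_of::<(*const u8, usize)>(),
    )
}

fn main() {
    println!("&[u8]:              {:?}", slice_ref_layout());
    println!("(*const u8, usize): {:?}", pair_layout());
}
```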
