Questions about size_of_val, ArcInner and DSTs in general


#1

Hi,

I recently read how from_box for Arc is implemented,
mainly how it handles DSTs since Rust 1.21 (link to std src).

And I’m wondering about a few parts. The first is how size_of_val
actually works. (It’s a compiler intrinsic which can determine the
size of a dynamically sized value at runtime.)

If I had to guess, it either magically knows about the two kinds of
DST pointers (pointer+length / pointer+vtable), or it asks the
allocator for the size of the memory block allocated for the
given ptr (or, more correctly, the block size minus the offset
of the pointer from the start of the block). I’m not sure the latter
is even possible. And if it is the former, will there be a
mechanism in the future to “tell” Rust the size of
less usual DSTs?

Also, from the way pointers to ArcInner are handled,
I guess that a pointer to a “wrapper”/“composition”
DST like ArcInner will automatically be a fat pointer
of the same structure (and with the same metadata)
as a pointer to the struct’s only DST field?

Is there a way to tell the compiler “no, do not store/keep
the metadata for this DST field and just use
thin pointers to me”? E.g. if I have a lot of references
to a small number of slices that I seldom access, moving
the slice length into the allocation itself, through a wrapper
type containing a field with the length and a DST
field with the slice, could be a nice memory
optimization.
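To make the idea concrete, here is a minimal sketch of such a thin handle written by hand on stable Rust. `ThinBytes` and its methods are hypothetical names invented for this illustration, not an existing std type, and the layout (one `usize` length header followed by the bytes) is an assumption of the sketch. The handle is a single pointer wide, and the length is read from the allocation only when the slice is actually accessed.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical thin handle: one pointer to `[len: usize][bytes...]`.
struct ThinBytes {
    ptr: NonNull<usize>,
}

impl ThinBytes {
    /// Layout of the whole allocation: length header plus `len` bytes.
    fn layout(len: usize) -> Layout {
        let header = Layout::new::<usize>();
        let tail = Layout::array::<u8>(len).expect("length overflows");
        let (layout, _offset) = header.extend(tail).expect("layout overflows");
        layout.pad_to_align()
    }

    fn new(bytes: &[u8]) -> Self {
        let layout = Self::layout(bytes.len());
        unsafe {
            // The layout is never zero-sized because of the header.
            let raw = alloc(layout) as *mut usize;
            let ptr = match NonNull::new(raw) {
                Some(p) => p,
                None => handle_alloc_error(layout),
            };
            // Store the length inside the allocation, then the payload.
            ptr.as_ptr().write(bytes.len());
            let data = (ptr.as_ptr() as *mut u8).add(size_of::<usize>());
            std::ptr::copy_nonoverlapping(bytes.as_ptr(), data, bytes.len());
            ThinBytes { ptr }
        }
    }

    /// Rebuilds the fat `&[u8]` on demand from the stored length.
    fn as_slice(&self) -> &[u8] {
        unsafe {
            let len = self.ptr.as_ptr().read();
            let data = (self.ptr.as_ptr() as *const u8).add(size_of::<usize>());
            std::slice::from_raw_parts(data, len)
        }
    }
}

impl Drop for ThinBytes {
    fn drop(&mut self) {
        unsafe {
            // The size needed for deallocation is recovered from the header.
            let len = self.ptr.as_ptr().read();
            dealloc(self.ptr.as_ptr() as *mut u8, Self::layout(len));
        }
    }
}

fn main() {
    let thin = ThinBytes::new(b"hello");
    println!("{:?}", thin.as_slice());
    println!("handle is one pointer wide: {}", size_of::<ThinBytes>() == size_of::<usize>());
}
```

The price is an extra memory read to learn the length, plus unsafe code that the compiler cannot check, which is why language-level support would be nicer.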

Lastly, how can Layout::for_value work even
though we give it a reference to an ArcInner
obtained from a pointer which we “just” cast from
*const T to *mut ArcInner<T>?
I mean, the fake ptr points to a value
which is smaller by the size of the two
atomic counters and their padding…

Whoops, that’s a lot of text, so thanks
for reading it and even more for
answering it :wink:

EDIT:
Things which I think I now understand:

  1. A fat ptr to a struct containing a DST seems to use the same kind of fat ptr and the same “metadata” as a ptr to the DST it contains.
  2. size_of_val (and by extension Layout) seems to know about the different kinds of fat pointers and uses (only) the fat pointer’s metadata to determine the size.
  3. By extension of 1., the size stored in a ptr to a struct containing a slice is the size (length) of the slice, not of the struct. This is kind of obvious once spelled out, but it explains why Layout::for_value works with a “wrong” pointer (*mut T as *mut ArcInner<T>): it “just” uses the metadata, which stays the same.
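A small sketch that checks points 1 and 3 on stable Rust. `Dst` is a stand-in for the private `ArcInner`, and `metadata_len` is a helper made up for this illustration. It reinterprets the wrapper pointer as a plain slice pointer and reads back the length carried in the metadata.

```rust
use std::mem::{size_of, size_of_val};

// Stand-in for ArcInner: two header fields, then the unsized payload.
struct Dst<T: ?Sized> {
    x: usize,
    y: usize,
    data: T,
}

/// Reads the length stored in the fat pointer to the wrapper.
fn metadata_len(wrapper: &Dst<[u8]>) -> usize {
    let wrapper_ptr: *const Dst<[u8]> = wrapper;
    // The cast keeps the metadata untouched; only the pointee type changes.
    let slice_ptr = wrapper_ptr as *const [u8];
    // The first `len` bytes of the wrapper are valid initialized memory,
    // and only the length is inspected here, never the bytes themselves.
    unsafe { (&*slice_ptr).len() }
}

fn main() {
    // Safe construction through an unsizing coercion from Dst<[u8; 3]>.
    let boxed: Box<Dst<[u8]>> = Box::new(Dst { x: 1, y: 2, data: [10u8, 20, 30] });

    // Both references are fat pointers of the same width.
    assert_eq!(size_of::<&Dst<[u8]>>(), size_of::<&[u8]>());

    // The metadata is the slice length, not the struct size.
    println!("metadata length: {}", metadata_len(&boxed));
    println!("size_of_val: {}", size_of_val(&*boxed));
    println!("fields: {} {} {:?}", boxed.x, boxed.y, &boxed.data);
}
```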

EDIT2:
I wasn’t clear about it, but a lot of the DST talk is about DSTs which should be possible in future Rust but aren’t really possible now. Then again, even in nightly Rust you can’t create a DST by hand without relying on compiler implementation details (mainly that the first usize of a fat pointer is the pointer to the underlying data).


#2

I’d expect it knows about fat pointers. For trait objects, the size (and alignment, for that matter) are recorded in the vtbl. For slices the length is in the fat pointer, and the size/alignment are known by virtue of knowing the concrete T of the slice.
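As a small illustration of the two kinds of metadata described above, this sketch calls `std::mem::size_of_val` through a slice reference and through a trait object. The helper names `slice_size` and `object_size` are made up for the example.

```rust
use std::fmt::Debug;
use std::mem::{size_of, size_of_val};

/// Size computed from the slice length in the fat pointer times size_of::<u32>().
fn slice_size(values: &[u32]) -> usize {
    size_of_val(values)
}

/// Size read from the vtable of whatever concrete type is behind the trait object.
fn object_size(value: &dyn Debug) -> usize {
    size_of_val(value)
}

fn main() {
    // Both kinds of fat pointer are two words wide.
    assert_eq!(size_of::<&[u32]>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<&dyn Debug>(), 2 * size_of::<usize>());

    println!("{}", slice_size(&[1, 2, 3])); // 3 elements * 4 bytes = 12
    println!("{}", object_size(&7u64)); // 8
    println!("{}", object_size(&[0u8; 5])); // 5
}
```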

What are “less usual DSTs”?

Not sure I understood this question. ArcInner isn’t a DST itself.

You can thin out a fat pointer, but you’d need to use raw pointers (not references). This is probably not what you were asking about though.

I don’t think Layout::for_value actually dereferences the data - it just looks at the fat pointer for DSTs: https://doc.rust-lang.org/src/alloc/allocator.rs.html#133

You can just as well manufacture your own “phantom” pointer:

let ptr = 1 as *const u8 as *const Arc<str>;
let layout = unsafe {Layout::for_value(&*ptr)};
println!("{:?}", layout);   // will print proper size and alignment

#3

I expect that too, but it is a compiler intrinsic and the other solution
could work as well, theoretically. So it would be nice to hear it from
someone who knows the relevant compiler internals.

A less usual DST is any DST which does not fall into the two categories of being either a trait object or a simple slice. E.g. a pointer to a MessagePack object: from the value of the byte the pointer points to, it can determine recursively which other bytes it may access. Because of this you actually want a thin pointer to it, even though it is dynamically sized. (Note that there is one case in this example where a fat pointer with a size is useful, which is
when you want to use ptr::copy to copy it to somewhere else in
memory, but there are better ways to optimize this edge case than
falling back to a fat pointer.)

There are many other possible DSTs which make sense for
some use cases and which might want more or less than
two usizes of space in the pointer. Another example would be
a possible implementation of trait objects with multiple
traits, storing multiple vtable pointers etc.

It can be! Arc does not require its data to be Sized. If it is not
Sized, ArcInner contains a DST and is by extension a DST itself.

Yup, I’m more interested in the direction Rust is heading. If I thin out a pointer I not only need to use raw pointers but untyped ones, because pointers to DSTs are fat, too. What I would want is to tell Rust that a &MyDst/*mut MyDst is not a fat pointer even though MyDst is a DST (or that it is a fat pointer with 3 words, though this could maybe cause other problems with code gen, I guess).

This calculates the size+align of Arc, not of ArcInner
(which is private and a DST, so you can’t fabricate a pointer to it that way).

An alternate example would be:

#![feature(allocator_api)]
use std::boxed::Box;
use std::sync::Arc;
use std::heap::Layout;

struct Dst<T:?Sized> { x: usize, y: usize, data: T }

fn main() {
    let data: *mut str = Box::into_raw(String::from("1").into());
    let ptr = data as *mut Dst<str>;
    let layout = unsafe {Layout::for_value(&*ptr)};
    println!("{:?}", layout);  //prints a size of 24
}

Interestingly, this prints a size of 24, which is the size of Dst<()> (16) + 8. If you pass in an empty string the size is 16, and a string of 9-16 bytes gives a size of 32.

I wondered in my question how the size can fit, because an ArcInner<str> is always two pointer-sized atomics larger than what the metadata of the *mut str fat pointer we pass in says. But I became aware that the metadata only ever has to contain the size of the dynamically sized part and never the static overhead. So the layout just treats the size metadata as the size of the data field and adds the sizes of the other fields (taking padding into account) to get the size it returns. Or at least this is what I think happens.
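The rule described above can be checked against the compiler with a small sketch on stable Rust (using `std::alloc::Layout` rather than the old `std::heap` path). `manual_layout` is a hypothetical helper that recomputes the layout by hand as header size plus tail length, rounded up to the alignment. It assumes the two `usize` fields come first with no extra padding, which is not guaranteed for a default-repr struct but matches what is observed here.

```rust
use std::alloc::Layout;

#[allow(dead_code)]
struct Dst<T: ?Sized> {
    x: usize,
    y: usize,
    data: T,
}

/// Layout as reported by the compiler for a Dst<[u8]> with N tail bytes.
fn compiler_layout<const N: usize>() -> Layout {
    // Safe construction through an unsizing coercion from Dst<[u8; N]>.
    let value: Box<Dst<[u8]>> = Box::new(Dst { x: 0, y: 0, data: [0u8; N] });
    Layout::for_value(&*value)
}

/// Hand-computed layout: static header, then `len` bytes, then padding.
fn manual_layout(len: usize) -> Layout {
    let header = Layout::new::<[usize; 2]>();
    let tail = Layout::array::<u8>(len).expect("length overflows");
    let (unpadded, _data_offset) = header.extend(tail).expect("layout overflows");
    // Round the total size up to a multiple of the alignment.
    unpadded.pad_to_align()
}

fn main() {
    println!("{:?} vs {:?}", compiler_layout::<0>(), manual_layout(0));
    println!("{:?} vs {:?}", compiler_layout::<1>(), manual_layout(1));
    println!("{:?} vs {:?}", compiler_layout::<9>(), manual_layout(9));
    println!("{:?} vs {:?}", compiler_layout::<16>(), manual_layout(16));
}
```

On a 64-bit target the sizes come out as 16, 24, 32 and 32, matching the numbers reported above.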


#4

Sure, getting a confirmation from someone in the know is helpful. Rust uses sized alloc/free so allocators don’t need to maintain size information themselves (about the objects).
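For illustration, this is roughly what “sized alloc/free” looks like through the stable `std::alloc` functions: the caller hands the same `Layout` to both `alloc` and `dealloc`, so the allocator never has to be asked how large a block is. `fill_and_free` is a made-up helper for this sketch.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Allocates `len` bytes, fills them with `byte`, reads the last one back
/// and frees the block again. Returns the byte that was read.
fn fill_and_free(len: usize, byte: u8) -> u8 {
    assert!(len > 0, "alloc must not be called with a zero-sized layout");
    let layout = Layout::array::<u8>(len).expect("length overflows");
    unsafe {
        let ptr = alloc(layout);
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        ptr.write_bytes(byte, len);
        let last = ptr.add(len - 1).read();
        // The caller supplies size and alignment again on deallocation.
        dealloc(ptr, layout);
        last
    }
}

fn main() {
    println!("{}", fill_and_free(32, 0xAB));
}
```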

Ok, so you’re talking about hypothetical new types of DSTs.

Indeed, I missed that ArcInner has the T “embedded” in itself (and not behind a pointer/reference).


#5

So I might be mistaken as I’m unfamiliar with rustc codebase, but perhaps https://github.com/rust-lang/rust/blob/master/src/librustc/ty/layout.rs#L1609 is where the size is computed taking the static and dynamic parts (of a DST) into account. And the meta part of the fat pointer is computed here: https://github.com/rust-lang/rust/blob/master/src/librustc/ty/layout.rs#L1137-L1153

Would be nice to know if this is, indeed, the place or not.