What does Rc<[T]> map to at the RcBox layer?

RcBox looks like:

#[repr(C)]
struct RcBox<T: ?Sized> {
    strong: Cell<usize>,
    weak: Cell<usize>,
    value: T,
}

What does Rc<[T]> map to at the RcBox layer ?

Context: I'm trying to generalize my toy rc_arena from supporting only Rc<T> to also supporting Rc<[T]>. This can be achieved by building a number of separator allocators of size [T; 2], [T; 4], [T; 8], ... and taking a [T] and rounding up the length to nearest power of 2.

So my issue here is now this. Suppose we have a x: [T] where x.len() = n, and n rounds up to 2^k -- so we take a RcBox<[T; 2^k]> ... but how do we get a RcBox<[T] of len n> out of it?

The length must match exactly to make that conversion.

However, there's a conversion from Vec<T> to Rc<[T]>. Does that work for you?

What do you mean by "map to"?

@alice
Here is the XY problem. Say we have T: clone, and the following func

impl My_Allocator {
  pub fn my_alloc(x: &[T]) -> My_Rc<[T]> {
    if x.len() <= 2 {
      get a My_RcBox<[T; 2]>; build a My_Rc<[T]>
    } else if x.len() <= 4 {
      get a My_RcBox<[T; 4]>; build a My_Rc<[T]>
    } else if x.len() <= 8 {
      get a My_RcBox<[T; 8]>; build a My_Rc<[T]>
    }
    ... else if x.len() <= 1024 {
      get a My_RcBox<[T; 1024]>; build a My_Rc<[T]>
    } else {
        hit the global allocator, return a My_Rc<[T]>
    }
  }
}

@steffahn

So a Rc<[X; N]> ends up being a

 ptr: NonNull<RcBox<T>>, // w T = [X; N]

and the RcBox looks like a

pub struct RcBox {
  strong: Cell<usize>,
  weak: Cell<usize>,
  value: [X; N]
}

Now, what does a Rc<[T]> look like at the RcBox layer ?

The RcBox will have the same representation as it does with [T; N].

2 Likes

Sorry, I'm completely missing something here. Are the following statements correct ?

  1. [T; N] has size known at compile time
  2. [T] has size known only at runtime.
  3. Therefore, [T; N] an just be a block of memory of size (N * sizeof_layout(T))
  4. [T] has to somewhere store a length: usize field.

How would a Rc<[T]> be laid out in memory? Does it map to a RcBox<[T]> ? If so, how is a RcBox<[T]> laid out in memory ?

To add more questions without providing any answers:

We're writing a custom allocator, so somewhere we have a ChunkList<T> which has within it a Vec<T> and a Vec<Vec<T>>

So we can store Vec<RcBox<[T; 2]>> , Vec<RcBox<[T; 4]>>, Vec<RcBox<[T; 8]>>, ... but afaik we can not store a RcBox<[T]>

So from our storage, we can get objects of type RcBox<[T; 2^k]> but we need to return something of type Rc<[T]> and it is not clear to me how to make this transition.

Yes, and that somewhere is the metadata field of the wide pointer that points to the [T]. In the case of RcBox<[T]>, which is itself unsized, the metadata is lifted to be contained within the pointer to RcBox<[T]> instead.

The NonNull pointer itself will be a fat pointer consisting of a pointer and a length. So the length is stored in the Rc, and not in the RcBox.

I'm not disputing with your technical accuracy. I just don't see the "mechanics" of how this works. Can you point me at Rust code ?

The context is: I have toy allocator that returns Rc<T>. I want to extend it to return Rc<[T]>. I can get it to return Rc<[T; 2^k]> ; but I don't see the mechanics of how to get this to return a Rc<[T]>.

Maybe take a look at this recent topic.

I'm sorry, I'm missing something very obvious. Jumping around in IntelliJ, I get this for NonNull

#[cfg_attr(not(test), rustc_diagnostic_item = "Rc")]
#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_insignificant_dtor]
pub struct Rc<T: ?Sized> {
    ptr: NonNull<RcBox<T>>,
    phantom: PhantomData<RcBox<T>>,
}

#[stable(feature = "nonnull", since = "1.25.0")]
#[repr(transparent)]
#[rustc_layout_scalar_valid_range_start(1)]
#[rustc_nonnull_optimization_guaranteed]
pub struct NonNull<T: ?Sized> {
    pointer: *const T,
}

So our Rc<[T]> becomes a NonNull<RcBox<[T]>> which then contains pointer: *const RcBox<[T]>. Where is the size stored? Where is the fat pointer ?

The *const T will be a fat pointer.

A raw pointer is fat whenever the type is unsized. So it's part of the *const T.

You can try comparing size_of::<*const u32>() with size_of::<*const [u32]>().

1 Like

On that note, my post there mentions CoerceUnsized; the unstable nature of CoerceUnsized is also a reason why – unfortunately – it’s not possible to offer the full same flexibility as Rc or Box does in terms of unsizing coercions for custom user types. So your Rc clone for an arena’d Rc version would be slightly limited around ease of constructing it. Some crates offering rc/arc alternatives use a crate called unsized as a dependency to aid with offering at least some level of support for coercions akin to Rust’s built-in unsizing coercions.

Is this helpful? The information (last I checked), that currently the only way to actually instantiate unsized structs fully dynamically in some cases is carefully and manually with unsafe code.

1 Like

For example, the implementation of From<Box<T>> for Rc<T> in the standard library would presumably be a good example for such a construction; it obtains the (potentially) unsized type’s layout via Layout::for_value, and invokes some helper function that turns that further into the correct layout for RcBox<T>, which then is allocated. It uses ptr::write to initialize the field of the reference counters, and ptr::copy_nonoverlapping to copy over the right amount of bytes for the actual value of T. Finally it turns the original Box<T> into a Box<ManuallyDrop<T>> (via into_raw/from_raw) and drops that.

1 Like

@gretchenfrage @steffahn : Great, I think this is pointing me in the right direction with regards to RcBox<[T; 2^k]> -> RcBox<[T]>. I'm going go try some code and followup on this.

1 Like

Is this code correct ?

pub struct Foo<T: Clone, const N: usize> {
    // programmer guarantees that data[0 .. len-1] are initialized
    len: usize,
    data: [std::mem::MaybeUninit<T>; N],
}

impl<T: Clone, const N: usize> Drop for Foo<T, N> {
    fn drop(&mut self) {
        self.clear();
    }
}

impl<T: Clone, const N: usize> Foo<T, N> {
    fn clear(&mut self) {
        for i in 0..self.len {
            unsafe {
                std::ptr::drop_in_place(self.data.as_mut_ptr().add(i));
            }
        }
        self.len = 0;
    }

    pub fn copy(&mut self, t: &[T]) {
        assert!(t.len() <= N);
        self.clear();
        for i in 0..t.len() {
            self.data[i].write(t[i].clone());
        }
        self.len = t.len();
    }

    pub fn to_slice(&self) -> &[T] {
        unsafe { std::mem::transmute(&self.data[0..self.len]) }
    }
}

There are two uses of unsafe: clear() uses drop_in_place and to_slice uses it to transmute.

It is programmers job to ensure that the first self.len elements are initialized.

.to_slice() allows us to go from &[T; 2^k] -> &[T]
drop/clear makes sure we call drop handler on all initialized elements
.copy drops old values + copies in new slice

Is this correct, or is there some lurking UB ?