Thin `Vec` representation

marcianx · April 13, 2020, 2:58am

I was thinking about minimizing the size of enums containing Vecs, and it just occurred to me: what's the disadvantage of storing the size and capacity fields of the standard library's Vec in the heap-allocated part? Specifically, Vec could be defined as (ignoring NonNull for simplicity):

struct Vec<T> {
  ptr: *const (),
  _marker: PhantomData<T>,
}

with the ptr being allocated with

an alignment of max(align_of::<usize>(), align_of::<T>()), and
an initial padding of max(2 * size_of::<usize>(), align_of::<T>()) bytes, used to store size and cap.

Then Vec would be the same size as Box. This would shrink Vec and String to a third of their current size (one word instead of three) and allow them to be passed in a single register.
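
To make the arithmetic above concrete, here is a minimal sketch of how the block layout could be computed with std::alloc::Layout. The function name thin_block_layout and the type Wide are made up for illustration, and the header is assumed to be just two usize values (len, then cap) at the start of the block.

```rust
use std::alloc::Layout;

/// Layout of one heap block holding `[len, cap]` followed by `cap` elements.
/// Returns the block layout and the byte offset of the first element.
/// (Hypothetical helper; the header is assumed to be two `usize` words.)
fn thin_block_layout<T>(cap: usize) -> (Layout, usize) {
    let header = Layout::new::<[usize; 2]>();
    let elems = Layout::array::<T>(cap).expect("capacity overflow");
    // `extend` inserts whatever padding is needed so the elements are aligned.
    let (block, offset) = header.extend(elems).expect("capacity overflow");
    (block.pad_to_align(), offset)
}

/// A type whose alignment exceeds two words, to exercise the padding rule.
#[allow(dead_code)]
#[repr(align(64))]
struct Wide([u8; 64]);
```

For u8 elements the offset comes out as 2 * size_of::<usize>(), and for Wide it comes out as 64, which matches the max(...) rule for the initial padding. The block alignment likewise matches max(align_of::<usize>(), align_of::<T>()).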

The slight disadvantages that come to mind are:

The pointer would always require an alignment of at least align_of::<usize>().

len() would require a branch:

if self.is_empty() { // self.ptr == 0x1
    0
} else {
    unsafe { ptr::read(self.ptr as *const usize) }
}
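
For reference, a rough sketch of what such a one-word vector might look like end to end. Everything here is an assumption for illustration: the name ThinVecSketch, the Header struct, the growth factor, and the use of Option<NonNull<_>> (None) in place of a sentinel pointer value for the empty case. It is not the standard library's design and not the API of any existing crate.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, realloc, Layout};
use std::marker::PhantomData;
use std::ptr::{self, NonNull};

/// Bookkeeping stored at the start of the heap block (hypothetical).
#[repr(C)]
struct Header {
    len: usize,
    cap: usize,
}

/// One-word handle; `None` plays the role of the "empty" sentinel pointer.
pub struct ThinVecSketch<T> {
    ptr: Option<NonNull<Header>>,
    _marker: PhantomData<T>,
}

impl<T> ThinVecSketch<T> {
    pub fn new() -> Self {
        ThinVecSketch { ptr: None, _marker: PhantomData }
    }

    /// Block layout for `cap` elements, plus the byte offset of element 0.
    fn layout(cap: usize) -> (Layout, usize) {
        let elems = Layout::array::<T>(cap).expect("capacity overflow");
        let (block, offset) = Layout::new::<Header>().extend(elems).expect("capacity overflow");
        (block.pad_to_align(), offset)
    }

    /// The branch from the post: no allocation means 0, otherwise read the header.
    pub fn len(&self) -> usize {
        match self.ptr {
            None => 0,
            Some(p) => unsafe { (*p.as_ptr()).len },
        }
    }

    pub fn capacity(&self) -> usize {
        match self.ptr {
            None => 0,
            Some(p) => unsafe { (*p.as_ptr()).cap },
        }
    }

    /// Pointer to element 0 of an allocated block.
    fn data(p: NonNull<Header>) -> *mut T {
        unsafe { (p.as_ptr() as *mut u8).add(Self::layout(0).1) as *mut T }
    }

    pub fn push(&mut self, value: T) {
        let (len, cap) = (self.len(), self.capacity());
        if len == cap {
            // Grow: first allocation holds 4 elements, then capacity doubles.
            let new_cap = if cap == 0 { 4 } else { cap * 2 };
            let (new_layout, _) = Self::layout(new_cap);
            let raw = unsafe {
                match self.ptr {
                    None => alloc(new_layout),
                    Some(p) => realloc(p.as_ptr() as *mut u8, Self::layout(cap).0, new_layout.size()),
                }
            };
            let p = NonNull::new(raw as *mut Header)
                .unwrap_or_else(|| handle_alloc_error(new_layout));
            unsafe { p.as_ptr().write(Header { len, cap: new_cap }) };
            self.ptr = Some(p);
        }
        let p = self.ptr.expect("block allocated above");
        unsafe {
            Self::data(p).add(len).write(value);
            (*p.as_ptr()).len = len + 1;
        }
    }

    pub fn get(&self, index: usize) -> Option<&T> {
        if index < self.len() {
            // A non-zero length implies the block exists.
            self.ptr.map(|p| unsafe { &*Self::data(p).add(index) })
        } else {
            None
        }
    }
}

impl<T> Drop for ThinVecSketch<T> {
    fn drop(&mut self) {
        if let Some(p) = self.ptr {
            let (len, cap) = (self.len(), self.capacity());
            unsafe {
                // Drop the live elements, then free the whole block.
                ptr::drop_in_place(ptr::slice_from_raw_parts_mut(Self::data(p), len));
                dealloc(p.as_ptr() as *mut u8, Self::layout(cap).0);
            }
        }
    }
}
```

Note that every len() call on a non-empty vector dereferences the heap pointer, which is exactly the extra memory access the layout trades for the smaller handle.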

Given that this is not how Vecs are typically implemented in systems languages, is there some major disadvantage that eludes me?

Hyeonu · April 13, 2020, 3:04am

Cache misses and branches are the main targets of performance micro-optimization. One thing to note is that with this approach every .len() call may trigger a cache miss.

marcianx · April 13, 2020, 3:12am

Thanks for the reply.

I can imagine that would only really be an issue when accessing length without intending to access the vec itself?
Still, if len is the major thing to optimize for, one could compromise by storing the size inline and the cap on the heap-allocated part, reducing Vec from 3 words to 2. Is there still a major disadvantage of this approach?

cuviper · April 13, 2020, 3:16am

There's a ThinVec like that in thincollections.

mbrubeck · April 13, 2020, 4:10am

https://crates.io/crates/smallbitvec is a specialized container that also uses this layout. It's used in Firefox and Servo.

kornel · April 14, 2020, 11:39am

BTW, if you don't need the Vec to grow, then Box<[T]> is a bit thinner: only ptr + size.
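
A small sketch of that conversion, assuming the vector is finished growing. The helper names freeze and handle_sizes are made up for illustration; into_boxed_slice is the standard library method, and it may reallocate to drop any excess capacity.

```rust
use std::mem::size_of;

/// Converts a finished Vec into a boxed slice, giving up the capacity field.
fn freeze<T>(v: Vec<T>) -> Box<[T]> {
    v.into_boxed_slice()
}

/// Handle sizes in bytes: (Vec<u32>, Box<[u32]>).
fn handle_sizes() -> (usize, usize) {
    (size_of::<Vec<u32>>(), size_of::<Box<[u32]>>())
}
```

On current compilers handle_sizes() reports three words for the Vec and two for the boxed slice (pointer plus length).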

