The KV question - how to store a vec of keys and values

I have just been thinking about this question, and wanted to check if my reasoning is right.

I think it depends on both the size and alignment of K and V.

For example if K has size and align 1, and V has size 8 and align 8, you use less memory by storing them as two vecs.

But if for example K has size and align 1, and V has size 7 and align 8, well it is better to store them as a single (K,V) vec. Or a (V,K) vec, does it matter???

Is this right? I am not used to thinking about alignment.

Indeed there are multiple options depending on sizes, alignment, as well as what memory locality you want.

For example if you expect to iterate over values without looking at the keys, then Vec<V> will be better.

But if you query keys and values at random, from large sets, then reading from both Vec<K> and Vec<V> will double number of potential cache misses.

If you put both K and V in a struct, then Rust will reorder them to minimize the struct size.

If you expect large alignment to be wasteful, you could use #[repr(packed)], but then field access will require copying.

3 Likes

I am talking about Layout in std::alloc - Rust here, and K and V are generic, so could each be any size or align. I just programmed a Vec type that store them in two arrays, like this:


use std::marker::PhantomData;

/// Vector of (key,value) pairs, keys stored separately from values for cache efficient search.
pub struct PairVec<K, V> {
    p: NonNull<u8>,
    len: u16,      // Current length
    alloc: u16,    // Allocated
    capacity: u16, // Maximum capacity
    _pd: PhantomData<(K, V)>,
}

impl<K, V> Default for PairVec<K, V> {
    fn default() -> Self {
        Self::new(0)
    }
}

impl<K, V> Drop for PairVec<K, V> {
    fn drop(&mut self) {
        while self.len != 0 {
            self.pop();
        }
        self.trim();
    }
}

impl<K, V> PairVec<K, V> {
    /// ...
    pub fn new(capacity: usize) -> Self {
        let capacity = capacity as u16;
        Self {
            p: NonNull::dangling(),
            len: 0,
            alloc: 0,
            capacity,
            _pd: PhantomData,
        }
    }

    /// ...
    pub fn len(&self) -> usize {
        self.len as usize
    }

    /// ...
    pub fn is_empty(&self) -> bool {
        self.len == 0
    }

    /// ...
    pub fn full(&self) -> bool {
        self.len == self.capacity
    }

    /// ...
    pub fn cap(&self) -> usize {
        self.capacity as usize
    }

    #[inline]
    unsafe fn layout(amount: usize) -> (Layout, usize) {
        let layout = Layout::array::<K>(amount).unwrap();
        let (layout, off) = layout.extend(Layout::array::<V>(amount).unwrap()).unwrap();
        (layout, off)
    }

    ///
    pub fn trim(&mut self) {
        self.alloc(self.len());
    }

    /// ...
    pub fn allocate(&mut self, mut amount: usize) {
        if amount > self.capacity as usize {
            amount = self.capacity as usize;
        }
        self.alloc(amount);
    }

    /// ...
    fn alloc(&mut self, amount: usize) {
        unsafe {
            let (old_layout, old_off) = Self::layout(self.alloc as usize);

            let np = if amount == 0 {
                NonNull::dangling()
            } else {
                let (layout, off) = Self::layout(amount);
                let np = alloc::alloc(layout);
                let np = match NonNull::new(np.cast::<u8>()) {
                    Some(np) => np,
                    None => alloc::handle_alloc_error(layout),
                };

                // Copy keys and values from old allocation to new allocation.
                if self.len > 0 {
                    let from = self.p.as_ptr().cast::<K>();
                    let to = np.as_ptr().cast::<K>();
                    ptr::copy_nonoverlapping(from, to, self.len as usize);

                    let from = self.p.as_ptr().add(old_off).cast::<V>();
                    let to = np.as_ptr().add(off).cast::<V>();
                    ptr::copy_nonoverlapping(from, to, self.len as usize);
                }
                np
            };

            // Free the old allocation.
            if self.alloc > 0 {
                alloc::dealloc(self.p.as_ptr(), old_layout);
            }
            self.alloc = amount as u16;
            self.p = np;
        }
    }

    /// ...
    pub fn split_off(&mut self, at: usize, r: usize) -> Self {
        safe_assert!(at <= self.len());
        safe_assert!(r <= 1);
        let len = self.len() - at;
        let mut result = Self::new(self.capacity as usize);
        result.allocate(len + r * 5);
        unsafe {
            let (kf, vf) = self.ixmp(at);
            let (kt, vt) = result.ixmp(0);
            ptr::copy_nonoverlapping(kf, kt, len);
            ptr::copy_nonoverlapping(vf, vt, len);
        }
        result.len = len as u16;
        self.len -= len as u16;
        self.allocate(self.len() + (1 - r)*5);
        result
    }

    /// ...
    pub fn insert(&mut self, at: usize, (key, value): (K, V)) {
        safe_assert!(self.len < self.capacity);
        safe_assert!(at <= self.len());
        if self.alloc == self.len {
            self.allocate(self.len() + 5);
        }
        unsafe {
            let n = self.len() - at;
            let (kp, vp) = self.ixmp(at);
            if n > 0 {
                ptr::copy(kp, kp.add(1), n);
                ptr::copy(vp, vp.add(1), n);
            }
            kp.write(key);
            vp.write(value);
            self.len += 1
        }
    }

    /// ...
    pub fn remove(&mut self, at: usize) -> (K, V) {
        safe_assert!(at < self.len());
        unsafe {
            let n = self.len() - at - 1;
            let (kp, vp) = self.ixmp(at);
            let result = (kp.read(), vp.read());
            if n > 0 {
                ptr::copy(kp.add(1), kp, n);
                ptr::copy(vp.add(1), vp, n);
            }
            self.len -= 1;
            result
        }
    }

    /// ...
    pub fn push(&mut self, (key, value): (K, V)) {
        safe_assert!(self.len < self.capacity);
        if self.alloc == self.len {
            self.allocate(self.len() + 5);
        }
        unsafe {
            let (kp, vp) = self.ixmp(self.len());
            kp.write(key);
            vp.write(value);
            self.len += 1;
        }
    }
    /// ...
    pub fn pop(&mut self) -> Option<(K, V)> {
        unsafe {
            if self.len == 0 {
                return None;
            }
            self.len -= 1;
            let (kp, vp) = self.ixmp(self.len());
            Some((kp.read(), vp.read()))
        }
    }

    /// Same as `binary_search_by`, but for some obscure reason this seems to be faster.
    #[inline]
    pub fn search<F>(&self, f: F) -> Result<usize, usize>
    where
        F: FnMut(&K) -> Ordering,
    {
        self.search_to(self.len(), f)
    }

    /// ...
    pub fn search_to<F>(&self, n: usize, mut f: F) -> Result<usize, usize>
    where
        F: FnMut(&K) -> Ordering,
    {
        safe_assert!(n <= self.len());
        let (mut i, mut j) = (0, n);
        while i < j {
            let m = (i + j) / 2;
            match f(self.ixk(m)) {
                Ordering::Equal => return Ok(m),
                Ordering::Less => i = m + 1,
                Ordering::Greater => j = m,
            }
        }
        Err(i)
    }

    /// ...
    pub fn retain<F>(&mut self, mut f: F)
    where
        F: FnMut(&K, &mut V) -> bool,
    {
        unsafe {
            let mut i = 0;
            let mut r = 0;
            while i < self.len() {
                let (k, v) = self.ixm(i);
                if f(k, v) {
                    if r != i {
                        let kv = self.get(i);
                        self.set(r, kv);
                    }
                    r += 1;
                } else {
                    self.get(i);
                }
                i += 1;
            }
            self.len -= (i - r) as u16;
            self.trim();
        }
    }

    #[inline]
    unsafe fn get(&mut self, i: usize) -> (K, V) {
        let (kp, vp) = self.ixmp(i);
        (kp.read(), vp.read())
    }

    #[inline]
    unsafe fn set(&mut self, i: usize, (k, v): (K, V)) {
        let (kp, vp) = self.ixmp(i);
        kp.write(k);
        vp.write(v);
    }

    #[inline]
    pub unsafe fn ixmp(&mut self, i: usize) -> (*mut K, *mut V) {
        let (_, off) = Self::layout(self.alloc as usize);
        let kp = self.p.as_ptr().cast::<K>().add(i);
        let vp = self.p.as_ptr().add(off).cast::<V>().add(i);
        (kp, vp)
    }

    #[inline]
    pub unsafe fn ixp(&self, i: usize) -> (*const K, *const V) {
        let (_, off) = Self::layout(self.alloc as usize);
        let kp = self.p.as_ptr().cast::<K>().add(i);
        let vp = self.p.as_ptr().add(off).cast::<V>().add(i);
        (kp, vp)
    }

    #[inline]
    pub fn ixk(&self, i: usize) -> &K {
        safe_assert!(i < self.len());
        unsafe {
            let kp = self.p.as_ptr().cast::<K>().add(i);
            &*kp
        }
    }

    #[inline]
    pub fn ixmv(&mut self, i: usize) -> &mut V {
        safe_assert!(i < self.len());
        unsafe {
            let (_kp, vp) = self.ixmp(i);
            &mut *vp
        }
    }

    #[inline]
    pub fn ix(&self, i: usize) -> (&K, &V) {
        safe_assert!(i < self.len());
        unsafe {
            let (kp, vp) = self.ixp(i);
            (&*kp, &*vp)
        }
    }

    #[inline]
    pub fn ixm(&mut self, i: usize) -> (&K, &mut V) {
        safe_assert!(i < self.len());
        unsafe {
            let (kp, vp) = self.ixmp(i);
            (&*kp, &mut *vp)
        }
    }

    #[inline]
    pub fn ixbm(&mut self, i: usize) -> (&mut K, &mut V) {
        safe_assert!(i < self.len());
        unsafe {
            let (kp, vp) = self.ixmp(i);
            (&mut *kp, &mut *vp)
        }
    }

    pub fn iter(&self) -> IterPairVec<K, V> {
        IterPairVec {
            v: self,
            ix: 0,
            ixb: self.len(),
        }
    }
    /// ...
    pub fn range(&self, x: usize, y: usize) -> IterPairVec<K, V> {
        safe_assert!(x <= y && y <= self.len());
        IterPairVec {
            v: self,
            ix: x,
            ixb: y,
        }
    }
    /// ...
    pub fn iter_mut(&mut self) -> IterMutPairVec<K, V> {
        let ixb = self.len();
        IterMutPairVec {
            v: self,
            ix: 0,
            ixb,
        }
    }
    /// ...
    pub fn range_mut(&mut self, x: usize, y: usize) -> IterMutPairVec<K, V> {
        safe_assert!(x <= y && y <= self.len());
        IterMutPairVec {
            v: self,
            ix: x,
            ixb: y,
        }
    }
    /// ...
    pub fn into_iter(self) -> IntoIterPairVec<K, V> {
        let ixb = self.len();
        IntoIterPairVec {
            v: self,
            ix: 0,
            ixb,
        }
    }
}

impl<K, V> fmt::Debug for PairVec<K, V>
where
    K: Debug,
    V: Debug,
{
    fn fmt(&self, fmt: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt.debug_map()
            .entries(self.iter().map(|(k, v)| (k, v)))
            .finish()
    }
}

#[derive(Debug, Clone)]
pub struct IterPairVec<'a, K, V> {
    v: &'a PairVec<K, V>,
    ix: usize,
    ixb: usize,
}
impl<'a, K, V> IterPairVec<'a, K, V> {
    pub fn len(&self) -> usize {
        self.ixb - self.ix
    }
}
impl<'a, K, V> Iterator for IterPairVec<'a, K, V> {
    type Item = (&'a K, &'a V);
    fn next(&mut self) -> Option<Self::Item> {
        unsafe {
            if self.ix == self.ixb {
                return None;
            }
            let (kp, vp) = self.v.ixp(self.ix);
            self.ix += 1;
            Some((&*kp, &*vp))
        }
    }
}
impl<'a, K, V> DoubleEndedIterator for IterPairVec<'a, K, V> {
    fn next_back(&mut self) -> Option<Self::Item> {
        unsafe {
            if self.ix == self.ixb {
                return None;
            }
            self.ixb -= 1;
            let (kp, vp) = self.v.ixp(self.ixb);
            Some((&*kp, &*vp))
        }
    }
}

#[derive(Debug)]
pub struct IterMutPairVec<'a, K, V> {
    v: &'a mut PairVec<K, V>,
    ix: usize,
    ixb: usize,
}
impl<'a, K, V> IterMutPairVec<'a, K, V> {
    pub fn len(&self) -> usize {
        self.ixb - self.ix
    }
}
impl<'a, K, V> Iterator for IterMutPairVec<'a, K, V> {
    type Item = (&'a K, &'a mut V);
    fn next(&mut self) -> Option<Self::Item> {
        unsafe {
            if self.ix == self.ixb {
                return None;
            }
            let (kp, vp) = self.v.ixmp(self.ix);
            self.ix += 1;
            Some((&mut *kp, &mut *vp))
        }
    }
}
impl<'a, K, V> DoubleEndedIterator for IterMutPairVec<'a, K, V> {
    fn next_back(&mut self) -> Option<Self::Item> {
        unsafe {
            if self.ix == self.ixb {
                return None;
            }
            self.ixb -= 1;
            let (kp, vp) = self.v.ixmp(self.ixb);
            Some((&mut *kp, &mut *vp))
        }
    }
}

#[derive(Debug)]
pub struct IntoIterPairVec<K, V> {
    v: PairVec<K, V>,
    ix: usize,
    ixb: usize,
}
impl<K, V> IntoIterPairVec<K, V> {
    pub fn len(&self) -> usize {
        self.ixb - self.ix
    }
}
impl<K, V> Iterator for IntoIterPairVec<K, V> {
    type Item = (K, V);
    fn next(&mut self) -> Option<Self::Item> {
        unsafe {
            if self.ix == self.ixb {
                return None;
            }
            let (kp, vp) = self.v.ixmp(self.ix);
            self.ix += 1;
            Some((kp.read(), vp.read()))
        }
    }
}
impl<K, V> DoubleEndedIterator for IntoIterPairVec<K, V> {
    fn next_back(&mut self) -> Option<Self::Item> {
        unsafe {
            if self.ix == self.ixb {
                return None;
            }
            self.ixb -= 1;
            let (kp, vp) = self.v.ixmp(self.ixb);
            Some((kp.read(), vp.read()))
        }
    }
}
impl<K, V> Drop for IntoIterPairVec<K, V> {
    fn drop(&mut self) {
        while self.ix != self.ixb {
            self.next();
        }
        self.v.len = 0;
    }
}

#[test]
fn test_kv() {
    let mut v = PairVec::new(4);
    v.push((2, 2));
    v.insert(0, (1, 1));
    println!("ixm(0) = {:?}", v.ixm(0));
    println!("ixm(1) = {:?}", v.ixm(1));

    for kv in v.iter_mut() {
        println!("kv={:?}", kv);
    }

    //println!("pop = {:?}", v.pop().unwrap());
    //println!("pop = {:?}", v.pop().unwrap());
    println!("remove(0)= {:?}", v.remove(0));

    for kv in v.into_iter() {
        println!("kv={:?}", kv);
    }
}

Rust makes these layout choices after monomorphization, so it can optimize for the specific K and V in each case.

1 Like

I am now thinking if I use

and

these are const functions, so any conditional code using them will be "zero-cost" (or something,... ), eliminated at compile time. And I think I have two cases to consider, use two arrays or use (K,V). Not sure yet how this will work out though.

Another advantage of separate vectors is searching through the Ks may be more efficient, because they are more densely packed, improving cache locality. (But once you have found the right key, fetching its associated V will be reading from another heap allocation, so it may be worse for small collections.)

But if for example K has size and align 1, and V has size 7 and align 8,

This case cannot happen, because in Rust, the size of a value is defined to be always a multiple of its alignment. Therefore, a [T] slice, or any container that stores one such as a Vec<T>, is always maximally dense — there are no gaps to squeeze in another value, unless that additional value is zero-sized. (No, you can't stuff a value into the padding bytes of another value, either.)

3 Likes

I think that this might be technically possible if you really work at it, but you’d need to be working with non-generic #[repr(C)] types and use some unusual unsafe code.

At a minimum, you’d need to hand-code every write of the outer type to not touch the padding bytes that are aliased with the inner type. Similarly, if you ever produce an &mut to the outer type, you’ll need to assume that the inner type has been overwritten with uninit by the time the borrow expires.

1 Like

Hmm, so does this mean you never save any space by using one rather than two arrays? At this point I am definitely not thinking well about this!

By using a single vector, you save a small amount of space on having one instead of two

  • heap allocations tracked by the allocator, and
  • Vec values themselves (pointer, length, capacity),

but you cannot save any space via packing multiple things into one struct/tuple as the vector's element types, and might waste space on additional padding in that tuple.

[Edited for clarity]

1 Like

That’s not necessarily true: The compiler may need to add padding bytes to a compound type (K,V) that aren’t necessary for Ks and Vs stored in separate allocations. If K has alignment 4 while V has size 2, for example, (K,V) will include 2 padding bytes to ensure that the Ks are always properly aligned.

If they’re stored in separate arrays, however, the Vs will be tightly packed without the extra padding.

1 Like

I am only allocating once ( with the layout including two arrays), and there is only one vec length. So two arrays seems mostly best, although when increasing the allocation, realloc doesn't work so well ( hard to know how much this matters, as realloc usually has to move everything anyway (presumably!!), I guess, in most cases. I am not using realloc as that could mean some of the data has to move twice ).

Although... I am not sure about this, but there may be a minor loss of performance which I don't yet understand.

I was responding to the question "you never save any space by using one rather than two arrays?". The answer to that is always “correct, you cannot” (regarding the layout of the slice itself). But I agree that it would have been better to make it more explicit, and also say “…but you might waste space by doing that”, as you pointed out. (Edited the previous post.)

2 Likes

I made a little playground test showing how two arrays can be smaller than one:

fn main() {
   type A = (u16,u8);
   type B = u8;
   type C = [(A,B);10];
   type D = ( [A;10], [B;10] );
   
    println!( "{} {} {} {}", 
       std::mem::size_of::<A>(), 
       std::mem::size_of::<B>(),
       std::mem::size_of::<C>(), 
       std::mem::size_of::<D>()
    )
}

Output:

4 1 60 50

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.