Vec<T> vs (Vec<usize>, Vec<T>) for eady-heavy, 32 bytes <= T <= 128 bytes

Suppose we have a Vec<T> where 32 bytes <= T <= 128 bytes, and our access is read heavy.

Two ways of representing this is:

pub struct Direct {
  data: Vec<T>
}

pub struct InDirect {
  idxs: Vec<u16>, // T takes on <= 2 ^16 vaules
  data: Vec<T>
}

when processing all elements, our two choices are:

impl Direct {
  pub fn do_stuff(&self) {
    for x in self.data { ... }
  }
}

impl InDirect {
  pub fn do_stuff(&self) {
    for x in self.idxs {
      let v = self.data[x as usize]; ...
    }
  }
}

In practice of memory access, is Direct (reading memory straight through) strictly better than InDirect (reads one vec straight through, jumps randomly in other thread); or this is a pedantic detail that rarely matters in practice ?

The memory subsystem in your CPU will notice sequential reads of memory, and will prefetch appropriately to avoid the read lag (as much as it can). So it'll be O(N) time to read N items.

If you read all N items in a random order instead, then it'll take about O(N √N) time to read all of them.

where "random" means "something that a realistic memory system will not be able to predict".

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.