How to provide a slice-like, contiguous view of disjoint slices

I’d like to implement a struct ContiguousView that, given a number of slices, can be used similarly to a slice or a Vec to give a contiguous view of the slices, without copying the contents of those slices.

Here’s a test method that maybe illustrates what I’m looking for:

struct ContiguousView<'a> {
    slices: Vec<&'a [u8]>,
}

fn test_view_range() {
    let s0 = "abc".as_bytes();
    let s1 = "xyz".as_bytes();
    let v = ContiguousView::new(vec![s0, s1]);
    assert_eq!("abc".as_bytes(), &v[0..3]);
    assert_eq!("xyz".as_bytes(), &v[3..6]);
    assert_eq!("ab".as_bytes(), &v[0..2]);
    assert_eq!("yz".as_bytes(), &v[4..6]);
    assert_eq!("cx".as_bytes(), &v[2..4]);
    assert_eq!("abcxyz".as_bytes(), &v[0..6]);
}

Again, once created, I want to use v like a slice and only allocate and copy the underlying contents if absolutely necessary.

If it makes things easier, let’s assume the slices are immutable.

What would be the best way to go about this?

You can’t present them as a slice because a slice, by definition, is a contiguous range of memory; the individual slices you’re holding in the slices field are not guaranteed to be adjacent to one another.

Do you need a slice interface however? You can represent this via an iterator over the data, where the iterator knows the slice boundaries and can jump around them internally - this would still be zero-copy.
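To make that concrete, here is a minimal sketch of such an iterator. The name ByteIter and its fields are made up for illustration; it only assumes you have a list of byte slices to walk.

```rust
/// Hypothetical iterator: walks several byte slices as if they were one.
pub struct ByteIter<'s, 'a> {
    slices: &'s [&'a [u8]],
    slice_idx: usize, // which slice we are currently in
    offset: usize,    // position inside that slice
}

impl<'s, 'a> ByteIter<'s, 'a> {
    pub fn new(slices: &'s [&'a [u8]]) -> Self {
        ByteIter { slices, slice_idx: 0, offset: 0 }
    }
}

impl<'s, 'a> Iterator for ByteIter<'s, 'a> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        // Skip over exhausted (or empty) slices until a byte is available.
        while let Some(current) = self.slices.get(self.slice_idx) {
            if let Some(&b) = current.get(self.offset) {
                self.offset += 1;
                return Some(b);
            }
            // Crossed a slice boundary: move on to the next slice.
            self.slice_idx += 1;
            self.offset = 0;
        }
        None
    }
}
```

Only the two indices are stored, so nothing from the underlying slices is copied until the caller decides to collect the bytes somewhere.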

Thanks! I don’t need them to be slices; I just want them to be usable in the same way: I want to index them, slice them further (i.e. use Range as an index), iterate over them easily, etc.

Can you give me some pointers to exactly which traits I’d want to implement for the ContiguousView struct itself and for the associated iterator struct?

The returned iterator would implement Iterator<Item=u8> in this case. As for the API, I think you might want to look at something like:

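// it's not really a "view" anymore, but ok
use std::ops::{Bound, RangeBounds};
impl<'a> ContiguousView<'a> {
    pub fn iter_range<R: RangeBounds<usize>>(&self, r: R) -> impl Iterator<Item = u8> + '_ {
        // `start` tells you the type of bound (inclusive, exclusive) and the value
        let start = match r.start_bound() {
            Bound::Included(&s) => s,
            Bound::Excluded(&s) => s + 1,
            Bound::Unbounded => 0,
        };
        // ditto for `end`
        let end = match r.end_bound() {
            Bound::Included(&e) => e + 1,
            Bound::Excluded(&e) => e,
            Bound::Unbounded => usize::MAX,
        };
        // now use these to tell your iterator impl where to start and stop producing values
        self.slices.iter().flat_map(|s| s.iter().copied()).skip(start).take(end.saturating_sub(start))
    }
}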

You can also look at some of the other std::ops::Range* types if you want to simplify things a bit and accept only inclusive/exclusive ranges by design. RangeBounds is the most flexible for the caller, however.

Maybe you’re looking for a rope, a large buffer that can be inserted into / removed from in the middle, implemented as a tree of arrays.

Not sure what you mean by rope. And no, I don’t really want to mutate the slices/view.

The main use would be to read into slices from IO (e.g. TCP), identify message boundaries within those slices, and then provide “views” of complete, “contiguous” messages.

As I understand, though, with your solution, I couldn’t do

// say v is a ContiguousView struct...
for b in v {
  // do things
}

but rather, I’d have to do

for b in v.iter_range(0..v.len()) {
  // do things
}

which is really quite different from how you would treat a slice.

You could make ContiguousView implement IntoIterator, which would return the same iterator as the other method with .. as the range (i.e. RangeFull).
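A rough sketch of what that could look like, implemented for a reference to the view so that iterating doesn't consume it. The ViewIter alias and the collect_all helper are names invented for this example, and the iterator is built from std adaptors rather than a hand-written struct:

```rust
use std::iter::{Copied, Flatten};
use std::slice;

pub struct ContiguousView<'a> {
    slices: Vec<&'a [u8]>,
}

impl<'a> ContiguousView<'a> {
    pub fn new(slices: Vec<&'a [u8]>) -> Self {
        ContiguousView { slices }
    }
}

/// Hypothetical alias for the concrete iterator type: copy each `&[u8]`
/// out of the Vec, flatten them, then copy each byte.
pub type ViewIter<'s, 'a> = Copied<Flatten<Copied<slice::Iter<'s, &'a [u8]>>>>;

impl<'s, 'a> IntoIterator for &'s ContiguousView<'a> {
    type Item = u8;
    type IntoIter = ViewIter<'s, 'a>;

    fn into_iter(self) -> Self::IntoIter {
        // Equivalent to iterating over the full range (`..`).
        self.slices.iter().copied().flatten().copied()
    }
}

/// Example use: the `for` loop reads the same as it would for `&[u8]`,
/// except that it yields `u8` rather than `&u8`.
pub fn collect_all(view: &ContiguousView<'_>) -> Vec<u8> {
    let mut out = Vec::new();
    for b in view {
        out.push(b);
    }
    out
}
```

With this in place, `for b in &v { ... }` works directly; only pointers to the slices are copied, never their contents.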

For this type of use case, take a look at the bytes crate (if you haven’t yet). It’s used by tokio to accumulate received bytes and then you can split off a full frame; the split off piece points back to the underlying buffer, so no payload is copied.

There’s also the iovec crate that supports vectored IO (I can’t quite tell the IO mechanics you’re using so this may be irrelevant).

Not sure if either of these makes your life easier, but they are some prior art to consider.

Thanks, I will look into these.

I actually have implemented IntoIterator and got it working.

What I’m having issues with the most right now is implementing Index for ContiguousView. The return value should be an iterator but I can’t allocate a new iterator struct in fn index because of lifetimes.

Yes, this is somewhat of a known issue with Index: the return type is defined as a borrow from self, which precludes returning a newly created value. There’s no really good solution for this, apart from offering a method-based API (so you lose the indexing sugar) similar to what I suggested upthread.
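As one possible shape for such a method-based API, here is a sketch that returns a Cow instead of a reference. The get_range name and its Option return are my own assumptions, not anything from std: it borrows when the requested range falls inside a single underlying slice and copies only when the range crosses a slice boundary.

```rust
use std::borrow::Cow;
use std::ops::Range;

pub struct ContiguousView<'a> {
    slices: Vec<&'a [u8]>,
}

impl<'a> ContiguousView<'a> {
    pub fn new(slices: Vec<&'a [u8]>) -> Self {
        ContiguousView { slices }
    }

    /// Total number of bytes across all slices.
    pub fn len(&self) -> usize {
        self.slices.iter().map(|s| s.len()).sum()
    }

    /// Hypothetical stand-in for `&v[range]`.
    /// Returns `None` if the range is reversed or out of bounds.
    pub fn get_range(&self, range: Range<usize>) -> Option<Cow<'a, [u8]>> {
        if range.start > range.end || range.end > self.len() {
            return None;
        }
        let mut base = 0; // logical index of the first byte of `s`
        for &s in &self.slices {
            let end_of_s = base + s.len();
            if range.start >= base && range.end <= end_of_s {
                // The whole range lives in one slice: borrow, no copy.
                return Some(Cow::Borrowed(&s[range.start - base..range.end - base]));
            }
            base = end_of_s;
        }
        // The range spans more than one slice, so a copy is unavoidable.
        let owned: Vec<u8> = self
            .slices
            .iter()
            .flat_map(|s| s.iter().copied())
            .skip(range.start)
            .take(range.end - range.start)
            .collect();
        Some(Cow::Owned(owned))
    }
}
```

Since Cow<[u8]> derefs to [u8], the result can be passed wherever a &[u8] is expected, e.g. `&*v.get_range(2..4).unwrap()`.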