Help me write this in safe Rust

So I'm working on a "Rope" data structure, which is a "chunked" vector that allows (relatively) quick inserts and deletes.

It looks like this:

#[derive(Clone, Debug)]
struct RopeChunk<T: Copy> {
    idx: usize,
    content: Vec<T>,
}

#[derive(Clone, Debug)]
pub struct Rope<T: Copy> {
    len: usize,
    chunks: Vec<RopeChunk<T>>,
}

Easy peasy.

I need a method with this signature:

pub fn get_mut<U: RangeBounds<usize>>(&mut self, range: U) -> Vec<&mut [T]> 

Many of you can now imagine my suffering.

Here is what I came up with:

    pub fn get_mut<U: RangeBounds<usize>>(&mut self, range: U) -> Vec<&mut [T]> {
        let (start, end) = range_helper(range, self.len);
        if start == end {
            Vec::new()
        } else {
            let mut result = Vec::new();
            let chunks = self.get_chunks(start, end);
            let (_, main_split) = self.chunks.split_at_mut(chunks[0].0);
            let mut len = main_split.len();
            let mut remaining_slices = main_split.as_mut_ptr();

            for (ch_idx, ch_start, ch_end) in chunks {
                let cur;
                let next;
                unsafe {
                    cur = slice::from_raw_parts_mut(remaining_slices, 1)
                        .get_mut(0)
                        .unwrap();
                    next = slice::from_raw_parts_mut(remaining_slices.offset(1), len);
                }
                len += 1;
                remaining_slices = next.as_mut_ptr();
                result.push(&mut cur.content[ch_start..ch_end]);
            }
            result
        }
    }

The self.getChunks(...) method figures out which RopeChunks we need, along with what sub-ranges within them.

I tried doing this with slice::split_at_mut(), but I kept getting all kinds of borrow errors, since I needed to split it repeatedly in a loop. The borrow checker didn't like that, not at all. Anyhow, I gave up and used pointers.

I've tested this a fair bit, even at the boundaries. It seems to work.

But still...

But still...

Can this be done in safe Rust?

Can you do something like this?

let mut ret = vec![];
for chunk in &mut self.chunks {
    /* chunk index bookkeeping */

    if /*chunk intersects range*/ {
        ret.push(&mut chunk.content[/*chunk-range intersection*/])
    }
}
ret

The chunks Vec can get pretty long relative to the length of a range, so that would kill my big-oh behavior.

Oh, I realized my first draft (above) was pretty silly. I simplified it a lot:

    pub fn get_mut<U: RangeBounds<usize>>(&mut self, range: U) -> Vec<&mut [T]> {
        let (start, end) = range_helper(range, self.len);
        if start == end {
            Vec::new()
        } else {
            let mut result = Vec::new();
            let chunks = self.get_chunks(start, end);

            let mut ptr: *mut RopeChunk<T>;
            unsafe {
                ptr = self.chunks.as_mut_ptr().offset(chunks[0].0 as isize);
            }

            for (ch_idx, ch_start, ch_end) in chunks {
                let cur;
                unsafe {
                    cur = &mut *ptr;
                    ptr = ptr.offset(1);
                }
                result.push(&mut cur.content[ch_start..ch_end]);
            }
            result
        }
    }

I guess I don't really understand exactly how your data structure works, but don't you need to sum the lengths of all preceding chunks until you have found the right one? What is the implementation of get_chunks?

Right now it's this (although I plan to tweak it a lot):

    fn lookup_chunk_idx(&self, idx: usize) -> Result<usize, usize> {
        self.chunks.binary_search_by(|r| {
            if idx < r.idx {
                Ordering::Greater
            } else if idx >= r.idx + r.content.len() {
                Ordering::Less
            } else {
                Ordering::Equal
            }
        })
    }

    fn get_chunks(&self, start: usize, end: usize) -> Vec<(usize, usize, usize)> {
        debug_assert!(start < end);
        let start_chunk_idx = self.lookup_chunk_idx(start).unwrap();
        let end_chunk_idx = self.lookup_chunk_idx(end - 1).unwrap();

        let mut result = Vec::with_capacity(1 + end_chunk_idx - start_chunk_idx);
        for ch_idx in start_chunk_idx..=end_chunk_idx {
            let chunk = &self.chunks[ch_idx];
            let chunk_start = chunk.idx;
            let chunk_end = chunk.idx + chunk.content.len();
            let result_start = start.max(chunk_start);
            let result_end = end.min(chunk_end);
            result.push((ch_idx, result_start - chunk.idx, result_end - chunk.idx));
        }
        result
    }

The idx field of each Chunk is it's starting index, which has to be maintained by any modifications.

Ok how about this?

let mut ret = vec![];
let start_chunk_idx = /* binary search */
for chunk in &mut self.chunks[start_chunk_idx..] {
    /* chunk index bookkeeping */

    if /*chunk intersects range*/ {
        ret.push(&mut chunk.content[/*chunk-range intersection*/])
    } else {
        break;
    }
}
ret
1 Like

That looks promising. I'll try it out.

Yay!!!!!

Here it is:

    pub fn get_mut<U: RangeBounds<usize>>(&mut self, range: U) -> Vec<&mut [T]> {
        let (start, end) = range_helper(range, self.len);
        if start == end {
            Vec::new()
        } else {
            let mut result = Vec::new();
            let chunks = self.get_chunks(start, end);

            let first_chunk = chunks.first().unwrap().0;
            let last_chunk = chunks.last().unwrap().0;

            let mut ch_i = 0;

            for ch in &mut self.chunks[first_chunk..=last_chunk] {
                let (_, start, end) = chunks[ch_i];
                ch_i += 1;
                result.push(&mut ch.content[start..end]);
            }

            result
        }
    }
1 Like

Thanks, it's working nice. And safe :grinning: