Is my Vec<u8> line-splitter-cache module safe?

My program works roughly like this:

let mut current_file_contents = Vec::<u8>::new();
let mut current_file_name = PathBuf::new();
for (file_name, interesting_line_num) in input_data_iter {
    if current_file_name != file_name {
        current_file_contents.clear();
        File::open(file_name).unwrap().read_to_end(&mut current_file_contents).unwrap();
        current_file_name = file_name.clone();
    }
    handle_interesting_line(&current_file_contents, interesting_line_num);
}

Where handle_interesting_line splits current_file_contents into lines and looks around the interesting_line_num for certain patterns.

After profiling my program (AMD uProf rocks btw), I noticed most of it was in the line splitting (The files may be pretty big and have lots of interesting lines, thus O(N^2))

I tried to store a Vec<&[u8]> which was the output of current_file_contents.lines(), but encountered borrow-checker problems because it cannot prove I won't invalidate current_file_contents without updating my line map to match.

So I wrote this struct that supposedly ensures all updates to the backing Vec cause an update of the line map, and it works (about 2x speedup and no crashes). Is the code really safe? Am I missing something?

A solution that turns my input iterator into an Iter<(PathBuf, Iter<InterestingLine>)> is also acceptable, which would sidestep the problem entirely :slight_smile:

No it's unsound. Here's an example which triggers the Use-after-free situation.

let mut cache = BytesSplitCache::new();
cache.update(|buf| buf.extend_from_slice("foo\nbar\nbaz".as_bytes());
let mut lines: Vec<&'static [u8]> = Vec::new();
cache.view_lines(|slice| lines = slice.into_iter().collect());
cache.update(|_| {});
println!("prev lines: {:?}", lines); // UB!!!
2 Likes

Good point, How do you constrain slice to the view_lines callback then? Can you create a lifetime just for the closure?

EDIT: Figured it out:

Needed to declare view_lines as

pub fn view_lines<F>(&self, cb: F) where F: for<'a> FnOnce(&[&'a [u8]]) { ... }

instead.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.