Need suggestion for reading large files

#1

Hi, I have a bunch of files with a million lines each, and I frequently need to read small chunks (~1000 lines) of them, line by line. I have tried ‘BufRead::read_line()’, which works well, but performance is critical, so I would like suggestions, especially for the case where I can calculate the indices of the first and last lines I want.

#2

Can you share the code you’re using now? As is, I don’t think I understand the desired semantics, and code will help with that.

What is your current performance? What is your performance target?

#3

Unfortunately, I don’t have access to the code until the weekend, but the code below is similar to it.

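use std::fs::File;
use std::io::{self, BufRead, BufReader};

// `read_chunk` is a placeholder name wrapping the snippet so it compiles.
fn read_chunk() -> io::Result<String> {
    let start = 2000;
    let end = 3000;

    let file = File::open("./file.txt")?;
    let mut reader = BufReader::new(file);
    let mut buf = String::new();
    // Skip the lines before `start`, reusing the same buffer.
    for _ in 0..start {
        reader.read_line(&mut buf)?;
        buf.clear();
    }
    // Append lines `start..=end` to `buf`.
    for _ in start..=end {
        reader.read_line(&mut buf)?;
    }

    Ok(buf)
}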

The performance target is as fast as possible without using unsafe code.

Edit: How do we highlight syntax?

#4

That looks just fine. If you don’t care about using String and can instead use Vec<u8>, then you could use read_until to avoid UTF-8 validation.
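To make the read_until suggestion concrete, here is a minimal sketch of the same skip-then-collect loop working on bytes. The function name `read_line_range` and the choice to return an empty buffer when the file ends before `start` are my own assumptions, not something from the original code.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: returns lines `start..=end` (0-based, inclusive)
/// as raw bytes, so no UTF-8 validation is performed.
fn read_line_range<R: BufRead>(mut reader: R, start: usize, end: usize) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();

    // Skip the lines before `start`, reusing one buffer.
    for _ in 0..start {
        if reader.read_until(b'\n', &mut buf)? == 0 {
            // Reached end of input before `start` (assumed behaviour: empty result).
            return Ok(Vec::new());
        }
        buf.clear();
    }

    // Append the wanted lines, stopping early at end of input.
    for _ in start..=end {
        if reader.read_until(b'\n', &mut buf)? == 0 {
            break;
        }
    }

    Ok(buf)
}
```

You would call it as `read_line_range(BufReader::new(File::open("./file.txt")?), 2000, 3000)`. If you later need text, you can validate only the returned chunk with `String::from_utf8`.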

However, I don’t think you’ve provided enough details here. You said you need to do this frequently. Is that because the file is mutating? If so, it might be faster to maintain an index that is updated after every write. Just depends on your workload.
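As a rough illustration of the index idea, the sketch below scans a file once to record the byte offset where each line starts, then uses a seek plus one read to fetch a range of lines. The names `build_line_index` and `read_indexed_range` are hypothetical, and this assumes the file does not change between building the index and reading; keeping the index up to date after writes is not shown.

```rust
use std::io::{self, BufRead, Read, Seek, SeekFrom};

/// Hypothetical helper: byte offset of the start of every line,
/// plus one final entry holding the total length in bytes.
fn build_line_index<R: BufRead>(mut reader: R) -> io::Result<Vec<u64>> {
    let mut offsets = Vec::new();
    let mut pos = 0u64;
    let mut buf = Vec::new();
    loop {
        buf.clear();
        let n = reader.read_until(b'\n', &mut buf)?;
        if n == 0 {
            break; // end of input
        }
        offsets.push(pos);
        pos += n as u64;
    }
    offsets.push(pos); // sentinel: end of the last line
    Ok(offsets)
}

/// Hypothetical helper: reads lines `start..=end` (0-based, inclusive)
/// with a single seek and a single read, using the index built above.
fn read_indexed_range<R: Read + Seek>(
    mut src: R,
    index: &[u64],
    start: usize,
    end: usize,
) -> io::Result<Vec<u8>> {
    let line_count = index.len().saturating_sub(1);
    if start >= line_count || start > end {
        return Ok(Vec::new());
    }
    let last = end.min(line_count - 1); // clamp to the last existing line
    let from = index[start];
    let to = index[last + 1];

    let mut out = vec![0u8; (to - from) as usize];
    src.seek(SeekFrom::Start(from))?;
    src.read_exact(&mut out)?;
    Ok(out)
}
```

The index costs 8 bytes per line (about 8 MB for a million lines), and it only pays off if the same file is read many times. Whether it helps in your case depends on the workload.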

I don’t think it’s possible to meaningfully answer such a generic question unfortunately.
