I have a binary file that stores m * n numbers; it is essentially a table, and every number is 4 bytes. What is the fastest way to read the file and push the numbers into a Vec?
I have an implementation, but it is slow. The pseudocode looks like this:
let mut f = File::open("path").unwrap();
let mut buf = vec![0; 100000];
f.read_exact(&mut buf).unwrap();
let mut idx = 0;
let mut data = vec![0f32; 25000];
while idx < buf.len(){
let mut cur = std::io::Cursor::new(&buf[idx..(idx+4)]);
let n = cur.read_f32::<byteorder::ByteOrder>().unwrap();
data[idx/4] = n;
idx += 4;
}
Obligatory question: are you running your code with optimizations turned on (e.g. cargo run --release)?
If there are still performance problems, decreasing the number of bounds checks might help. For example, you could iterate over (chunk, dest) in buf.chunks_exact(4).zip(&mut data) with a for loop that calls …byteorder….read_f32(chunk) and writes the result to *dest.
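A minimal sketch of that loop, reusing the buf and data variables from your snippet and assuming the file is little-endian (the endianness is my assumption, since the format hasn't been stated):

use byteorder::{ByteOrder, LittleEndian};

// buf: Vec<u8> with the raw file contents, data: Vec<f32> already sized to buf.len() / 4
for (chunk, dest) in buf.chunks_exact(4).zip(&mut data) {
    // chunk is always exactly 4 bytes long here, so this read cannot panic
    *dest = LittleEndian::read_f32(chunk);
}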
I am not running in release mode, but I added [profile.test] with opt-level = 3 to Cargo.toml.
I also compared it with numpy.fromfile, which is faster than the code above.
You can do this instead. I'm not sure it will actually improve performance, but it should be at least as fast as your code.
let mut f = File::open("path").unwrap();
let mut buf = vec![0; 100000];
f.read_exact(&mut buf).unwrap();
let data: Vec<_> = buf.chunks(4).map(|s| f32::from_be_bytes(s.try_into().unwrap())).collect();
Just a few notes: what's the endianness of your file? byteorder::ByteOrder is not a valid generic parameter to the .read_f32 method; you need a concrete byte order such as LittleEndian or BigEndian. If it's marginally slower than the numpy approach, it's possible that numpy is doing some extra cheating, like mmap-ing the file and using it directly as an f32 array. In that case the actual file loading is done lazily by the OS and the copy is skipped.
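For illustration, here is a rough sketch of that mmap trick in Rust, assuming the memmap2 and bytemuck crates and a file in native byte order (the crates and the sum_mapped function are my choices for the example, not something from this thread):

use std::fs::File;

fn sum_mapped(path: &str) -> f32 {
    let file = File::open(path).expect("could not open file");
    // Safety: the file must not be modified or truncated while it is mapped.
    let map = unsafe { memmap2::Mmap::map(&file).expect("mmap failed") };
    // The mapping is page-aligned, so viewing its bytes as f32s is fine;
    // cast_slice panics if the length is not a multiple of 4.
    let floats: &[f32] = bytemuck::cast_slice(&map[..]);
    floats.iter().sum()
}

The OS then pages the file in on demand, and no explicit copy into a Vec happens at all.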
Apart from that, you can read the data directly into an f32 buffer like this:
let mut buf = vec![0f32; 25000];
f.read_exact(as_byte_array(&mut buf)).unwrap();
// buf now contains f32 values from the file
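If you pull in the zerocopy crate, as_byte_array can be a thin safe wrapper. This is a sketch against the 0.7-era AsBytes API; the exact trait and method names depend on the zerocopy version, so treat them as an assumption:

use zerocopy::AsBytes;

// f32 implements both AsBytes and FromBytes, so a &mut [f32] can be viewed
// as its raw bytes without writing any unsafe code ourselves.
fn as_byte_array(floats: &mut [f32]) -> &mut [u8] {
    floats.as_bytes_mut()
}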
You can also implement it without using the zerocopy crate, but then you need unsafe:
fn as_byte_array(floats: &mut [f32]) -> &mut [u8] {
let len = floats.len();
let ptr = floats.as_mut_ptr();
// Safety: The pointer is valid for 4*len bytes since a f32 is four bytes,
// and the alignment is also okay since u8 has a smaller alignment than
// f32.
unsafe {
std::slice::from_raw_parts_mut(ptr.cast(), 4*len)
}
}
I believe this method is more flexible.
If the number size is 3 bytes, do I just need to change 4*len to 3*len in from_raw_parts_mut? Would the result still be correct?
And thank you for your answer!
Your parsing code is way too complicated. There is no point in using a Cursor if you're not doing random-access operations on the buffer, and you're just reading a single number. You can read it directly. Moreover, there is no point in manually slicing the buffer and potentially incurring indexing costs when a slice can be directly used as a reader. Here is how I would write your code:
let file = std::fs::read("path").expect("could not read path");
let mut buf = &file[..];
let mut data = Vec::with_capacity(buf.len() / std::mem::size_of::<f32>());
while let Ok(n) = buf.read_f32::<LE>() {
    data.push(n);
}
You can't just change it to 3*len in my method: each f32 must still occupy four bytes once you are done manipulating the bytes, so you would need additional modification of the data after reading it.
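For what it's worth, here is a hedged sketch of handling 3-byte values, under my own assumption (not stated in this thread) that they are little-endian 24-bit signed integers; you widen each value yourself instead of reinterpreting the buffer in place:

let raw = std::fs::read("path").expect("could not read path");
let data: Vec<i32> = raw
    .chunks_exact(3)
    .map(|c| {
        // assemble the little-endian 24-bit value...
        let v = (c[0] as i32) | ((c[1] as i32) << 8) | ((c[2] as i32) << 16);
        // ...then sign-extend it from 24 to 32 bits
        (v << 8) >> 8
    })
    .collect();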
For completeness, here is the byteorder-based reader from above as one full snippet, with the import included:

use byteorder::{ReadBytesExt, LE};
let buf = std::fs::read("path").expect("could not read path");
let mut data = Vec::with_capacity(buf.len() / std::mem::size_of::<f32>());
let mut reader = buf.as_slice();
while let Ok(n) = reader.read_f32::<LE>() {
data.push(n);
}