Timing code:
let ocr_data_set = OcrDataSet { db: data };
println!("encoding");
let start = std::time::Instant::now();
let encoded = bincode::serialize(&ocr_data_set).unwrap();
let end = std::time::Instant::now();
println!(
    "done: {:?} gigabytes, seconds: {:?}",
    encoded.len() as f32 / 1000. / 1000. / 1000.,
    end.duration_since(start).as_secs()
);
Result:
encoding
done: 0.5636252 gigabytes, seconds: 39
=== done
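A side note on the measurement itself: `as_secs()` truncates to whole seconds, so short runs are hard to compare. Below is a minimal sketch of a timing helper that reports fractional seconds instead. The names `time_it` and `gigabytes` are made up for illustration, and the closure is a stand-in workload; in the post above it would be the `bincode::serialize` call.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `f` once and returns its result together
/// with the elapsed wall-clock time.
fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Converts a byte count to decimal gigabytes (10^9 bytes), as in the post.
fn gigabytes(bytes: usize) -> f64 {
    bytes as f64 / 1e9
}

fn main() {
    // Stand-in workload (assumption): allocate a 1 MB buffer.
    let (encoded, took) = time_it(|| vec![0u8; 1_000_000]);
    println!(
        "done: {:.4} gigabytes, seconds: {:.3}",
        gigabytes(encoded.len()),
        took.as_secs_f64() // fractional seconds, not truncated
    );
}
```

A single run like this is still only a rough measurement, but it is enough to tell a 39-second run from a 1-second one.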
WTF -- 39 seconds to encode 0.5 GB?
Does OcrDataSet have some weird structure?
use std::collections::HashMap;

#[derive(Clone, serde::Serialize, serde::Deserialize)]
pub struct ImageF32 {
    height: usize,
    width: usize,
    data: Vec<f32>,
}

#[derive(Clone, serde::Serialize, serde::Deserialize)]
pub struct OcrDataSet {
    pub db: HashMap<String, Vec<(u32, ImageF32)>>,
}
What am I doing wrong?
Obligatory question: are you building/running your program in release mode (i.e. cargo build --release or cargo run --release)?
Other people may be able to suggest optimizations you can do, but turning on the compiler's optimizations is always the best first step.
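If it is not obvious which profile an IDE or test runner is using, the program can report it itself. This is a minimal sketch, assuming the default Cargo profiles, where `debug_assertions` is on for dev/test builds and off with `--release` (a custom profile can override that). The function name `build_kind` is made up for illustration.

```rust
/// Hypothetical helper: reports which kind of build is running, based on
/// `debug_assertions` (on by default in dev/test, off with `--release`).
fn build_kind() -> &'static str {
    if cfg!(debug_assertions) {
        "debug (unoptimized)"
    } else {
        "release (optimized)"
    }
}

fn main() {
    println!("build: {}", build_kind());
    if cfg!(debug_assertions) {
        // Timings taken in this configuration say little about release speed.
        eprintln!("warning: timings from a debug build are not representative");
    }
}
```

Printing this next to the timing output makes it harder to mistake a debug-build number for the real one.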
I'm running from IntelliJ by clicking on the green arrow next to a unit test. Debug vs. release may be the issue. Let me check.
EDIT: Running the unit test with --release:
encoding
done: 0.5636252 gigabytes, seconds: 1
Thanks!
This isn't exactly what you're asking for, but I got a helpful hint for something else you should do if you want to load that serialized data from a file: Serde + cbor very slow when used for saving and loading game maps - #4 by Nemo157
Orthogonal to the discussion here: I recently forked bincode, as the previous owners are not able to dedicate significant time to it, which I am now able to do. If you have any perf or other issues, please create an issue or PR against the forked repo above and I'll look into it right away.