Bincode::serialize slow?

Timing code:

    let ocr_data_set = OcrDataSet { db: data };

    println!("encoding");
    let start = std::time::Instant::now();
    let encoded = bincode::serialize(&ocr_data_set).unwrap();
    let end = std::time::Instant::now();
    println!(
        "done: {:?} gigabytes, seconds: {:?}",
        encoded.len() as f32 / 1000. / 1000. / 1000.,
        end.duration_since(start).as_secs()
    );

Result:

encoding
done: 0.5636252 gigabytes, seconds: 39
=== done

WTF, 39 seconds to encode 0.5 GB?

Does OcrDataSet have some weird structure?

use std::collections::HashMap;

#[derive(Clone, serde::Serialize, serde::Deserialize)]
pub struct ImageF32 {
    height: usize,
    width: usize,
    data: Vec<f32>,
}

#[derive(Clone, serde::Serialize, serde::Deserialize)]
pub struct OcrDataSet {
    pub db: HashMap<String, Vec<(u32, ImageF32)>>,
}

What am I doing wrong?

Obligatory question: are you building/running your program in release mode (i.e. cargo build --release or cargo run --release)?

Other people may be able to suggest optimizations you can make, but turning on the compiler's optimizations is always the best first step.
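To make the debug/release difference harder to miss, a timing harness can report which profile it was built with. The following is a minimal sketch, assuming the default Cargo profiles (where `debug_assertions` is enabled for dev builds and disabled for release builds); `build_profile` is a hypothetical helper name, not something from the original code.

```rust
/// Returns a label for the current build profile.
/// Assumption: default Cargo profiles, where dev builds enable
/// `debug_assertions` and release builds disable them.
fn build_profile() -> &'static str {
    if cfg!(debug_assertions) {
        "debug (unoptimized)"
    } else {
        "release (optimized)"
    }
}

fn main() {
    println!("build profile: {}", build_profile());
    if cfg!(debug_assertions) {
        // Timings taken here say little about release performance.
        eprintln!("warning: timings from a debug build are not representative");
    }
}
```

For a unit test, the matching invocation would be `cargo test --release`.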

I'm running from IntelliJ by clicking on the green arrow next to a unit test. Debug vs. release may be the issue. Let me check.

EDIT: Running the unit test with --release:

encoding
done: 0.5636252 gigabytes, seconds: 1

Thanks!
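One caveat about the measurement itself: `as_secs()` truncates to whole seconds, so the reported `1` could be anything from 1.0 to just under 2.0 seconds. Here is a sketch of a timing helper that keeps fractional seconds, using only `std`. `time_it` and `gigabytes` are hypothetical helper names, and the closure is a stand-in for the `bincode::serialize` call.

```rust
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall-clock time.
fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Converts a byte count to decimal gigabytes (10^9 bytes).
fn gigabytes(bytes: usize) -> f64 {
    bytes as f64 / 1e9
}

fn main() {
    // Stand-in workload; in the code above this would be
    // `bincode::serialize(&ocr_data_set).unwrap()`.
    let (encoded, elapsed) = time_it(|| vec![0u8; 1_000_000]);
    println!(
        "done: {:.4} gigabytes, seconds: {:.3}",
        gigabytes(encoded.len()),
        elapsed.as_secs_f64()
    );
}
```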

This isn't exactly what you're asking about, but I got a helpful hint about something else you should do if you want to load that serialized data from a file, in the thread "Serde + cbor very slow when used for saving and loading game maps".

Orthogonal to the discussion here: I recently forked bincode because the previous owners are not able to dedicate significant time to it, which I am now able to do. If you have any perf or other issues, please create an issue or PR against the forked repo above and I'll look into it right away.