Try running it through a profiler (you can enable debug info in release builds, or optimization in debug builds; see Cargo profiles). Rust works with the same profilers as C.
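For example, a minimal `Cargo.toml` tweak for either direction (the profile names and keys here are standard Cargo settings):

```toml
# Keep debug info in optimized builds so the profiler can map
# samples back to source lines.
[profile.release]
debug = true

# Or, conversely, optimize the dev profile so debug-build timings
# are representative.
[profile.dev]
opt-level = 3
```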
I'm using msgpack, which is sort of similar, and it's nearly instant for 500MB files.
I would guess that practically all of the time is spent in HashMap insertion. Practically none of it should be due to bincode, MessagePack, or string allocations, which take almost no time at this scale.
I would recommend benchmarking in a way that allows you to compare deserialization time against the time for purely inserting the same data (already in memory) into an equivalent hash map. Example:
```rust
use serde::{Serialize, Deserialize};
use std::collections::HashMap as Map;
use std::time::Instant;

#[derive(Debug, Serialize, Deserialize)]
struct WordU32Freq1Entry {
    idx: u32,
    cnt: u32,
}

#[derive(Debug, Serialize, Deserialize)]
struct WordU32Freq1 {
    storage: Map<String, WordU32Freq1Entry>,
}

fn main() {
    // Build a 6-million-entry map, then serialize it.
    let mut freq = WordU32Freq1 {
        storage: Map::default(),
    };
    for i in 0..6_000_000 {
        let entry = WordU32Freq1Entry { idx: i, cnt: i };
        freq.storage.insert(i.to_string(), entry);
    }
    let bytes = bincode::serialize(&freq).unwrap();
    assert_eq!(bytes.len(), 136_888_898);

    // Time a full deserialization from the in-memory buffer.
    let start_deserialization = Instant::now();
    let freq: WordU32Freq1 = bincode::deserialize(&bytes).unwrap();
    println!("deserialization: {:?}", start_deserialization.elapsed());

    // Time only re-inserting the same (already deserialized) data.
    let entries: Vec<_> = freq.storage.into_iter().collect();
    let start_insertions = Instant::now();
    let mut storage: Map<String, WordU32Freq1Entry> = Map::default();
    for (key, value) in entries {
        storage.insert(key, value);
    }
    println!("insertions only: {:?}", start_insertions.elapsed());
}
```
You should find that those times are almost the same, which means insertion is taking almost all the time.
I would expect a 2-3x speedup on this workload (depending on your exact distribution of key sizes) if you don't need resistance to hash-collision attacks and can switch to fnv or another fast hasher.