~/hello_cargo$ time cargo build --release
~/hello_cargo$ time ./target/release/hello_cargo
100
real 0m3.384s
user 0m3.296s
sys 0m0.088s
main.rs code:
use std::collections::HashMap;

fn main() {
    let mut i = 0;
    let mut m = HashMap::new();
    while i < 10000000 {
        let index = i % 1000000;
        let mut key = String::from("a");
        key.push_str(&index.to_string());
        if let Some(value) = m.get(&key) {
            m.insert(key, value + 10);
        } else {
            m.insert(key, 10);
        }
        i += 1;
    }
    match m.get(&String::from("a999999")) {
        Some(&number) => println!("{}", number),
        _ => println!("Don't have number."),
    }
}
main.go code:
package main

import (
    "fmt"
    "strconv"
)

func main() {
    m := make(map[string]int)
    for i := 0; i < 10000000; i++ {
        index := i % 1000000
        m["a"+strconv.Itoa(index)] += 10
    }
    fmt.Println(m["a999999"])
}
~$ time ./main
100
real 0m2.529s
user 0m2.799s
sys 0m0.064s
Significant performance improvement, thanks. Is there still room to improve performance?
~/hello_cargo$ time ./target/release/hello_cargo
100
real 0m2.609s
user 0m2.565s
sys 0m0.045s
main.rs code:
use std::collections::HashMap;

fn main() {
    let mut m = HashMap::<String, isize>::new();
    for i in 0..10000000 {
        let index = i % 1000000;
        *m.entry(format!("a{}", index)).or_default() += 10;
    }
    println!("{}", m.get("a999999").copied().unwrap_or_default())
}
~/hello_cargo$ time ./target/release/hello_cargo
100
real 0m2.475s
user 0m2.435s
sys 0m0.040s
use std::collections::HashMap;

fn main() {
    let mut m = HashMap::<String, isize>::with_capacity(1000000);
    for i in 0..10000000 {
        let index = i % 1000000;
        *m.entry(format!("a{}", index)).or_default() += 10;
    }
    println!("{}", m.get("a999999").copied().unwrap_or_default())
}
~$ time ./main
100
real 0m2.177s
user 0m2.442s
sys 0m0.064s
package main

import (
    "fmt"
    "strconv"
)

func main() {
    m := make(map[string]int, 1000000)
    for i := 0; i < 10000000; i++ {
        index := i % 1000000
        m["a"+strconv.Itoa(index)] += 10
    }
    fmt.Println(m["a999999"])
}
You might want to try different hashers, as the default secure hasher is not the fastest. The std implementation is built on the hashbrown crate, and if you use that directly you'll get a faster ahash instead. There are more hasher options on crates.io too: fnv, fxhash, xxhash, etc.
use std::fmt::Write;

fn main() {
    // Faster hasher
    // Use Box<str> to avoid storing capacity
    // use a size small enough for the values
    let mut m = fxhash::FxHashMap::<Box<str>, u32>::with_capacity_and_hasher(1000000, Default::default());
    // Reuse one string allocation across loop iterations
    let mut k = String::with_capacity(30);
    for i in 0..10000000 {
        let index = i % 1000000;
        // clear + write! is like an in-place format!
        k.clear();
        let _ = write!(k, "a{}", index);
        // avoids allocating the string 9 out of 10 times
        if let Some(v) = m.get_mut(k.as_str()) {
            *v += 10;
        } else {
            m.insert(k.clone().into_boxed_str(), 10);
        }
    }
    println!("{}", m.get("a999999").copied().unwrap_or_default())
}
FWIW, testing with hyperfine on a Skylake machine, the "boxed str" part produces no statistically significant performance improvement, so I would suggest this simplification:
use std::fmt::Write;

fn main() {
    let mut m = fxhash::FxHashMap::with_capacity_and_hasher(
        1_000_000,
        Default::default(),
    );
    let mut k = String::with_capacity(30);
    for i in 0..10_000_000 {
        k.clear();
        let _ = write!(k, "a{}", i % 1_000_000);
        if let Some(value) = m.get_mut(&k) {
            *value += 10;
        } else {
            m.insert(k.clone(), 10);
        }
    }
    println!("{}", m.get("a999999").unwrap_or(&0));
}
On my machine this is 1.55x faster than the Rust original. (Obviously I can't compare my times with the OP's in an apples-to-apples comparison.)
I've tried it on Kaby Lake and saw a significant difference. The working set with Box<str> should be smaller, so whether there's a visible difference depends on whether the hashmap fits in your cache.
My first thought was: the data is accessed in an ordered way, so let's try an (ordered) BTreeMap instead of a HashMap. It turned out that wasn't enough to beat @kornel's solution. So I thought: the strings are always small, let's use some kind of small-string optimization, and grabbed the arraystring crate. Faster, but still not there.
So as a fun experiment I decided to increase the dataset size ten times: the BTreeMap solution finished in 46 seconds and the HashMap needed 10 minutes. I guess the main conclusion from that is that CPU caches are insanely big these days. But if your value type gets more complex, the cache effect would be more visible.
BTreeMap:
use std::collections::BTreeMap;
use std::fmt::Write;

fn main() {
    // Since we're accessing elements in an ordered way, let's use an ordered collection.
    let mut m: BTreeMap<Box<str>, u32> = BTreeMap::new();
    // Reuse one string allocation across loop iterations
    let mut k = String::with_capacity(30);
    for i in 0..10000000 {
        let index = i % 1000000;
        // clear + write! is like an in-place format!
        k.clear();
        let _ = write!(k, "a{}", index);
        // avoids allocating the string 9 out of 10 times
        if let Some(v) = m.get_mut(k.as_str()) {
            *v += 10;
        } else {
            m.insert(k.clone().into_boxed_str(), 10);
        }
    }
    println!("{}", m.get("a999999").copied().unwrap_or_default())
}
BTreeMap + arraystring
use std::collections::BTreeMap;
use std::fmt::Write;
use arraystring::{ArrayString, typenum::U15};

fn main() {
    // Since we're accessing elements in an ordered way, let's use an ordered collection.
    let mut m: BTreeMap<ArrayString<U15>, u32> = BTreeMap::new();
    for i in 0..10000000 {
        let index = i % 1000000;
        let mut k = ArrayString::new();
        write!(k, "a{}", index).unwrap();
        *m.entry(k).or_default() += 10;
    }
    println!("{}", m.get("a999999").copied().unwrap_or_default())
}
Then I decided to try rolling my own specialized array string. It seems to beat kornel's solution slightly. But the HashMap also gets faster with a custom array string, so that leaves a tie, at least on my CPU. Gist