# A Question on Performance

#1

While working on the nucleotide count exercise in Exercism, I noticed that there were two types of solutions.

In Option 1, the counting logic lives in the `count` method and the `nucleotide_counts` method calls `count` 4 times, once for each nucleotide.

```rust
use std::collections::HashMap;

pub fn count(to_find: char, strand: &str) -> usize {
    strand.chars().filter(|&c| c == to_find).count()
}

pub fn nucleotide_counts(strand: &str) -> HashMap<char, usize> {
    let mut h = HashMap::new();
    for acid in "ATCG".chars() {
        h.insert(acid, count(acid, strand));
    }
    h
}
```

In Option 2, the counting logic lives in `nucleotide_counts` and `count` uses `nucleotide_counts` to find the count for a single nucleotide. The argument given for this approach is that `nucleotide_counts` only has to traverse the entire strand once, so it is therefore faster.

```rust
use std::collections::HashMap;
use std::collections::hash_map::Entry;

pub fn count(to_find: char, strand: &str) -> usize {
    if let Some(x) = nucleotide_counts(strand).get(&to_find) {
        *x
    } else {
        0
    }
}

pub fn nucleotide_counts(strand: &str) -> HashMap<char, usize> {
    let mut h: HashMap<char, usize> = "ACGT".chars().map(|c| (c, 0)).collect();
    for acid in strand.chars() {
        match h.entry(acid) {
            Entry::Occupied(entry) => {
                *entry.into_mut() += 1;
            }
            Entry::Vacant(_) => {}
        }
    }
    h
}
```

But none of the tests really exercise performance: the longest strand provided is 70 characters. So I changed my test suite to test `nucleotide_counts` with a 10-million-character strand.

I then ran the test against both implementations with `time cargo test` a few times.

Option 1, which should traverse the string 4 times, takes about 3.1-3.3 seconds. Option 2, which should only traverse the string once, takes 7.4-8.0 seconds.

Any ideas why Option 2, which seems like it should be more performant, is actually slower?

#2

`cargo test` runs a debug build, which means compilation without optimizations turned on.

You can either reformulate the test as a standalone program built with `cargo build --release`, or add this to your Cargo.toml so that tests compile with optimizations on:

```toml
[profile.test]
opt-level = 3
```
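If you would rather measure outside the test harness entirely, here is a minimal standalone sketch (the strand generation and function names here are illustrative, not from the exercise); build it with `cargo build --release` and run the binary:

```rust
use std::time::Instant;

// Same counting strategy as Option 1's `count`.
fn count(to_find: char, strand: &str) -> usize {
    strand.chars().filter(|&c| c == to_find).count()
}

// Times one `count` call over a 10-million-character strand, matching
// the size used in the modified test suite, and returns the count.
pub fn bench_count() -> usize {
    let strand: String = "ACGT".chars().cycle().take(10_000_000).collect();
    let start = Instant::now();
    let n = count('A', &strand);
    eprintln!("counted {} As in {:?}", n, start.elapsed());
    n
}
```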

#3

`HashMap` lookups are not particularly cheap. For a map that only ever has four keys, all of them known beforehand, you shouldn't use a `HashMap` in any case.

#4

Thanks for the suggestions, but that's not the question I'm asking. I know there are faster ways to do this. What I'm wondering is why Option 1 is faster than Option 2. They are both running as tests, so the debug build should affect their performance equally. They are both using HashMaps, so that affects them equally too.

#5

That's not true: in Option 1, you're doing exactly 4 hashmap operations. In Option 2, you're doing 10 million.

#6

Ah, I see your point now.

Interestingly, we've also experimented with replacing `HashMap` with `BTreeMap`. After doing so, the Option 2 approach drops to about 4.7 seconds, so it clearly helps. But as you say, 4 lookups are going to beat 10 million.

So running a `filter` over 10 million characters 4 times (as I do in Option 1) is still faster than doing 10 million map lookups in a single pass. When I think of it that way, it makes more sense.

#7

Of course, the fastest approach should be to traverse the string once while performing only 4 map operations. I'd be interested in how my implementation compares: http://exercism.io/submissions/3a68dd0d5a5c423f857cc7bde792c550
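The linked submission isn't reproduced here, but a sketch of that idea, one traversal of the strand with only 4 `HashMap` inserts at the very end (names are illustrative), might look like:

```rust
use std::collections::HashMap;

// One pass over the strand, accumulating into four plain counters;
// the HashMap is only touched 4 times, when the result is built.
pub fn nucleotide_counts(strand: &str) -> HashMap<char, usize> {
    let mut counts = [0usize; 4];
    for c in strand.chars() {
        match c {
            'A' => counts[0] += 1,
            'C' => counts[1] += 1,
            'G' => counts[2] += 1,
            'T' => counts[3] += 1,
            _ => {}
        }
    }
    "ACGT".chars().zip(counts.iter().copied()).collect()
}
```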

#8

Is it OK to point out a rather silly thing? If the character is one of those letters, then `((c as u8) / 2) & 3` maps A, T, C, G to 0, 2, 1, 3. So the map can be replaced by a 4-value array, indexed by `(((c as u8) / 2) & 3) as usize`.

Also, if the strand is guaranteed to contain only the characters A, T, C, and G, then UTF-8 decoding is unnecessary, and a few milliseconds might be gained by calling `strand.bytes()` instead of `strand.chars()`.

Example code:
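A sketch of the trick described above (the function name is illustrative, and the strand is assumed to contain only A, C, G, T):

```rust
// Counts nucleotides using the (b / 2) & 3 trick, which maps the bytes
// 'A', 'C', 'T', 'G' to the indices 0, 1, 2, 3 respectively, so a plain
// 4-element array replaces the map entirely.
pub fn count_acids(strand: &str) -> [usize; 4] {
    let mut counts = [0usize; 4];
    for b in strand.bytes() {
        counts[((b / 2) & 3) as usize] += 1;
    }
    counts
}
```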

#9

To be fastest, you'd first convert any ATCG string to a `Vec` with the acids mapped to 0…3. Then instead of a `HashMap`, use `[usize; 4]` as the "map".

#10

1.05 seconds.