In a codebase where we're trying to keep everything performant (even the smallest part) we also want to be readable by a new in our team.
But some days ago, the following code modification was refused. I don't understand why, and I remain without answers. Maybe the Rust Community can help me to understand? =-) I'm curious to know the reactions out of context!
If you are curious to know what the following code is used for, it's very simple, we have to limit the amount of objects in a Map (we use a Map and not a Queue for the fast access performance). The function describe an insertion of
bar
for eachid
inbar_ids
and we pop those with the less chance to be used when we reach themax_known_bar
(basically the oldest).
I changed the names of the originals variable, so the link betweenbar_ids
andbar
can be strange out of context. But the idea remains that
Before:
pub fn insert_and_tail1(
&mut self,
bar_ids: &[BarId],
val: bool,
bar: Bar,
max_known_bar: usize,
) {
for id in bar_id {
self.known_bars.insert(*id, (val, bar));
}
while self.known_bars.len() > max_known_bar {
// remove the oldest
let (&h, _) = self
.known_bars
.iter()
.min_by_key(|(h, (_, t))| (*t, *h))
.unwrap();
self.known_bars.remove(&h);
}
}
After refactoring:
pub fn insert_and_tail2(
&mut self,
bar_ids: &[BarId],
val: bool,
bar: Bar,
max_known_bar: usize,
) {
for id in bar_ids {
self.known_bars.insert(*id, (val, bar));
}
let len = self.known_bars.len();
if len == max_known_bar + 1 {
/* Fast case where there is just one to pop */
self.pop();
return;
}
let overflow = len - max_known_bar;
if overflow >= len.log(2) as usize {
self.known_bars
.iter()
.map(|(h, (_, t))| (*h, *t))
.sorted()
.take(overflow)
.for_each(|(id, _)| {
self.known_bars.remove(&id);
});
} else { /* In that case the time complexity in more likelly O(N) */
while self.known_bars.len() > max_known_bar {
self.pop()
}
}
}
The second one may be more cryptic and hard to understand. But seems also to be faster with various inputs after some Criterion measures (we have most of the time a length of bar_ids == 1
). Now, you know everything.
Which one do you prefer in your codebase? Don't hesitate to say if both are horribly designed or if you find a better way to do that!