It's not a question about Rust per se, but I hope someone will be able to help me.
Division instructions are relatively slow, so compilers commonly replace integer division by a constant with multiplication and shifts as can be seen here, i.e. it replaces val % 13111
with val - 1311 * (val * 1341788273) / 2^44
. It works because 2^44/13111
is equal to 1341788272.78..
and both expressions produce the same result accounting for rounding down integer division. But it's not that trivial, since the alternative expression should produce the same result for all possible val
s, e.g. division by 11111
looks like this.
In my code I have roughly this situation:
struct Buf {
// Length of `buf` does not change and it's smaller than 2^32
buf: Vec<Entry>,
}
fn scan_buf(buf: &Buf, mut pos: u64) {
loop {
let idx = (pos as usize) % buf.buf.len();
if process_entry(&buf.buf[idx]) {
break;
}
pos += 1;
}
}
I would like to replace the division with a sequence of multiplications and shifts similarly to what compiler does for constants. Since the buf
field does not change after struct initialization and I loop a lot over it, it's worth to pay additional runtime cost during Buf
initialization to improve performance of the hot loop.
Maybe there is a Rust crate which implements this optimization? If not, could you point me toward a full description of the optimization which accounts for all potential corner cases so I cuold implement it myself?