I was trying to determine whether a plain for loop is faster than the iterator map approach. I used Advent of Code 2023, Day 1 as the workload and criterion to measure the runtime.
I was flabbergasted by the results:
Test 1: for loop: runtime X, taken as the 100% baseline.
Test 2: map: runs 10% slower (perhaps to be expected).
Test 3: the same for loop runs 10% slower when an additional function that uses map is present in the code.
It seems like the for loop is automatically rewritten to the map logic.
You can see this in the godbolt output for the function 'process': compare it with and without the function 'process_map' in the source. It compiles to very different assembly.
In my simple mind there should be no difference between Test 1 and Test 3, as the exact same code is used.
Looking at the godbolt assembly output, with compiler arguments -C opt-level=3, the same function compiles to different code IF another function is present. The assembly for the for loop then refers to map, even though map does not appear anywhere in that function's source.
Questions:
a) How is this possible?
b) Can I prevent it?
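For anyone who wants to reproduce the comparison without criterion, here is a rough std-only sketch. It is not the benchmark that produced the numbers above: `time_per_call` is a hypothetical helper name, and it does no warm-up or statistical analysis, so its output is only a coarse indication. `std::hint::black_box` is used so the optimizer cannot see through the input or discard the result.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of criterion): average wall-clock time
/// of one call to `f(input)`, measured over `iters` calls.
pub fn time_per_call<F>(f: F, input: &str, iters: u32) -> Duration
where
    F: Fn(&str) -> String,
{
    assert!(iters > 0, "iters must be non-zero");
    let start = Instant::now();
    for _ in 0..iters {
        // black_box hides the input and the result from the optimizer,
        // so the call cannot be hoisted out of the loop or removed.
        black_box(f(black_box(input)));
    }
    start.elapsed() / iters
}
```

Assuming the two functions from this post are in scope, `time_per_call(process, input, 10_000)` and `time_per_call(process_map, input, 10_000)` would give the two averages to compare.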
// This uses a simple for loop instead of map. The assembly does not contain 'map' if the function process_map is not part of the code.
pub fn process(input: &str) -> String {
    let mut result_sum: u32 = 0;
    for line in input.lines() {
        // find first digit in line
        let mut value: u8 = 0;
        for byte in line.bytes() {
            // assume ASCII
            if byte <= b'9' && byte >= b'1' {
                value = (byte - b'0') * 10;
                break;
            }
        }
        // find last digit in line
        // since digit was found, no more validation
        for byte in line.bytes().rev() {
            // assume ASCII
            if byte <= b'9' && byte >= b'1' {
                value += byte - b'0';
                break;
            }
        }
        result_sum += value as u32;
    }
    // dbg!(result_sum);
    result_sum.to_string()
}
pub fn process_map(input: &str) -> String {
    let result_sum: u32 = input.lines()
        .map(|line| {
            let mut value: u8 = 0;
            for byte in line.bytes() {
                // assume ASCII
                if byte <= b'9' && byte >= b'1' {
                    value = (byte - b'0') * 10;
                    break;
                }
            }
            // find last digit in line
            // since digit was found, no more validation
            for byte in line.bytes().rev() {
                // assume ASCII
                if byte <= b'9' && byte >= b'1' {
                    value += byte - b'0';
                    break;
                }
            }
            value as u32
        })
        .sum();
    // dbg!(result_sum);
    result_sum.to_string()
}
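To confirm that the loop and map variants really compute the same thing, a condensed restatement can be checked against the worked example from the puzzle text, whose expected sum is 142. The names `is_digit`, `line_value`, `sum_with_loop` and `sum_with_map` are hypothetical and not part of the code above. Factoring the per-line logic into a shared helper may itself change the generated assembly, so this sketch is meant for checking results, not for benchmarking.

```rust
/// Same digit test as in the post: ASCII '1' through '9'.
fn is_digit(b: &u8) -> bool {
    (b'1'..=b'9').contains(b)
}

/// Hypothetical helper: two-digit value built from the first and last
/// digit of a line, or 0 if the line contains no digit.
fn line_value(line: &str) -> u8 {
    let first = line.bytes().find(is_digit).map_or(0, |b| (b - b'0') * 10);
    let last = line.bytes().rfind(is_digit).map_or(0, |b| b - b'0');
    first + last
}

/// Outer iteration written as a for loop.
pub fn sum_with_loop(input: &str) -> u32 {
    let mut total: u32 = 0;
    for line in input.lines() {
        total += u32::from(line_value(line));
    }
    total
}

/// Outer iteration written with map and sum.
pub fn sum_with_map(input: &str) -> u32 {
    input.lines().map(|line| u32::from(line_value(line))).sum()
}
```

Both functions should return 142 for the input "1abc2\npqr3stu8vwx\na1b2c3d4e5f\ntreb7uchet" (12 + 38 + 15 + 77).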