How bad is it if 4-byte/8-byte values are not properly aligned?

Suppose we have

data: &[u8]

and x: f32 is stored in data[1..5] and y: i32 is stored in data[6..10]

So now, my question is: by the time we get x: f32 and y: i32, they've already been put into registers and aligned, right? So why do we care that they were not properly aligned in data?

Where is the performance hit? Is it the conversion to x and to y?

If you want to read a doubleword (four bytes) from memory, it must be aligned if you want it to be fast. It doesn't matter what those bytes represent or what you do with them afterwards. Unaligned reads either require several instructions to implement, or, in the case of x86/x64, take longer to execute (although apparently on current microarchitectures the difference is rather small). Furthermore, with an unaligned read there's the risk of straddling a cache line or a page boundary, both of which are really bad for performance.
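
To make "aligned" and "straddling a cache line" concrete, here is a small sketch of the arithmetic involved. The helper names are made up for illustration, and the 64-byte line size used in the comments is an assumption (it is common on current x86 parts, but not universal).

```rust
use std::mem::align_of;

/// Hypothetical helper: true if `ptr` satisfies the alignment of `T`.
fn is_aligned_for<T>(ptr: *const u8) -> bool {
    (ptr as usize) % align_of::<T>() == 0
}

/// Hypothetical helper: true if an access of `size` bytes starting at
/// address `addr` touches two different lines of `line` bytes each.
/// Assumes `size > 0` and `line > 0` (e.g. line = 64, an assumption).
fn straddles_line(addr: usize, size: usize, line: usize) -> bool {
    addr / line != (addr + size - 1) / line
}
```

A 4-byte read at address 62 covers bytes 62..=65, so with 64-byte lines it touches two lines; an aligned 4-byte read can never do that.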

In general, if you obtain that f32 by first reading a [u8; 4] from the array and then converting it to a float (for example with f32::from_le_bytes), it's all good. If you obtain it by casting the pointer to an &f32 reference and then reading from it, that's undefined behavior whenever the pointer isn't suitably aligned for f32.
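
A minimal sketch of the safe route, using the offsets from the question. It assumes the values were stored little-endian (swap in from_be_bytes or from_ne_bytes if not); four_bytes and read_values are names invented for this example.

```rust
/// Hypothetical helper: copies 4 bytes starting at `start` into an array,
/// or returns None if the slice is too short.
fn four_bytes(data: &[u8], start: usize) -> Option<[u8; 4]> {
    let end = start.checked_add(4)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(data.get(start..end)?);
    Some(buf)
}

/// Reads x: f32 from data[1..5] and y: i32 from data[6..10].
/// Assumes little-endian storage. No alignment requirement, no unsafe.
fn read_values(data: &[u8]) -> Option<(f32, i32)> {
    let x = f32::from_le_bytes(four_bytes(data, 1)?);
    let y = i32::from_le_bytes(four_bytes(data, 6)?);
    Some((x, y))
}
```

The copy into a local [u8; 4] is what sidesteps the alignment question: the array has alignment 1, and the conversion works on a value rather than on a pointer into data.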

If you use ptr::read_unaligned then it's technically fine. If you transmute to &[f32] then it's UB and it invites LLVM to miscompile your code.
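
For reference, a sketch of what the ptr::read_unaligned version might look like. The function name is made up, and the value is interpreted in native byte order, which is an assumption about how data was produced.

```rust
use std::ptr;

/// Hypothetical helper: reads an f32 at byte `offset`, native endianness,
/// with no alignment requirement on the address.
fn read_f32_unaligned(data: &[u8], offset: usize) -> Option<f32> {
    let end = offset.checked_add(4)?;
    // Bounds check first: `bytes` is exactly 4 readable bytes or we bail out.
    let bytes = data.get(offset..end)?;
    // SAFETY: the pointer is valid for reading 4 bytes, read_unaligned does
    // not require alignment, and every bit pattern is a valid f32.
    Some(unsafe { ptr::read_unaligned(bytes.as_ptr() as *const f32) })
}
```

Note that the raw pointer is never turned into an &f32; creating that reference from a misaligned pointer is the step that is UB.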

From a perf perspective, it used to be pretty bad on x86, but the latest CPUs handle it with almost no performance penalty. However, there are architectures where unaligned access is illegal and makes programs crash.

To add to that last point, ptr::read_unaligned is of course still fine on those architectures. It just might compile to something like reading each byte separately and combining them with bit operations, as on Armv7 Android.
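
Written out by hand, "reading each byte separately and combining them" has roughly the following shape. This is only an illustration of the idea for a little-endian u32, not the actual code any particular compiler emits; load_u32_le is a made-up name.

```rust
/// Hypothetical helper: assembles a little-endian u32 from four
/// individually loaded bytes using shifts and ORs.
fn load_u32_le(b: [u8; 4]) -> u32 {
    (b[0] as u32)
        | ((b[1] as u32) << 8)
        | ((b[2] as u32) << 16)
        | ((b[3] as u32) << 24)
}
```

The result matches u32::from_le_bytes on the same bytes; the cost is four byte loads plus the bit operations instead of one word load.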