Odd code generation on x86_64 when converting u64 to Result<[u8; 8], ()>

I'm using Rust 1.13 and I have code similar to the following in a library:

pub fn u64_to_u8(number: u64) -> Result<[u8; 8], ()> {
    use std::mem::transmute;

    let bytes: [u8; 8] = unsafe { transmute(number) };
    Ok(bytes)
}

This results in the following disassembly:

0000000000000000 <_ZN9transtest9u64_to_u817hda109e31aa82d40fE>:
   0:	48 89 f0             	mov    rax,rsi
   3:	49 89 c0             	mov    r8,rax
   6:	49 89 c1             	mov    r9,rax
   9:	48 89 c1             	mov    rcx,rax
   c:	48 89 c2             	mov    rdx,rax
   f:	c6 07 00             	mov    BYTE PTR [rdi],0x0
  12:	88 47 01             	mov    BYTE PTR [rdi+0x1],al
  15:	88 67 02             	mov    BYTE PTR [rdi+0x2],ah
  18:	48 c1 e8 10          	shr    rax,0x10
  1c:	49 c1 e8 18          	shr    r8,0x18
  20:	49 c1 e9 20          	shr    r9,0x20
  24:	48 c1 ee 28          	shr    rsi,0x28
  28:	48 c1 e9 30          	shr    rcx,0x30
  2c:	48 c1 ea 38          	shr    rdx,0x38
  30:	88 47 03             	mov    BYTE PTR [rdi+0x3],al
  33:	44 88 47 04          	mov    BYTE PTR [rdi+0x4],r8b
  37:	44 88 4f 05          	mov    BYTE PTR [rdi+0x5],r9b
  3b:	40 88 77 06          	mov    BYTE PTR [rdi+0x6],sil
  3f:	88 4f 07             	mov    BYTE PTR [rdi+0x7],cl
  42:	88 57 08             	mov    BYTE PTR [rdi+0x8],dl
  45:	48 89 f8             	mov    rax,rdi
  48:	c3                   	ret

In contrast, the following code

pub fn u64_to_u8_2(number: u64) -> [u8; 8] {
    use std::mem::transmute;

    let bytes: [u8; 8] = unsafe { transmute(number) };
    bytes
}

results in the following:

0000000000000000 <_ZN9transtest11u64_to_u8_217hc4ea0612fd31159fE>:
   0:	48 89 f8             	mov    rax,rdi
   3:	c3                   	ret    

That is much closer to what I expected.
Why does the compiler generate multiple single byte writes instead of a single 8 byte write?

How are you compiling your code? Please make sure optimizations are enabled
with cargo build --release.

Sorry, I should have specified. My code was compiled with cargo build --release.

  • Asgeir Bjarni Ingvarsson:

> I'm using Rust 1.13 and I have code similar to the following in a library
>
> pub fn u64_to_u8(number: u64) -> Result<[u8; 8], ()> {
>     use std::mem::transmute;
>
>     let bytes: [u8; 8] = unsafe { transmute(number) };
>     Ok(bytes)
> }

One potential explanation is that std::mem::transmute disables
aliasing analysis at the LLVM level, and the Result return object is
represented as something that can be aliased, perhaps with a hidden
incoming pointer argument. That necessarily inhibits many
optimizations.

That sounds possible. However, if I compile the following, it also generates some strange code:

pub fn u64_to_u8_3() -> Result<[u8; 8], ()> {
	Ok([1u8, 2u8, 3u8, 4u8, 5u8, 6u8, 7u8, 8u8])
}

becomes

0000000000000000 <_ZN9transtest11u64_to_u8_317h408174c7c43c5777E>:
   0:	c6 07 00             	mov    BYTE PTR [rdi],0x0
   3:	c6 47 01 01          	mov    BYTE PTR [rdi+0x1],0x1
   7:	c7 47 02 02 03 04 05 	mov    DWORD PTR [rdi+0x2],0x5040302
   e:	66 c7 47 06 06 07    	mov    WORD PTR [rdi+0x6],0x706
  14:	c6 47 08 08          	mov    BYTE PTR [rdi+0x8],0x8
  18:	48 89 f8             	mov    rax,rdi
  1b:	c3                   	ret

This should be a bug report. It's returning the struct through the pointer for the return slot and copying the 9 bytes (no padding) into place. The code generated for the original example is very bad. The latest example looks pretty OK, doesn't it?

I would have expected something closer to two writes: 1 byte to identify the enum variant, 7 bytes of padding, and then 8 bytes of payload.

There is no padding because [u8; 8] is 1-aligned. The whole struct (enum Result) is 9 bytes.
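
The size and alignment claims can be checked directly with std::mem. This is a minimal sketch, assuming an x86_64 target and a reasonably recent compiler; the aliases BytesResult and WordResult and the helper layout_of are hypothetical names introduced here, and Result<u64, ()> is included only as a comparison that does get padding. Enum layout is not formally guaranteed by the language, so treat the numbers as what current compilers do in practice.

```rust
use std::mem::{align_of, size_of};

/// The return type from the original post: a 1-aligned payload.
type BytesResult = Result<[u8; 8], ()>;

/// Comparison type (an assumption for illustration, not from the thread):
/// an 8-aligned payload on x86_64.
type WordResult = Result<u64, ()>;

/// Returns (size, alignment) of `T` in bytes.
fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

fn main() {
    let (size, align) = layout_of::<BytesResult>();
    println!("Result<[u8; 8], ()>: size = {}, align = {}", size, align);

    let (size, align) = layout_of::<WordResult>();
    println!("Result<u64, ()>:     size = {}, align = {}", size, align);
}
```

With a u64 payload the discriminant is followed by 7 bytes of padding, which is the layout expected above; with [u8; 8] the payload can start at offset 1, which matches the stores to [rdi+0x1] through [rdi+0x8] in the disassembly.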

I can report it now if you want to.

Thanks, that would be great.

The issue is 38032. I tried to simplify the test case as far as possible, which is why it's using Option (it's the same 9-byte struct really).
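
For readers who want to look at the generated code themselves, a reduction along those lines might look like the sketch below. This is not the test case from the issue; wrap_bytes is a hypothetical function written here only to show that Option<[u8; 8]> has the same shape as Result<[u8; 8], ()> (one tag byte plus eight payload bytes).

```rust
use std::mem::size_of;

/// Hypothetical reduced case: wraps an 8-byte array in `Some`.
/// Compile with optimizations and inspect the disassembly of this
/// function to see how the 9-byte return value is written.
pub fn wrap_bytes(bytes: [u8; 8]) -> Option<[u8; 8]> {
    Some(bytes)
}

fn main() {
    // Both types carry a one-byte discriminant next to a 1-aligned payload.
    println!("Option<[u8; 8]>:     {} bytes", size_of::<Option<[u8; 8]>>());
    println!("Result<[u8; 8], ()>: {} bytes", size_of::<Result<[u8; 8], ()>>());

    let wrapped = wrap_bytes([1, 2, 3, 4, 5, 6, 7, 8]);
    println!("{:?}", wrapped);
}
```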
