Why does u64::from_ne_bytes consume input?

Here is the prototype of from_ne_bytes:

pub const fn from_ne_bytes(bytes: [u8; 8]) -> u64

I'm wondering why it was designed to consume its input instead of taking a reference to the bytes.

I feel that taking a reference to the input would be much easier to use, for example when the bytes are just one part of a bigger array or slice. There are ways to create owned bytes from a reference, but it's more complicated.

I don't know why it was designed like that, but I can understand it.
The following points come to mind:

  • in most cases, passing a reference would probably cost as much as or more than just passing the bytes.
  • if endianness conversion is required, it wouldn't be possible without a copy anyway (n/a for from_ne_bytes...)
  • this could easily be a no-op (if the original array is not used anymore)

Arrays of Copy types are Copy themselves, so the input is copied, not consumed. Taking a & wouldn't make a real difference here.
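As a minimal sketch of that point: since [u8; 8] is Copy, the caller's array is still usable after the call. The helper name parse_and_keep is made up for illustration and is not part of any library.

```rust
// Hypothetical helper (name invented for this example): parses the bytes
// and then returns the very same array, which only compiles because
// `[u8; 8]` is `Copy` and the call below did not move it.
pub fn parse_and_keep(bytes: [u8; 8]) -> (u64, [u8; 8]) {
    // `bytes` is copied into `from_ne_bytes`, not moved.
    let value = u64::from_ne_bytes(bytes);
    // `bytes` can still be used here.
    (value, bytes)
}
```

Calling parse_and_keep([1, 2, 3, 4, 5, 6, 7, 8]) hands back both the parsed u64 and the untouched array.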

When loading from a slice, there's no way to prove at compile time that it's exactly 8 bytes long, which is what that extra complication is checking. If this function took a slice, it would have to pay for that check, which some use cases can avoid.


The complication of using a slice is usually just an extra .try_into() call, plus error handling for if the slice isn't the right size.

The tradeoff the from_*e_bytes functions make is that by taking a fixed-sized array, they always succeed, and thus you don't need to do any error handling if you already have the byte array. And since besides error handling, .try_into() is a single function call, it's really not a big increase in complexity compared to if you had a single function from_ne_bytes(&[u8]) -> Result<u64, ...>.
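For the "part of a bigger slice" case from the question, the extra step might look roughly like the sketch below. The function name read_u64_ne and the choice of returning Option instead of a Result are assumptions made for this example only.

```rust
// Needed on the 2018 edition; already in the prelude since the 2021 edition.
use std::convert::TryInto;

// Hypothetical helper (not a std API): reads a native-endian u64 starting
// at `offset`, or returns None if fewer than 8 bytes are available there.
pub fn read_u64_ne(buf: &[u8], offset: usize) -> Option<u64> {
    // Guard against overflow when computing the end of the range.
    let end = offset.checked_add(8)?;
    // Bounds-checked sub-slice of exactly 8 bytes.
    let chunk = buf.get(offset..end)?;
    // &[u8] -> [u8; 8]; this is the length check the array-taking API avoids.
    let bytes: [u8; 8] = chunk.try_into().ok()?;
    Some(u64::from_ne_bytes(bytes))
}
```

If the caller already holds a [u8; 8], none of this is needed and the conversion cannot fail.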


On the other hand, it is paying for the array copy, whereas with a slice it would not. Which is more expensive: copying 8 bytes, or skipping the copy and doing the runtime check? I think it's difficult to really know without microbenchmarking.

At least on x86_64, that "array copy" should get inlined and turned into an unaligned read directly from the slice after bounds checking (checked on godbolt). So post-optimization the two should be identical. (Whether one or the other is easier to optimize, I can't say.)


The copy happens either way so the performance comparison is between copy vs. check + copy and it is logically impossible for the latter to be faster.

In fact, passing a slice reference requires copying a fat pointer, which is 16 bytes on a 64-bit system. I wouldn't say which would be more expensive in practice, but it's not free.
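The sizes being compared can be checked with std::mem::size_of. This is only a sketch; the function name argument_sizes is made up, and the concrete numbers in the comments assume a 64-bit target.

```rust
use std::mem::size_of;

// Hypothetical helper: returns the sizes in bytes of
// (the array itself, a thin reference to it, a fat slice reference).
pub fn argument_sizes() -> (usize, usize, usize) {
    (
        size_of::<[u8; 8]>(),  // always 8
        size_of::<&[u8; 8]>(), // one pointer: 8 on a 64-bit target
        size_of::<&[u8]>(),    // pointer + length: 16 on a 64-bit target
    )
}
```

So on a 64-bit target, a &[u8] argument is twice the size of the array it would replace, while a &[u8; 8] is the same size as the array.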

@phlopsi: No, when you borrow the slice, the borrow is copied, not the slice itself. Otherwise, if you have let's say a Vec<u8> of capacity 16 MiB, and you pass a &myvec[..] slice of it to some fn, you'd have to copy all 16 MiB which would defeat the entire purpose of indirection in that case, as you might as well clone the Vec<u8> itself then.

@Hyeonu This is indeed something I hadn't fully considered. In this specific case, copying the fat pointer might be as expensive as, or more expensive than, copying the 8 bytes, depending on the precise CPU architecture and word width. For example, if a CPU natively uses 64-bit words, perhaps they'd be equally expensive, assuming that both the fat pointer and the 8 bytes could be packed into 1 word (which may or may not be true).

There is no copy (see Compiler Explorer): it gets inlined, and passing the argument doesn't cost anything.

Rust bets on zero-cost abstractions, so often function interfaces are chosen by their desired semantics, not performance, because the performance difference is abstracted away anyway.

But even if this function wasn't inlined, and for some reason it had to be executed very literally like in a compiler without a modern optimizer, it would still be more expensive to use a pointer:

  • 64-bit pointers are 8 bytes themselves, so you still "copy" 8 bytes to the function.
  • To have a pointer you'd have to have the data in memory. If the bytes were already in a register, they'd have to be written to memory first!
  • Then the function would have to dereference the pointer, and copy the data out to the return register

So naively passing the data by reference would copy 2-3 times more data. But Rust isn't designed to work with naive non-optimizing compilers, so keep in mind that code doesn't literally do what a function's signature suggests. Copy types aren't always copied. Reference types don't always exist as a reference either.
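To compare the two calling conventions yourself in Compiler Explorer, a pair of wrappers like the following could serve as a starting point. Both names are hypothetical; only u64::from_ne_bytes is the real std function, and nothing here claims what a particular compiler version will emit.

```rust
// Hypothetical by-value wrapper, mirroring the real signature.
pub fn from_ne_bytes_val(bytes: [u8; 8]) -> u64 {
    u64::from_ne_bytes(bytes)
}

// Hypothetical by-reference variant (not part of std). The dereference
// copies the array out of the reference, which works because it is Copy.
pub fn from_ne_bytes_ref(bytes: &[u8; 8]) -> u64 {
    u64::from_ne_bytes(*bytes)
}
```

The second wrapper also shows that a by-reference interface is a one-line layer on top of the by-value one, so taking the array by value doesn't lock callers out of either style.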


But the function is copying the data in all cases except when it is inlined, because the function has to return a u64. It doesn't create that out of thin air. There is one other case I didn't think about, and that is when usize is 32 bits or smaller, i.e. the array is larger than the pointer.
