Reading Integers from a Slice Efficiently

kamulos · February 20, 2020, 10:06am

I am trying to convert a program from C code to Rust. A lot of the program consists of creating and reading buffers, which are mostly little endian integers.

I am a little bit sensitive regarding performance, so I did the following experiment: Compiler Explorer

The unsafe version obviously is efficient, but it really looks horrible in code and I want to avoid unsafe as much as possible. Why does the Rust compiler create 4 checks in the safe version and how can I avoid it?

The code in the compiler explorer:

pub fn read_u32(data: &[u8], pos: usize) -> u32 {
    return (data[pos] as u32) + ((data[pos + 1] as u32) << 8) + ((data[pos + 2] as u32) << 16) + ((data[pos + 3] as u32) << 24);
}

pub fn read_u32_unsafe(data: &[u8], pos: usize) -> u32 {
    if pos + 4 > data.len() {
        panic!();
    }
    unsafe {
        let dptr = data.as_ptr();
        return (*dptr.offset(pos as isize) as u32) + ((*dptr.offset(pos as isize + 1) as u32) << 8) + ((*dptr.offset(pos as isize + 2) as u32) << 16) + ((*dptr.offset(pos as isize + 3) as u32) << 24);
    }
}

Hyeonu · February 20, 2020, 10:34am

byteorder crate is the standard crate for it.

pub fn read_u32(data: &[u8], pos: usize) -> Option<u32> {
    use byteorder::{ReadBytesExt, LE};

    data.get(pos..)?.read_u32::<LE>().ok()
}

alice · February 20, 2020, 10:35am

This has only two checks:

pub fn read_u32(data: &[u8], pos: usize) -> u32 {
    let mut arr = [0; 4];
    arr.copy_from_slice(&data[pos..pos+4]);
    return u32::from_le_bytes(arr);
}

This has only one check:

pub fn read_u32(data: &[u8], pos: usize) -> u32 {
    if pos + 4 < pos {
        unsafe {
            std::hint::unreachable_unchecked()
        }
    }
    let mut arr = [0; 4];
    arr.copy_from_slice(&data[pos..pos+4]);
    return u32::from_le_bytes(arr);
}

Basically if pos + 4 overflows the first will notice.

Hyeonu · February 20, 2020, 12:24pm

Note that the latter function should be marked as unsafe, as it literally invokes UB if the pos is usize::max_value().

And also the totally safer version of read_u32 seems already faster than OP's read_u32_unsafe. I don't know why it generates the call instr though.

scottmcm · February 20, 2020, 8:09pm

That code is unsound, because it allows me to trigger UB from safe code. Don't do that.

The safe version is the right choice; both checks are fundamental.

That said, one can add preconditions that subsume some of the checks, which can sometimes help them optimize out when used. For example, you could do this:

pub fn read_u32(data: &[u8], pos: usize) -> u32 {
    assert!(pos <= isize::max_value() as usize, "an index that large is never valid");
    let mut arr = [0; 4];
    arr.copy_from_slice(&data[pos..pos+4]);
    return u32::from_le_bytes(arr);
}

That way there's still only two checks, because LLVM knows that adding 4 to something that passed the assert never overflows, so it removes the second slice_index_len_fail possibility. And it might be more obvious, once inlined, that the pos passed in is trivially smaller than the limit, letting it remove that check too.

DoumanAsh · February 21, 2020, 1:11pm

Simple unsafe version https://github.com/DoumanAsh/lazy-bytes-cast/blob/master/src/slice.rs#L21

scottmcm · February 21, 2020, 9:31pm

That code is also missing safety conditions. Specifically,

#[inline]
///Get reference to the value from slice
///
///This function is UB if `slice.len() < mem::size_of::<T>()`
pub unsafe fn as_type_unchecked<T: Sized>(slice: &[u8]) -> &T {
    &*(slice.as_ptr() as *const T)
}

That's not checking alignment, so should document that it's UB to call if mem::align_if::<T>() > 1 and the slice.as_ptr() is not aligned to that larger amount.

Demonstration that it performs UB when when I meet the slice-long-enough condition mentioned in the comment:

DoumanAsh · February 21, 2020, 10:17pm

It is a problem for structs but I don't think it is a big deal when it comes to integers though.

Phlopsi · February 21, 2020, 10:35pm

It is a problem for integers. Anything aside from u/i8 has an alignment greater than 1. The error message in @scottmcm's example even showcases this.

DoumanAsh · February 21, 2020, 10:54pm

I mean practically speaking.
Theoretically any UB is bad

TomP · February 22, 2020, 12:20am

Practically any UB is bad, because the mere existence of UB in your code gives the compiler backend, LLVM, free license to arbitrarily scramble your code in the quest for optimization, and to do so differently any time either your code, an included library, or the compiler itself has changed. If you don't want your code to work reliably now or in the future, go for UB; otherwise avoid it like the plague.

DoumanAsh · February 22, 2020, 6:45am

The problem with Rust is that it is bad at defining what is UB.
In particular I do know that strict aliasing rule violation might be a problem to Cang, but not in case of LLVM IR
Therefore in this particular case I doubtful of actual code being unsafe.

HadrienG · February 22, 2020, 8:45am

While it is true that some areas of Rust UB need clarification, this particular case is pretty clear-cut, as it is prominently featured at the top of the UB list of the Rust reference: Behavior considered undefined - The Rust Reference .

Also, note that Rust's aliasing rules are quite different from the strict aliasing rules of C. In particular, it is not and will never be UB to use a T* and U* pointing to the same memory region in Rust, which is a huge relief for bit hacking fans What Rust cares about (while C doesn't) is that you never have an &mut T and another reference live at the same time, and Stacked Borrows is a pretty well-advanced attempt at defining what that means exactly.

Lokathor · February 26, 2020, 4:42am

Hey all: the bytemuck crate is where you should go for safely casting data types around. It even lets you cast slices! Get yours today.

scottmcm · February 26, 2020, 8:30pm

No, this is very definitely UB. If you want to read from an unaligned pointer, use ptr::read_unaligned instead.

The whole danger of UB is that is sometimes appears to work; that's the whole reason that unsafe giving you a subset of code to audit is valuable. (I agree that this is one of the things that people often think is fine. Historically I've seen them disabused of this notion most often when they need to compile for ARM as well, though I don't know whether that still hardware faults the way it once did. Just do it right; it's not any slower.)

johannesvollmer · February 28, 2020, 11:09am

Which leaves me wondering why this is not the default behavior. Anyone know why this even is a thing in 2020? It's not like the majority of users are programming hardware from 1987, right?

scottmcm · February 28, 2020, 7:56pm

Because it's platform dependent. The implied part of that statement was if the UB appears to work. LLVM knows that if you call ptr::read_unaligned to read an i32 when compiling for x86, it can just emit a normal load instruction, because x86 allows that. But on other platforms, or for different types, it can't, and needs to emit different code.

I'm not willing to have every pointer read -- including, say, array indexing -- use the slower instruction just because someone might want to read unaligned sometimes.

True, but irrelevant.

Alignment is still required by fairly-modern ARM hardware, because it's way simpler in silicon, and thus can be done for lower power. These days my stuff is all x64, but as of about 2015 I had firsthand experience of code that needed fixing for this problem to run on new ARM processors.

TomP · February 28, 2020, 8:43pm

The RISC-V ISA specifications permit subsets that omit support for misaligned memory accesses, so this issue may not go away.

johannesvollmer · March 1, 2020, 4:22pm

yeah but why are the rust functions not called read and read_aligned? The read function would of course emit a simple read on systems that do not require alignment

alice · March 1, 2020, 4:23pm

Why would you use the longer name for the thing that is much more common?

Topic		Replies	Views
Confused about why this unsafe code works help	8	704	April 22, 2023
From bytes to u32 in C vs Rust	11	6108	October 23, 2021
Can I make this high speed output safe? help	6	550	November 25, 2023
Any safety/provenance difference between ways of casting byteslice to integer?	5	464	July 22, 2022
Any_to_u8 O(1) time?	26	1632	January 12, 2023

Reading Integers from a Slice Efficiently

Related topics