From bytes to u32 in C vs Rust

Hi folks,

I am a bit stuck on something. I have to read a buffer filled by an FFI C function I need to call. I translated (or tried to translate) a function from C to Rust that converts the bytes into a u32. The following C code prints -1886197071:

#include <stdio.h>
#include <stdint.h>

uint32_t slice_to_num(const void* memPtr) { 
    return *(const uint32_t*) memPtr;
}

int main()
{   
    uint8_t a[4] = {177,234,146,143};
    uint32_t n = slice_to_num(a);
    printf("%d\n",n);

    return 0;
}

However, the Rust counterpart prints 2408770225:

use std::convert::TryInto;

fn slice_to_num(buff: &[u8]) -> u32 {
    u32::from_ne_bytes(
        buff.try_into().unwrap())
}

fn main() {
    let a = [177,234,146,143];
    let n = slice_to_num(&a);
    println!("{}",n);
}

I do have a pointer to a buffer in the C code, so a is indeed a uint8_t * and these values are the actual contents of 4 positions in the buffer. I am not an expert on this kind of stuff, so am I missing something? Did I translate this correctly?

Thanks in advance!

This means that C printed your number as a signed 32-bit value (i32), not as u32: the bits are the same, only the interpretation differs. If you request a signed value, Rust will output the same thing.
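
To check this on the Rust side, here is a minimal sketch using the byte values from the question. The helper name `as_signed` is made up for illustration, and `from_le_bytes` is used (instead of `from_ne_bytes`) only so the numbers are the same on any host:

```rust
// Hypothetical helper, not from the original code.
fn as_signed(n: u32) -> i32 {
    // `as` between integers of the same width keeps the bit pattern
    // and only changes how it is interpreted.
    n as i32
}

fn main() {
    let n = u32::from_le_bytes([177, 234, 146, 143]);
    println!("{}", n);            // 2408770225
    println!("{}", as_signed(n)); // -1886197071
}
```

Both lines print the same 32 bits (0x8F92EAB1); only the signedness of the formatting differs, which is what `%d` versus `%u` does in C.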

Use printf("%u"... in your C code, not %d

Damn, that was total cringe of me. Thanks folks!

Note that your C code is UB: you're reading a byte array through a 32-bit-integer pointer, which isn't guaranteed to be aligned, and is also a strict aliasing violation.

Hooray for the Rust version being safe :slightly_smiling_face:

You mean BE and LE? The original code actually does some checks; I simply selected the LE path to keep things shorter. But I totally agree with you here, since someone could really make a mistake with those casts.

No, this is about data alignment. The address stored in a pointer to uint32_t must be a multiple of 4 (at least on most common architectures), while a pointer to uint8_t has no such requirement. Dereferencing a misaligned pointer is undefined behavior in both C and unsafe Rust.

Additionally, your C code has undefined behavior because it violates the strict aliasing rule. (Rust has no equivalent rule.)
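
If you only have a raw byte pointer from the FFI side, one way to sidestep the alignment requirement in Rust is `std::ptr::read_unaligned`. The following is just a sketch: the helper name `read_le_u32` is hypothetical, and it assumes the caller guarantees that 4 readable bytes exist at the pointer.

```rust
use std::ptr;

/// Hypothetical helper: reads a little-endian u32 from a raw byte pointer.
///
/// # Safety
/// `p` must be valid for reads of 4 bytes. No alignment is required.
unsafe fn read_le_u32(p: *const u8) -> u32 {
    // read_unaligned copies the 4 bytes out without assuming that the
    // address is a multiple of 4.
    let raw = unsafe { ptr::read_unaligned(p as *const u32) };
    // The bytes were read in native order; from_le converts from
    // little-endian (a no-op on little-endian hosts).
    u32::from_le(raw)
}

fn main() {
    // Five bytes, so the read can start at an odd offset.
    let buf: [u8; 5] = [0, 177, 234, 146, 143];
    let n = unsafe { read_le_u32(buf.as_ptr().add(1)) };
    println!("{}", n); // 2408770225
}
```

When the data is already available as a `&[u8]`, the safe `u32::from_le_bytes` route from the question avoids the `unsafe` block entirely.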

Oh, I think I understand it now. The code I'm converting to Rust is zstd_seekable. The uint8_t pointer arithmetic is giving me some headaches. At some point they use memcpy to copy a u32 value to a buffer of uint8_t, would it be better to use slices in Rust? I'm trying to use ptr::copy_nonoverlapping to copy the value, but I wonder if there's a much better approach.

Don't blindly copy integers as bytes without a defined endianness if the bytes may end up on disk or on the wire. The C code you've shared explicitly uses LE to write integers to bytes, and you should follow it.

I know some people use their old Power Mac as a home server, and there are companies running SPARC machines in their server rooms. With the upcoming GCC codegen backend we'll get a lot more BE architecture support.
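
To make the difference concrete, here is a small sketch with the value from the question. The helper name `to_disk_bytes` is invented for this example; the point is only that `to_le_bytes` and `to_be_bytes` give a fixed layout, while `to_ne_bytes` depends on the host:

```rust
// Hypothetical helper: always encodes as little-endian, whatever the host is.
fn to_disk_bytes(val: u32) -> [u8; 4] {
    val.to_le_bytes()
}

fn main() {
    let val: u32 = 0x8F92EAB1; // 2408770225

    // Fixed byte orders: identical on every machine.
    assert_eq!(to_disk_bytes(val), [0xB1, 0xEA, 0x92, 0x8F]);
    assert_eq!(val.to_be_bytes(), [0x8F, 0x92, 0xEA, 0xB1]);

    // Native order: matches LE or BE depending on the target.
    let native = val.to_ne_bytes();
    if cfg!(target_endian = "little") {
        assert_eq!(native, val.to_le_bytes());
    } else {
        assert_eq!(native, val.to_be_bytes());
    }
}
```

A file written with `to_ne_bytes` on a BE machine would decode to a different number on an LE machine, which is the mistake to avoid.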

They actually define it like this:

MEM_STATIC void MEM_writeLE32(void* memPtr, U32 val32)
{
    if (MEM_isLittleEndian())
        MEM_write32(memPtr, val32);
    else
        MEM_write32(memPtr, MEM_swap32(val32));
}

You can do this in safe Rust like this:

fn write_le_32(buf: &mut [u8], val: u32) {
    buf[..4].copy_from_slice(&val.to_le_bytes());
}
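
For completeness, a possible read-side counterpart, as a sketch only: `read_le_32` is a made-up name (not something from zstd), and it returns an `Option` so that a short buffer yields `None` instead of a panic.

```rust
use std::convert::TryInto;

// Same function as above, repeated so the example is self-contained.
fn write_le_32(buf: &mut [u8], val: u32) {
    buf[..4].copy_from_slice(&val.to_le_bytes());
}

// Hypothetical counterpart to write_le_32.
fn read_le_32(buf: &[u8]) -> Option<u32> {
    // get(..4) returns None when fewer than 4 bytes are available.
    let bytes: [u8; 4] = buf.get(..4)?.try_into().ok()?;
    Some(u32::from_le_bytes(bytes))
}

fn main() {
    let mut buf = [0u8; 8];
    // Sub-slicing replaces the C pointer arithmetic (memPtr + 4).
    write_le_32(&mut buf[4..], 2408770225);
    assert_eq!(buf, [0, 0, 0, 0, 177, 234, 146, 143]);
    assert_eq!(read_le_32(&buf[4..]), Some(2408770225));
    assert_eq!(read_le_32(&buf[6..]), None); // only 2 bytes left
}
```

Working on sub-slices like `&mut buf[4..]` keeps the bounds checks, so an off-by-one in the offset becomes a panic or a `None` rather than a silent out-of-bounds write.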