Transmuting a u32 array to a u8 array


#1

Hey, I’m trying to port some C code where almost all function parameters are void pointers. For example, the code builds a struct that has some data in it and then sends it through USB as a byte stream. I have solved most of these cases by generating an actual byte array and copying values into it using ByteOrder, since I’d like to keep using safe code as far as possible.

What is troubling me is that at one point it uses a [u32; 256] array and passes the array to a function accepting void * data. The function I use to represent that takes &mut [u8], which works well with most of my data (and uses the same data type as the libusb calls).

The question is, how do I pass a &mut [u32] to a function that accepts &mut [u8]? Is there a good (and safe, if possible) way to do it? I cannot cast it, since it gives me a non-scalar cast. All the u32 words have been checked for endianness (I use .to_le() with all of them), so it shouldn’t be unsafe, I think, but I cannot extract the bytes from a u32 :confused:


#2

In the Rust extern declaration, you can just change the type from &mut [u8] to &mut [u32].

Otherwise, see https://github.com/briansmith/ring/blob/abb3fdfc08562f3f02e95fb551604a871fd4195e/src/polyfill.rs#L93-L110


#3

Hey, thanks Brian :slight_smile:

I think I didn’t explain myself properly. I’m actually porting all the code, so I have no C dependencies. I need to use a function from libusb that receives a &mut [u8] but I’m creating a u32 array that I would like to pass to that function. Maybe I should simply hand-write the actual bytes, but I would prefer to avoid it if possible (I already have to write 10 u32 constants by hand, so it would be 40 bytes, which could increase my typo probabilities).

I see that you use unsafe code for the change, is that the only way?
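For comparison, here is a minimal sketch of a copying approach that stays entirely in safe code. It serializes the words into a fresh byte buffer instead of reinterpreting the u32 array in place, so it does not answer the in-place question. The helper name words_to_le_bytes is hypothetical, and the sketch assumes the words hold native values that have not already been passed through .to_le(), since to_le_bytes performs that conversion itself.

```rust
/// Hypothetical helper (not from the thread): copy each word into a new
/// buffer as four little-endian bytes, using only safe code.
pub fn words_to_le_bytes(words: &[u32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(words.len() * 4);
    for word in words {
        // `to_le_bytes` returns [u8; 4], least significant byte first.
        bytes.extend_from_slice(&word.to_le_bytes());
    }
    bytes
}
```

The resulting Vec<u8> can be borrowed as &mut [u8] for the USB call. The cost is one extra copy of the data, which is small for a 256-word array.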


#4

This is the only way I know how to do it.

Note that I’m not sure that the language guarantees this will work. C’s type-based aliasing rules have an exception for casting to an array of (unsigned) char, but Rust’s rules are unclear (to me).


#5

I might be wrong, but as I understand it Rust’s aliasing rules depend entirely on lifetimes.


#6

On these forums, I’ve been told all of:

  • it hasn’t been decided yet
  • it is only based on lifetimes
  • it hasn’t been decided yet, but it would break too much code to make the rules stricter than what the compiler allows now (which is basically lifetimes).

Also, I’m not sure my macro is 100% safe even if the rules are based on lifetimes. It isn’t clear to me how to consume the &mut [u32] at the same time I create the &mut [u8]. It seems like there is a point in time where both mutable references alias the same memory, which I guess is undefined?


#7

I would hope that it doesn’t actually matter unless they’re dereferenced. You’re in good company, at least, because that’s similar to how slice::split_at_mut works too.

If necessary, I suppose you could drop the old slice after grabbing its pointer and length, and only then create the new slice.
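A rough sketch of that ordering, assuming the same u32_as_u8_mut name used elsewhere in this thread: the pointer and length are read first, and the original slice is never touched again afterwards. A reference has no destructor, so "dropping" here only means "no longer using it"; whether that matters for the aliasing rules is exactly the open question.

```rust
use core::mem::size_of;
use core::slice;

pub fn u32_as_u8_mut(src: &mut [u32]) -> &mut [u8] {
    // Take everything needed from `src` up front...
    let len = src.len() * size_of::<u32>();
    let ptr = src.as_mut_ptr() as *mut u8;
    // ...and do not use `src` below this line.

    // u8 has alignment 1 and every bit pattern is a valid u8, so the
    // pointer and length describe the same memory viewed as bytes.
    // The returned lifetime is tied to `src` through lifetime elision.
    unsafe { slice::from_raw_parts_mut(ptr, len) }
}
```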


#8

I’m not sure which macro you’re referring to. The u32_as_u8_mut function seems correct, as it fixes the lifetimes in the parameter and return type.


#9

It isn’t clear to me how to consume the &mut [u32] at the same time I create the &mut [u8]. It seems like there is a point in time where both mutable references alias the same memory, which I guess is undefined?

The function doesn’t consume the &mut [u32] but it does “re-borrow” from it, which causes it to not be live while the re-borrow is live:

    let mut x = [1, 2, 3];
    let y = &mut x[..];
    {
        let z = u32_as_u8_mut(y);
        // y[0] = 5; // error: cannot assign to `y[..]` because it is borrowed
    }
    y[0] = 5; // `y` is live again

(playground)
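For anyone who wants to compile the snippet above on its own, here is a self-contained sketch. It repeats the u32_as_u8_mut function discussed in this thread and wraps the example in a hypothetical reborrow_demo function; the writes through z and y are added only to make the effect observable.

```rust
pub fn u32_as_u8_mut<'a>(src: &'a mut [u32]) -> &'a mut [u8] {
    unsafe { core::slice::from_raw_parts_mut(src.as_mut_ptr() as *mut u8, src.len() * 4) }
}

// Hypothetical wrapper around the example, returning the final array.
pub fn reborrow_demo() -> [u32; 3] {
    let mut x = [1u32, 2, 3];
    let y = &mut x[..];
    {
        // Passing `y` re-borrows it; `y` is unusable while `z` is live.
        let z = u32_as_u8_mut(y);
        // Zero the four bytes of the first word (independent of endianness).
        z[..4].fill(0);
    }
    // The re-borrow has ended, so `y` can be used again.
    y[1] = 5;
    x
}
```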


#10

I think @briansmith’s point is that there’s a moment within u32_as_u8_mut where both the src parameter and the from_raw_parts_mut return value exist in the same scope. So these are two &mut[] aliasing the same memory. You would be permitted to write to either one, or both, though that would surely have undefined behavior.


#11

Yes, that’s right.

Also, I am aware that split_at_mut() works similarly, but OTOH the standard library isn’t “real” Rust code; it’s written in a similar but different language. This is similar to how the C standard library (usually) isn’t/can’t be written in standard C. This is why I’d prefer to just have this kind of “downcast to slice of a smaller integer type” function in the standard library.

Regarding my reference to a “macro,” I was just confused because I have some macros that do similar things. I was indeed referring to the function referenced above.

Thanks for explaining the reborrowing. When I wrote the function I did wonder (and still wonder) if that’s the best way to specify the lifetimes, or if there’s a better way.


#12

For instance, this would be misbehaving: (playground)

    pub fn u32_as_u8_mut<'a>(src: &'a mut [u32]) -> &'a mut [u8] {
        let dst = unsafe {
            core::slice::from_raw_parts_mut(src.as_mut_ptr() as *mut u8, src.len() * 4)
        };
        // Both `src` and `dst` are usable here and point at the same memory.
        src[0] = 1;
        dst[0] = 2;
        dst
    }

But if we don’t actually write to either, just return the new slice immediately, are the momentarily-aliased slices ok? In practice, they must be, because split_at_mut works, but what is actually guaranteed?


#13

I’m curious what you mean – does rustc treat the standard library differently somehow?
(besides the obvious RUSTC_BOOTSTRAP access to unstable features)


#14

There’s no reason to think it treats it, or will always treat it, the same. More to the point, just because the standard library does something doesn’t mean non-standard-library code can get away with doing the same thing.


#15

Well, not really, the unsafe code guidelines team is figuring out exactly what it means to be unsafe and I’m pretty sure this is intended for generic use outside of std.


#16

Wow, I see that there is some controversy with it. I’ll probably use your approach, @briansmith, or use a u8 array with the Byteorder crate to insert some of the values.


#17

Ah, got it. You could avoid this by forcing src to move, by writing {src}.as_mut_ptr(). This will consume src so it can’t be used after this expression (example).


#18

Thanks for posting that!

I understand, based on the compiler’s messages, that it thinks {src}.as_mut_ptr() and src.as_mut_ptr() are different. But it isn’t clear to me why they are different. In particular, does anybody know which part of the Rust Reference explains why preventing the re-borrow is correct here?

Also, it isn’t clear to me that the compiler not letting us re-borrow src is the same as consuming src, especially insofar as the language and/or rustc’s aliasing rules are concerned.


#19

Honestly, it’s not totally clear to me either, and you might not want to rely on this for critical safety. Here’s a version whose meaning is clearer and which I’m confident will always consume src:

    { let tmp = src; tmp }.as_mut_ptr()

Here you can clearly see that src is moved into a temporary binding that doesn’t outlive the surrounding statement. The more concise version is apparently equivalent to this, but as I said I’m not confident about the details. In particular, I don’t know why the {src} trick only works using method call syntax; it does not force a move in expressions like foo({src}). On the other hand, the explicit version foo({let tmp = src; tmp}) does still consume src.
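Put together with the function from earlier in the thread, the explicit move might look like the sketch below. This is only an illustration of the trick: the length has to be read before the move, and nothing here settles whether the move is required by the language's aliasing rules.

```rust
pub fn u32_as_u8_mut(src: &mut [u32]) -> &mut [u8] {
    // Read the length first, because `src` is unusable after the move.
    let len = src.len() * 4;

    // Move `src` into a temporary binding that ends with this statement.
    let ptr = { let tmp = src; tmp }.as_mut_ptr() as *mut u8;

    // Uncommenting the next line should fail to compile, since `src`
    // has been moved:
    // let _ = src.len();

    unsafe { core::slice::from_raw_parts_mut(ptr, len) }
}
```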


#20

@mbrubeck Thanks for the explanation. Like you said, I think it isn’t a great idea to rely on this behavior. In particular, non-lexical borrow scopes are likely to change things. The fact that things already seem to be inconsistent in the way you point out is also evidence that it probably isn’t something to rely on.

In the end, it’s probably best to make an RFC for putting this functionality, generalized, into libcore. Then whether the implementation is guaranteed to work or not is less of an issue since it would just be an implementation detail of libcore. This is what I’m intending to pursue soon.