Usize as &[u8], Vec<f32> as &[u8]

I'm looking to use the write function from https://doc.rust-lang.org/std/io/trait.Write.html

This only takes &[u8] as an arg.

What is the simplest way to 'view' a usize / Vec<f32> as a &[u8]?

You can use usize::to_ne_bytes to get the native-endian bytes of a usize. (There are also variants that convert to big- or little-endian.)
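As a minimal sketch of how that fits together with Write (the helper name write_usize is made up for illustration), something like this should work. It calls write_all rather than write, since a single write call is allowed to accept only part of the buffer:

```rust
use std::io::{self, Write};

/// Hypothetical helper: writes `n` to `out` as its native-endian bytes.
fn write_usize<W: Write>(out: &mut W, n: usize) -> io::Result<()> {
    // `to_ne_bytes` returns a `[u8; size_of::<usize>()]`, which borrows as `&[u8]`.
    out.write_all(&n.to_ne_bytes())
}
```

Since the width of usize differs between platforms, for data that ends up on disk it is probably safer to cast to a fixed-width type first, e.g. (n as u64).to_le_bytes().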

For the Vec<f32> you can use a function like this:

fn as_bytes(v: &[f32]) -> &[u8] {
    unsafe {
        std::slice::from_raw_parts(
            v.as_ptr() as *const u8,
            v.len() * std::mem::size_of::<f32>())
    }
}
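
For completeness, here is a sketch of passing that byte view to a writer. The write_f32s helper is a hypothetical name, and the bytes come out in native-endian order, so the result is only portable between machines with the same endianness:

```rust
use std::io::{self, Write};

fn as_bytes(v: &[f32]) -> &[u8] {
    // Sound because u8 has alignment 1 and every byte of an f32 is initialized.
    unsafe {
        std::slice::from_raw_parts(
            v.as_ptr() as *const u8,
            v.len() * std::mem::size_of::<f32>(),
        )
    }
}

/// Hypothetical helper: writes the raw native-endian bytes of `v` to `out`.
fn write_f32s<W: Write>(out: &mut W, v: &[f32]) -> io::Result<()> {
    out.write_all(as_bytes(v))
}
```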

If I may ask for one more favor -- how would you convert a &[u8] or Vec<u8> of len n * std::mem::size_of::<f32>() to a Vec<f32> ?

If I could suffer the overhead of a copy, then I would probably use byteorder, which exposes safe APIs to do your conversion in both directions. See ByteOrder::read_f32_into for &[u8] -> &[f32] and ByteOrder::write_f32_into for &[f32] -> &[u8]. This will also handle endianness for you, if that's a concern. (If it's not, you can use native endian.)
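
If you'd rather not pull in a crate, the same copy-based idea can be sketched with std alone. To be clear, this is not the byteorder API; the function names below are made up, and little-endian is an arbitrary choice for the on-disk format:

```rust
const F32_SIZE: usize = std::mem::size_of::<f32>();

/// Hypothetical helper: decodes little-endian bytes into a new Vec<f32>.
/// Returns None if the length is not a multiple of 4.
fn f32s_from_le_bytes(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % F32_SIZE != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(F32_SIZE)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Hypothetical helper: encodes f32 values as little-endian bytes.
fn f32s_to_le_bytes(values: &[f32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * F32_SIZE);
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}
```

No unsafe and no alignment requirements here, at the cost of one copy.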

If you need a zero cost way of going from a &[u8] to a &[f32], then it's pretty much the same as what @mbrubeck provided, but in reverse. However, you also need to account for alignment since &[f32] has a higher alignment requirement than &[u8]. So something like this:

fn as_bytes(v: &[u8]) -> &[f32] {
    assert_eq!(v.len() % 4, 0);
    assert_eq!(v.as_ptr() as usize % std::mem::align_of::<f32>(), 0);
    unsafe {
        std::slice::from_raw_parts(
            v.as_ptr() as *const f32,
            v.len() / 4,
        )
    }
}

Now if you need to do Vec<u8> <-> Vec<f32>, that is trickier. For example, if you start with a Vec<u8>, convert it to a Vec<f32> (assuming length and alignment are correct, as above), and then allow that Vec<f32> to be deallocated, you might wind up with UB, because you'd be deallocating an allocation that was made with a different alignment. The same holds in the reverse direction. It is much safer to do Vec<u8> -> &[f32] or Vec<f32> -> &[u8].

ByteOrder is really nice. The data is coming from / going to disk. IO will probably dwarf the in memory copy.

You may also want to look at [T]::align_to. It has the following signature:

pub unsafe fn align_to<U>(&self) -> (&[T], &[U], &[T]);

It takes your data of T and partitions it into "pre-U", "valid U", and "post-U":

Index:  0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
Data:   ?? ?? 24 03 FF EE AA 11 22 33 44 55 66 77 ?? ??
Result:       ├──── ├────────────────────── ├────
Type:         &[u8]         &[f32]          &[u8]

So this code:

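// NOTE: the assertions below hold only if `my_data` happens to start on a
// 4-byte boundary (as in the diagram above); Rust does not guarantee that
// for a byte array.
let my_data: &[u8] = &[
    0x00, 0x00, // padding bytes, shown as ?? above
    0x24, 0x03, 0xFF, 0xEE, 0xAA, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
    0x00, 0x00, // padding bytes, shown as ?? above
];
let (l, m, r) = unsafe { my_data[2..=0xD].align_to::<f32>() };
assert_eq!(l, [0x24, 0x03]);
assert_eq!(m, [
    f32::from_ne_bytes([0xFF, 0xEE, 0xAA, 0x11]),
    f32::from_ne_bytes([0x22, 0x33, 0x44, 0x55]),
]);
assert_eq!(r, [0x66, 0x77]);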

So the middle is the longest chunk where start = multiple of alignment, end+1 = multiple of alignment, and the front/end are the remaining parts?

Yes, it splits the data at the alignment boundaries of the other type U. l cannot be read as an f32 because it does not start on a four-byte boundary (I happen to know that f32 is four-byte aligned here; in production code, don't hard-code that knowledge and always use std::mem::align_of::<T>()). m is properly aligned and can be read as f32, and finally r is what's left over, since it's too short to be another f32.
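
One way to put align_to to work, sketched under a few assumptions: borrow the bytes as &[f32] when they happen to be aligned, and fall back to a copy when they are not. The function name and the use of Cow are my own choices for illustration. The fallback also covers the fact that the align_to docs do not promise the middle slice is as long as possible.

```rust
use std::borrow::Cow;

const F32_SIZE: usize = std::mem::size_of::<f32>();

/// Hypothetical helper: views native-endian bytes as f32 values,
/// borrowing when the input is suitably aligned and copying otherwise.
/// Returns None if the length is not a multiple of 4.
fn f32s_from_ne_bytes(bytes: &[u8]) -> Option<Cow<'_, [f32]>> {
    if bytes.len() % F32_SIZE != 0 {
        return None;
    }
    // Sound for u8 -> f32: every bit pattern is a valid f32.
    let (prefix, middle, suffix) = unsafe { bytes.align_to::<f32>() };
    if prefix.is_empty() && suffix.is_empty() {
        // The whole input was aligned: zero-copy view.
        Some(Cow::Borrowed(middle))
    } else {
        // Misaligned input: decode into a fresh Vec instead.
        let copied: Vec<f32> = bytes
            .chunks_exact(F32_SIZE)
            .map(|c| f32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
            .collect();
        Some(Cow::Owned(copied))
    }
}
```

Either way the caller gets something that derefs to &[f32], and only pays for the copy when the alignment doesn't work out.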

Great answer. I assume if the required assertions fail, the conversion won’t proceed.

A. For the pointer alignment of f32 with the existing u8, in what scenarios might the larger pointer not “decide” (if you will) to start exactly where the smaller u8 starts?

B. To avoid the scenario where the u8 memory block length is not a multiple of 4, is there a way to write the u8 block in such a way that it is (i.e., anticipate the need)?

C. On a 32- or 64-bit platform, would either of the assertions ever fail? (Something we can't assume for embedded chips, but otherwise yes.)

Thanks for elucidating if you can.

- E

Yes. If either assertion fails, the function panics and no conversion takes place.

On A: I'm not quite sure I understand your question here. If slice is a &[u8] and is aligned to a 4 byte boundary and has a length that is a multiple of 4, then it can be converted to a &[f32] without cost. But, for example, &slice[1..] could not, since it is not correctly aligned (nor does it have a length that is a multiple of 4).

On B: Usually you allocate it that way. Ideally, you would just create the allocation initially as a Vec<f32>. You could also do a raw memory allocation and specify the desired alignment. But yeah, otherwise, it's pretty situation specific. There is no general approach that works in all contexts.
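
To illustrate the "allocate it as a Vec<f32> in the first place" idea, here is a rough sketch that reads bytes straight into an f32 buffer. The names as_bytes_mut and read_f32s are hypothetical, and it assumes the data was written in native-endian order:

```rust
use std::io::{self, Read};

/// Hypothetical helper: mutable byte view over an f32 slice.
fn as_bytes_mut(v: &mut [f32]) -> &mut [u8] {
    let len = v.len() * std::mem::size_of::<f32>();
    // Sound: u8 has alignment 1, and any bytes written form valid f32 values.
    unsafe { std::slice::from_raw_parts_mut(v.as_mut_ptr() as *mut u8, len) }
}

/// Hypothetical helper: reads `count` native-endian f32 values from `input`.
fn read_f32s<R: Read>(input: &mut R, count: usize) -> io::Result<Vec<f32>> {
    // The Vec<f32> owns the allocation, so its alignment is right from the start.
    let mut values = vec![0.0f32; count];
    input.read_exact(as_bytes_mut(&mut values))?;
    Ok(values)
}
```

Because the allocation is a Vec<f32> from beginning to end, neither the alignment question nor the deallocation problem comes up.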

Oh of course. As I mentioned above, if slice passed the assertions then &slice[1..] would not.

Your answers were helpful.

Only in regard to the pointer alignment (not length), will it always be true that &slice[0..] (type u8) aligns with the converted f32 pointer? I think the answer is a resounding yes, but wanted to confirm.
...If so, then we only need to make sure the starting index of the slice is zero or some multiple of 4.

No, definitely not. I mean, trivially not:

let slice: &[u8] = &[1, 2, 3, 4];
assert_eq!(slice.as_ptr() as usize % std::mem::align_of::<f32>(), 0);

let xs = &slice[1..];
assert_eq!(xs.as_ptr() as usize % std::mem::align_of::<f32>(), 0);

In the above example, if the first assert passes, then you're guaranteed that the second assert will fail. Moreover, there is no guarantee that the first assert will pass in the first place. If you stack-allocate a byte array, then the only guarantee you have is that its alignment will be 1 byte.

It is easy to demonstrate (although the behavior exhibited by this program is itself not guaranteed at all by the language, AFAIK): https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=3b148c1f9469f88c1402a06d0126f3fd
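
Separately from the playground link, here is a small sketch of my own (the function name is made up) that lists which starting offsets of a byte buffer are suitably aligned for f32. Which offsets come out depends on where the buffer happens to live, but they are always spaced align_of::<f32>() apart, so index 0 is not special:

```rust
/// Hypothetical helper: returns every index `i` for which `&buf[i..]`
/// starts at an address suitably aligned for f32.
fn aligned_offsets(buf: &[u8]) -> Vec<usize> {
    let align = std::mem::align_of::<f32>();
    (0..buf.len())
        .filter(|&i| buf[i..].as_ptr() as usize % align == 0)
        .collect()
}
```

On a typical platform, a 16-byte buffer yields four offsets, and the first one can be 0, 1, 2 or 3 depending on the buffer's address.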

Thank you! I made a mistake in my question, I was referring to whether the start of the memory block would be aligned as long as we start with index of 0, or a multiple of 4 thereof. Does my statement that it is ok to assume alignment of the start of the memory hold with this clarification?

No. My previous post's playground link should clarify that this is not the case?

Otherwise, I would recommend sharing a code example expressing your question. I think that will be clearer.

Thank you. I will create an example to show you what I mean later today/evening. I appreciate your patience all-round!

If by "start of the memory block," you mean the start of a heap allocation, then Rust doesn't make any guarantees but specific allocators do. For example, malloc always returns an allocation that is "suitably aligned for any built-in type."

If you know for certain which allocator(s) your code will use, then you can make some assumptions about the alignment of any heap allocation.
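
If you'd rather not rely on a particular allocator's behavior, you can request the alignment explicitly through std::alloc. A rough sketch follows; the AlignedBuf type is hypothetical, and it skips things a real implementation would want, such as zero-length buffers:

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};

/// Hypothetical byte buffer whose start address has a caller-chosen alignment.
struct AlignedBuf {
    ptr: *mut u8,
    layout: Layout,
}

impl AlignedBuf {
    fn new(len: usize, align: usize) -> Self {
        assert!(len > 0, "this sketch does not handle zero-sized buffers");
        let layout = Layout::from_size_align(len, align).expect("invalid size or alignment");
        // Zeroed so that every byte is initialized before it is read.
        let ptr = unsafe { alloc_zeroed(layout) };
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        AlignedBuf { ptr, layout }
    }

    fn as_slice(&self) -> &[u8] {
        unsafe { std::slice::from_raw_parts(self.ptr, self.layout.size()) }
    }

    fn as_mut_slice(&mut self) -> &mut [u8] {
        unsafe { std::slice::from_raw_parts_mut(self.ptr, self.layout.size()) }
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        // Deallocate with the same layout that was used to allocate.
        unsafe { dealloc(self.ptr, self.layout) }
    }
}
```

Keeping the Layout around and reusing it in drop is what avoids the mismatched-alignment deallocation problem mentioned earlier in the thread.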

Here is my example. I believe all of the memory is created on the stack because we are using slices and known size at compile time. @mbrubeck is that not true?

In the first series of printouts I ask if the memory will be aligned if we start at index 0. The question is, is that always the case if we start at 0?

In the second series of printouts, the memory is aligned at index 4. If the first is always true, then I can rely on this second and any multiple of 4 also being true.

fn main() {
    // Length must be a multiple of 4, otherwise as_bytes panics on its first assert.
    let slice_u8: &[u8] = &[1, 2, 3, 4, 5, 6, 7, 8];
    let ptr_u8 = slice_u8.as_ptr();

    let slice_f32 = as_bytes(slice_u8);
    let ptr_f32 = slice_f32.as_ptr();

    println!("Is the memory aligned if we start from idx 0?");
    println!("ptr_u8:  {:?}", ptr_u8);
    println!("ptr_f32: {:?}", ptr_f32);

    // still aligned
    let slice_u8 = &slice_u8[4..];
    let ptr_u8 = slice_u8.as_ptr();

    let slice_f32 = as_bytes(slice_u8);
    let ptr_f32 = slice_f32.as_ptr();

    println!("Does the memory remain aligned if we start at index 4?");
    println!("ptr_u8:  {:?}", ptr_u8);
    println!("ptr_f32: {:?}", ptr_f32);
}

fn as_bytes(v: &[u8]) -> &[f32] {
    assert_eq!(v.len() % 4, 0);
    assert_eq!(v.as_ptr() as usize % std::mem::align_of::<f32>(), 0);
    unsafe { std::slice::from_raw_parts(v.as_ptr() as *const f32, v.len() / 4) }
}

// outputs...
Is the memory aligned if we start from idx 0?
ptr_u8:  0x1011a5a20
ptr_f32: 0x1011a5a20

Does the memory remain aligned if we start at index 4?
ptr_u8:  0x1011a5a24
ptr_f32: 0x1011a5a24

- E

No. My example above should refute that. Could you say more about why it doesn't?

Stack arrays aren’t guaranteed to start at alignments suitable for all builtin types. The starting address of the array in your example changes depending on what other variables are in the same stack frame (and the optimization level, and possibly other factors):

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=2c4806e7d1d371ca0637065441ec2212
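
If you do need a stack buffer that is guaranteed to be aligned, one option is to wrap the array in a type with an explicit alignment. This is only a sketch: the AlignedBytes type and view_as_f32 are hypothetical names, and the align(4) is an assumption that is checked against align_of::<f32>() at runtime rather than taken on faith:

```rust
/// Hypothetical wrapper: forces the byte array to start on a 4-byte boundary.
#[repr(C, align(4))]
struct AlignedBytes([u8; 8]);

/// Hypothetical helper: views the wrapped bytes as native-endian f32 values.
fn view_as_f32(buf: &AlignedBytes) -> &[f32] {
    // Guard against a platform where f32 needs more than the 4 bytes assumed above.
    assert!(std::mem::align_of::<AlignedBytes>() >= std::mem::align_of::<f32>());
    unsafe {
        std::slice::from_raw_parts(
            buf.0.as_ptr() as *const f32,
            buf.0.len() / std::mem::size_of::<f32>(),
        )
    }
}
```

With the wrapper, the alignment no longer depends on what else is in the stack frame or on the optimization level.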