[u8; 8] to two [u8; 4]

zeroexcuses · December 11, 2021, 11:30am

Besides copying the u8's one by one, is there a way to split a [u8; 8] into two [u8; 4] ?

Context: using u32::from_le_bytes, which for some reason wants an array instead of a slice.

bluss · December 11, 2021, 11:49am

The example for from_le_bytes in the u32 documentation shows the TryFrom/TryInto-based conversion route.

zeroexcuses · December 11, 2021, 11:52am

I'm trying to avoid unwrap() in my code. I don't understand why it needs to use TryFrom/TryInto.

bluss · December 11, 2021, 11:54am

It doesn't need unwrap, but I'm not sure if there is a way in std/core to do this without a fallible conversion (that won't be able to fail in your case if you write it neatly). There are certainly alternatives in the ecosystem.
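
As a minimal sketch of a slice-based route that has no unwrap() in it: split the array as a slice, convert each half back with TryInto, and propagate the (here unreachable) failure as None. The function name split_halves is made up for illustration.

```rust
use std::convert::TryInto;

/// Hypothetical helper: split eight bytes into two four-byte arrays
/// by going through slices. The conversions are fallible in general,
/// but cannot fail here because both halves always have length 4.
fn split_halves(bytes: &[u8; 8]) -> Option<([u8; 4], [u8; 4])> {
    let (head, tail) = bytes.split_at(4);
    let head: [u8; 4] = head.try_into().ok()?;
    let tail: [u8; 4] = tail.try_into().ok()?;
    Some((head, tail))
}
```

The caller still has to handle an Option, which is the price of doing the length check at runtime rather than in the type system.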

birkenfeld · December 11, 2021, 11:56am

Because slices don't encode their length in the type. So to go from &[T] to [T; 4] can fail.
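
To illustrate why the conversion has to be fallible, here is a small sketch (to_array4 is a hypothetical name) in which the same call succeeds or fails depending only on the runtime length of the slice:

```rust
use std::convert::TryFrom;

/// Hypothetical helper: try to view a slice as exactly four bytes.
/// The length is only known at runtime, so the result is an Option.
fn to_array4(slice: &[u8]) -> Option<[u8; 4]> {
    <[u8; 4]>::try_from(slice).ok()
}
```

A slice of length 3 or 5 yields None here, while a slice of length 4 yields Some.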

On nightly they've started adding methods to arrays that use const generics to get sub-arrays with guaranteed lengths; currently there are methods called split_array_ref and split_array_mut.

However, they currently don't give you [T; N] -> ([T; M], [T; N-M]) because (I assume) const generics are not powerful enough yet.

bluss · December 11, 2021, 12:02pm

I think we need further advances in const generics, and then non-fallible conversions can be implemented more easily. As it is now it either has to be type checked at runtime (incorrect array lengths panicking at runtime) or the trait needs to encode the type/array size correspondences ad hoc.

It's certainly possible to define a safe function that takes [T; 8] and returns [[T; 4]; 2] today. It's unfortunately too special-cased to be in libstd, since it can't be written with const generics for all compatible sizes.

Another example is the bytemuck crate, which allows this conversion with its cast function, for arrays of u8. Just like split_array_ref, it unfortunately panics on "runtime type checking", i.e. mismatching array lengths. That is a side effect of being defined in general terms.

bluss · December 11, 2021, 12:15pm

Now this is just mostly my curiosity, but with array/slice patterns we have much better conversions, they are just not, unfortunately, generic in the size of the array.

These are "better conversions" since they are type checked (size of array is checked at compile time) and will fail to compile instead of panic if there is a problem.

(playground link)

The gist of it:

/// Take first four elements of an array
macro_rules! take4 {
    ($array:expr) => {
        match $array {
            [a, b, c, d, ..] => [a, b, c, d]
        }
    }
}

let data = [1, 2, 3, 4, 5, 6, 7, 8];    
u32::from_le_bytes(take4!(data))
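
A related sketch, assuming you want both halves rather than only the first four elements: an array pattern can bind the remainder with `tail @ ..`, and for an array (as opposed to a slice) that binding is itself an array whose length is checked at compile time. The function name is made up for illustration.

```rust
/// Hypothetical helper: split [u8; 8] into two [u8; 4] using an
/// array pattern. `tail @ ..` binds the remaining four elements as
/// a [u8; 4]; a wrong length in the return type would not compile.
fn split_by_pattern(data: [u8; 8]) -> ([u8; 4], [u8; 4]) {
    let [a, b, c, d, tail @ ..] = data;
    ([a, b, c, d], tail)
}
```

Like the macro, this is not generic over the array size, but it needs neither unsafe nor a fallible conversion.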

Michael-F-Bryan · December 11, 2021, 12:36pm

In theory, the standard library should be able to expose a function/method which does the split using const generics and pointer math. The implementation is trivial.

(I had to introduce my own Array type because I can't directly add methods to the builtin array type. Note that writing LEN - INDEX in a type needs nightly with #![feature(generic_const_exprs)].)

/// A newtype around an array so we can give it methods.
struct Array<T, const LEN: usize>([T; LEN]);

impl<T, const LEN: usize> Array<T, LEN> {
    pub fn split_at<const INDEX: usize>(&self) -> (&[T; INDEX], &[T; LEN - INDEX]) {
        // Safety: const generics ensure our bounds checks are correct, and the
        // function signature makes sure we don't accidentally transmute lifetimes
        // incorrectly.
        //
        // We can also assume ptr.add() doesn't wrap around because otherwise
        // you wouldn't be able to get a reference to the last element in this
        // array normally.
        unsafe {
            let ptr = self.0.as_ptr();
            let head = ptr.cast::<[T; INDEX]>();
            let tail = ptr.add(INDEX).cast::<[T; LEN - INDEX]>();
            (&*head, &*tail)
        }
    }
}

(playground)

You would then use it like this:

fn main() {
    let array = Array([0_u8; 8]);

    let (first_half, second_half) = array.split_at::<4>();
    assert_eq!(first_half.len(), 4);
    assert_eq!(second_half.len(), 4);

    println!(
        "{}, {}",
        u32::from_le_bytes(*first_half),
        u32::from_le_bytes(*second_half),
    );

    // let _ = array.split_at::<10>(); // compile error
}

As a bonus, we get bounds checking at compile time!

error[E0080]: evaluation of `Array::<u8, 8_usize>::split_at::<10_usize>::{constant#1}` failed
 --> src/main.rs:7:70
  |
7 |     pub fn split_at<const INDEX: usize>(&self) -> (&[T; INDEX], &[T; LEN - INDEX]) {
  |                                                                      ^^^^^^^^^^^ attempt to compute `8_usize - 10_usize`, which would overflow
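
For comparison, here is a rough sketch of how a similar compile-time length check can be approximated on stable Rust, under the assumption that the caller names both output lengths instead of having the second one computed as LEN - INDEX. The names LengthsAddUp and split_checked are hypothetical, and unlike the version above this one still goes through a (never failing) slice conversion.

```rust
use std::convert::TryInto;

/// Hypothetical marker type: its associated constant only evaluates
/// successfully when HEAD + TAIL == LEN.
struct LengthsAddUp<const LEN: usize, const HEAD: usize, const TAIL: usize>;

impl<const LEN: usize, const HEAD: usize, const TAIL: usize> LengthsAddUp<LEN, HEAD, TAIL> {
    const CHECK: () = assert!(HEAD + TAIL == LEN, "HEAD + TAIL must equal LEN");
}

/// Split a reference to an array into references to two sub-arrays.
fn split_checked<T, const LEN: usize, const HEAD: usize, const TAIL: usize>(
    array: &[T; LEN],
) -> (&[T; HEAD], &[T; TAIL]) {
    // Referencing the constant forces the length check to be evaluated
    // when this function is instantiated with concrete sizes.
    let () = LengthsAddUp::<LEN, HEAD, TAIL>::CHECK;
    let (head, tail) = array.as_slice().split_at(HEAD);
    // These conversions cannot fail once the check above has passed.
    (
        head.try_into().expect("head has HEAD elements"),
        tail.try_into().expect("tail has TAIL elements"),
    )
}
```

It would be called with the target types spelled out, e.g. `let (a, b): (&[u8; 4], &[u8; 4]) = split_checked(&data);`. Asking for lengths that do not add up to LEN is rejected when the program is built, though the error is reported at instantiation time rather than in the signature.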

birkenfeld · December 11, 2021, 12:58pm

If that compiles today, you should propose it on the tracker or as an RFC to replace the current split_array variants; it's strictly more powerful.

krtab · December 11, 2021, 1:07pm

fn convert(x: [u8;8]) -> [[u8;4];2] {
    unsafe {std::mem::transmute(x)}
}

if you want a tuple:

fn convert(x: [u8;8]) -> ([u8;4],[u8;4]) {
    let arr : [[u8;4];2] = unsafe {std::mem::transmute(x)};
    (arr[0],arr[1])
}

both are sound.

birkenfeld · December 11, 2021, 1:08pm

That's your claim.

bluss · December 11, 2021, 1:15pm

The advancement we need is #![feature(generic_const_exprs)] (or a subset of it).

These are implemented by bytemuck too (as mentioned before), but type (size) checked at runtime, so one could just as well just wrap bytemuck::cast.

DoumanAsh · December 11, 2021, 1:37pm

mem::transmute

Michael-F-Bryan · December 11, 2021, 2:10pm

The [[u8;4];2] case is sound because of the way arrays are laid out in memory.

The elements in an array, [T; N], are laid out sequentially with a stride of size_of::<T>(), and because a type's size is always a multiple of its alignment there is never padding between elements, so it's fine to transmute it that way. See Arrays and Slices in the unsafe code guidelines for more.
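
A small sketch of that argument in code, with made-up function names: the two array types have the same size and alignment, so the transmute only reinterprets the same eight bytes.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical check that the flat and nested arrays share a layout.
fn layouts_match() -> bool {
    size_of::<[u8; 8]>() == size_of::<[[u8; 4]; 2]>()
        && align_of::<[u8; 8]>() == align_of::<[[u8; 4]; 2]>()
}

/// Reinterpret eight bytes as two groups of four.
fn to_halves(x: [u8; 8]) -> [[u8; 4]; 2] {
    // SAFETY: both types are eight contiguous u8 values with no padding
    // and alignment 1, and every bit pattern is a valid u8.
    unsafe { std::mem::transmute(x) }
}
```

The element order is preserved, so the first inner array holds bytes 0..4 and the second holds bytes 4..8.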

The tuple case isn't sound because a tuple is represented as something like this:

#[repr(Rust)]
struct Tuple<A, B> {
  first: A,
  second: B,
}

Because they are #[repr(Rust)] you can't make any assumptions about layout, including that the 0'th element in the tuple will be first in memory. See Tuple Types in the unsafe code guidelines for more.

If you want to return a tuple, you would need something like this:

fn split<T, const LEN: usize, const INDEX: usize>(
    array: [T; LEN],
) -> ([T; INDEX], [T; LEN - INDEX]) {

    #[repr(packed)]
    struct Tuple<T, const LEN: usize, const INDEX: usize>
    where
        [(); LEN - INDEX]:,
    {
        first: [T; INDEX],
        second: [T; LEN - INDEX],
    }

    unsafe {
        let Tuple { first, second }: Tuple<T, LEN, INDEX> = std::mem::transmute(array);
        (first, second)
    }
}

(playground)

That was actually my first attempt, but it's not great because a) you need to carry the T and LEN generic parameters around so using turbofish for the INDEX parameter gets a bit awkward (e.g. split::<_, _, 4>([0_u8; 8])), and b) it doesn't compile because generic_const_exprs is incomplete and the compiler thinks the array and Tuple<T, LEN, INDEX> types have different sizes ("dependently-sized types" is the bit to look out for).

warning: the feature `generic_const_exprs` is incomplete and may not be safe to use and/or cause compiler crashes
 --> src/lib.rs:1:12
  |
1 | #![feature(generic_const_exprs)]
  |            ^^^^^^^^^^^^^^^^^^^
  |
  = note: `#[warn(incomplete_features)]` on by default
  = note: see issue #76560 <https://github.com/rust-lang/rust/issues/76560> for more information

error[E0512]: cannot transmute between types of different sizes, or dependently-sized types
  --> src/lib.rs:17:61
   |
17 |         let Tuple { first, second }: Tuple<T, LEN, INDEX> = std::mem::transmute(array);
   |                                                             ^^^^^^^^^^^^^^^^^^^
   |
   = note: source type: `[T; LEN]` (this type does not have a fixed size)
   = note: target type: `Tuple<T, LEN, INDEX>` (size can vary because of [T; INDEX])

birkenfeld · December 11, 2021, 2:13pm

Actually the tuple case is also sound because it's still transmuting to a [[u8; 4]; 2] and then making a tuple out of it.

I wasn't saying it is all unsound, I was hinting that just saying "It is sound" shouldn't be enough to justify transmute, @krtab should have quoted exactly what you did now.

Michael-F-Bryan · December 11, 2021, 2:15pm

Haha, looks like I got caught not properly reading the code in question

I read it as a direct transmute from an array to a tuple.

krtab · December 11, 2021, 2:17pm

Quoting birkenfeld: "I wasn't saying it is all unsound, I was hinting that just saying 'It is sound' shouldn't be enough to justify transmute, @krtab should have quoted exactly what you did now."

I disagree that it is your place to judge the amount of work I owe to put into a response that was relevant to the OP's need. Next time, feel free to do yourself the (very welcome) work that @Michael-F-Bryan did.

alice · December 11, 2021, 2:25pm

Seriously, just do this:

let arr1 = [arr[0], arr[1], arr[2], arr[3]];
let arr2 = [arr[4], arr[5], arr[6], arr[7]];

scottmcm · December 11, 2021, 8:55pm

If you're unwilling to use one of the via-slice routes, then one-by-one sounds great. Just use an array pattern and it's super-clear:

pub fn demo(x: [u8; 8]) -> [[u8; 4]; 2] {
    let [a, b, c, d, e, f, g, h] = x;
    [[a, b, c, d], [e, f, g, h]]
}

https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=ad90273626b5139a25033c99bb987cc1

And it compiles away to nothing:

define i64 @_ZN10playground4demo17h1f02f025652a6996E(i64 returned %0) unnamed_addr #0 {
start:
  ret i64 %0
}

Of course, there are other ways too that don't use .unwrap() but are (IMHO) less clear, like

pub fn demo2(x: [u8; 8]) -> [[u8; 4]; 2] {
    let x = u64::from_le_bytes(x);
    [u32::to_le_bytes(x as _), u32::to_le_bytes((x >> 32) as _)]
}
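
Tying this back to the original goal of reading two u32 values, here is a sketch (with a hypothetical function name) that uses the array-pattern split and can be compared against the u64 shift route:

```rust
/// Hypothetical helper: read two little-endian u32 values from
/// eight bytes, splitting with an array pattern.
fn read_two_u32(x: [u8; 8]) -> (u32, u32) {
    let [a, b, c, d, e, f, g, h] = x;
    (
        u32::from_le_bytes([a, b, c, d]),
        u32::from_le_bytes([e, f, g, h]),
    )
}
```

For little-endian input, the first value is the low 32 bits of u64::from_le_bytes(x) and the second is the high 32 bits, which is why both approaches agree.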

system · March 11, 2022, 8:56pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
