Constant ranges to get arrays from slices

Summary

I'd like to discuss the idea of adding new types that implement core::slice::SliceIndex to be able to get arrays from slices.

I'm proposed an API where constant-time versions of core::ops::Range (and other ranges) are introduced to libcore so the standard slice.get(range) and slice[range] operations can be used to obtain arrays.

Motivation

I've found a common code pattern used at my work when decoding data from raw bytes and it is the following:

let my_integer = u64::from_le_bytes(bytes[..CONST_EQ_TO_8].try_into().expect("u64s are 8 bytes long"));

I felt uneasy about having an expect/unwrap that can easily break if someone changes CONST_EQ_TO_8 by accident. But at the same time I naively thought that the compiler wasn't smart enough to optimize the second bounds check so sometimes I even used stuff like

let my_integer = u64::from_le_bytes(unsafe { bytes[..CONST_EQ_TO_8] as *const [u8] as *const [u8; 8] });

After I discovered that I was dangerously introducing unsafe code under the excuse of performance and then discovering that the compiler is smart enough to do that without unsafe. I thought that maybe this should be in a library to prevent others from doing what I did.

How to abstract this?

My first idea was to just add a bunch of methods to [T] like:

fn get_as_array<const N: usize>(&self) -> Option<&[T; N]> {
    self.get(..N).map(|slice: &[T]| <&[T; N]>::try_from(slice).unwrap())  
}

so I spoke with @oli_obk about it and he thought that this could be done better by introducing something like a constant time version of core::ops::RangeTo and so on. So I wrote the following crate to experiment a bit with the syntax: GitHub - pvdrz/const-index. This crate compiles in stable as no nightly features were used to write it.

The goal would be to add types like this:

pub struct ConstRangeTo<const MAX: usize>;

and implement std::slice::SliceIndex to them with type Output = &[T; MAX]; so you can just use the regular get and index methods:

let my_integer = u64::from_le_bytes(bytes[ConstRangeTo<CONST_EQ_TO_8>]);

I went a bit farther and wrote a macro to build this const ranges:

let my_integer = u64::from_le_bytes(bytes[cindex!(..CONST_EQ_TO_8)]);

maybe it is a bit ambitious but it would be really cool to introduce some built-in syntax for this into rust itself:

let my_integer = u64::from_le_bytes(bytes[const ..CONST_EQ_TO_8]);

If you have any feedback or want to play with it you can clone the repository and write your own code inside examples/playground.rs. I haven't published the crate to cargo yet because I'd like to have some opinions before committing to a first version.

Thanks :slight_smile:

1 Like

This is definitely an open area of experimentation. For example, here's another way of writing your snippet from the motivation on nightly:

let my_integer = u64::from_le_bytes(bytes.as_chunks().0[0]);

https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=158e7a2787c0516cf9159d5adad46988

(And, checking the ASM, despite looking like it's far more complicated it actually optimizes down just as well -- a branch to check if the slice is long enough, then the mov.)

Given that we have [T]::first -> Option<&T>, it might make sense to start with some analogous methods like .first_chunk::<N>() -> Option<&[T; N>. That's a much lower bar than new types and trait implementations, and seem plausible to have even if we get the fancier things later -- after all, we have .first() despite .get(0) doing exactly the same thing.

The Index implementations for special Const* types are definitely cool, but especially given all the restrictions on what you can actually do with const generics right now, I don't know that they're worth it compared to just some simple methods. (This would be particularly true for anything using split_at right now instead of indexing, since that could just turn into some const generic &[T] -> (&[T; N], &[T])/&[T] -> (&[T], &[T; N]) APIs fairly naturally.)

The sugar is interesting, but I'm not sure how to have a good one given inline consts -- bytes[const { ..8 }] will soon be legal, and will do the same as bytes[CONST_OF_RANGETO_EIGHT], which doesn't return an array.

Hey Scott :slight_smile:

I think it would be OK to just add several methods like get_as_array, split_as_array_at and so on but at the same time it starts to feel a bit overwhelming. I like first_chunk although I have to say that having first_chunk() and chunks().next() returning different types might be a bit unexpected for some users.

What I like about the new const types is that you can keep using the same abstractions provided by SliceIndex if there is a good enough syntax to build all those new types (I forgot about the const {} blocks syntax so const A..B doesn't sound that good now).

About the const generics restrictions. They haven't been an issue in my implementation using stable even though I had to do some funny tricks adding extra const parameters for some ranges. Allowing exprs as const generic parameters would help a lot (or allowing const parameters as associated constants) but from what I've found it is not necessary.

I have a split_at implementation that could be easily integrated into libcore by including a struct ConstUsize<const N: usize>(); type and modifying split_at so it accepts both ConstUsize and usize via a sealed trait just like with SliceIndex. Again the largest blocking issue is finding a nicer syntax to build all those Const* types.

Thanks for your feedback :slight_smile:

I managed to get the split_at method working for both usize and ConstUsize by introducing a new sealed trait: const-index/slice_integer_index.rs at 92922019e063f9a989334dcfa109c85678ab3bb0 · pvdrz/const-index · GitHub