Const-bounded slices should be Sized

In other words, [][..] should be of type [_;0] instead of type [_].

I need to manipulate mac6 addresses stored as u64 in my database. The eponymous crate macaddr seems rather adequate as a model of mac addresses, including parsing and printing.

However, converting from and into u64 proved to be a challenge, and smells both less pretty and less efficient than it should be.

In Rust, I'd expect something with the word as in its name, implying some sort of zero copy, or at least an easily implemented From/TryFrom. Since neither exists, I did it myself.

Ideally, I would like something like:

fn mac6_to_u64(addr: &macaddr::addr6::MacAddr6) -> u64 {
    u64::from_bytes(addr.as_bytes())
}
pub fn mac6_from_u64(mac: u64) -> macaddr::addr6::MacAddr6 {
    MacAddr6::from(mac.to_bytes())
}

Rust and its ecosystem being pedantic, it doesn't work, and for good reasons. Indeed, a mac6 address is 6 bytes long, versus the 8 bytes of a u64. What's more, endianness matters.

With that in mind, I'd expect to be able to write something along the lines of:

fn mac6_to_u64(addr: &macaddr::addr6::MacAddr6) -> u64 {
    let mut mac = [0; 8];
    addr.as_be_bytes().clone_into(&mut mac[2..]);
    u64::from_be_bytes(mac)
}
pub fn mac6_from_u64(mac: u64) -> macaddr::addr6::MacAddr6 {
    MacAddr6::from(mac.to_be_bytes()[2..])
}

plus some unit testing to check the correctness of the offset and endianness.

There are a few issues, however, and I expect them to be facets of the same problem:

  • addr.as_be_bytes() is of type [u8] and not [u8; 6], making it impossible to clone into anything other than a Vec.
    On the surface, it may look like an oversight by the authors of macaddr, but I suspect it is actually very hard to output a [u8; 6] without ugly copies and expect, or crazy unsafe techniques.
    Yet mac.to_be_bytes() is of type [u8; 8], so it doesn't seem impossible to output a constant-sized array.
  • mac[2..] (or mac[2..8] for that matter) is of type [u8], and not [u8; 6], another reason why cloning into this slice is impossible.
  • MacAddr6 implements From<[u8; 6]>, as it should, but mac.to_be_bytes()[2..] is of type [u8].

So the code I settled for is as follows:

fn mac6_to_u64(addr: &macaddr::addr6::MacAddr6) -> u64 {
    let mac = addr.as_bytes();
    u64::from_be_bytes([0, 0, mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]])
}
pub fn mac6_from_u64(mac: u64) -> macaddr::addr6::MacAddr6 {
    let mac = mac.to_be_bytes();
    MacAddr6::new(mac[2], mac[3], mac[4], mac[5], mac[6], mac[7])
}

Which feels ugly and wasteful.
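
For what it's worth, the unit tests mentioned earlier could look like the sketch below. To stay dependency-free it uses a plain [u8; 6] (the hypothetical alias Mac6) as a stand-in for MacAddr6, which is an assumption; the byte layout being checked is the same.

```rust
/// Hypothetical stand-in for `macaddr::MacAddr6`, so the sketch has no dependencies.
type Mac6 = [u8; 6];

/// Packs the six bytes into the low 48 bits of a u64, big-endian.
fn mac6_to_u64(mac: &Mac6) -> u64 {
    u64::from_be_bytes([0, 0, mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]])
}

/// Unpacks the low 48 bits of a u64; the two most significant bytes are ignored.
fn mac6_from_u64(value: u64) -> Mac6 {
    let b = value.to_be_bytes();
    [b[2], b[3], b[4], b[5], b[6], b[7]]
}

#[test]
fn offset_and_endianness() {
    // The first MAC byte must land in bits 40..48, the last one in bits 0..8.
    let mac: Mac6 = [0x01, 0x23, 0x45, 0x67, 0x89, 0xab];
    assert_eq!(mac6_to_u64(&mac), 0x0000_0123_4567_89ab);
}

#[test]
fn round_trip() {
    let mac: Mac6 = [0x01, 0x23, 0x45, 0x67, 0x89, 0xab];
    assert_eq!(mac6_from_u64(mac6_to_u64(&mac)), mac);
}
```

Note that the second function silently drops the two high bytes of its input; whether that should be an error instead is a design choice this sketch does not make.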

In C++, I'd expect MacAddr6 could be modeled as an adapter over char[6] instead of being its own class/type.
Adapters are ugly with regard to encapsulation, but they still feel better than Rust's mental gymnastics when it comes to doing absolutely nothing, as is almost the case here.
That being said, in C++, endianness and size would be foot-guns, not compile errors and conscious, explicit choices forced by the language or the library.

Still, if Rust is going to be a PITA (Paramount In Theoretical Analysis) about type modeling, it would at least be nice for the model to be accurate, which I don't think is currently possible in safe Rust.
If slices whose bounds are known at compile time were typed according to their known size, the second solution would be possible.
I'm not entirely convinced it could spare a clone in this particular example, but I believe it would open opportunities for more zero-copy code, and for more things to live on the stack or out of boxes (due to being Sized), on top of allowing more faithful representations.

Here is an example without dependencies illustrating the current state of affairs:

fn main() {
    let names = ["toto", "titi"];

    // Compile-time error: a constant out-of-bounds index is denied as an unconditional panic
    println!("{}", names[3]);

    // Converting a slice [T] of known constant size to a [T; SIZE] this way is impossible
    //let slice: [&str; 0] = names[0..0];

    // Runtime panic, even though in theory all the information is there to deny it as an unconditional panic
    println!("{:?}", &names[0..6]);
}

The kind of static analysis I'm asking for already exists for indexing, in the form of denying unconditional panics (making a distinction between indexing with a variable and indexing with a constant).
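
In the meantime, both failure modes from the example can be turned into run-time checks that return an Option instead of panicking. A minimal sketch; the helper names checked_range and to_array are made up for illustration:

```rust
use std::convert::TryFrom;

/// Returns `None` rather than panicking when the range is out of bounds.
fn checked_range<T>(items: &[T], start: usize, end: usize) -> Option<&[T]> {
    items.get(start..end)
}

/// Copies a slice into a `[T; N]`, checking at run time that its length is `N`.
fn to_array<T: Copy, const N: usize>(items: &[T]) -> Option<[T; N]> {
    <[T; N]>::try_from(items).ok()
}
```

With names = ["toto", "titi"], checked_range(&names[..], 0, 6) yields None, and to_array::<&str, 0>(&names[0..0]) yields an empty array wrapped in Some. The length is still verified at run time, which is exactly the cost the proposal would like to avoid.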

Now there are open philosophical questions, such as: what should be the type of [][8..3]? Never? Not the most consistent. [_; -5]? -5 is not a usize. [_]? That would be a special case, but so would indexing with a variable.

The last solution is my preferred one. More generally, while I'm sure implementing slice size inference is no easy task, I don't think degenerate cases which should be denied as unconditional panics should dictate how we think about design consistency.

Now maybe I overlooked something very simple for my particular problem, maybe there are reasons NOT to infer slice sizes, or maybe it is already an RFC in the works.
I haven't had much luck finding info about the subject, hence I submit it here.

Maybe you prefer something like this over your current solution? I'm using [u8; 6] here instead of the MacAddr6 wrapper type, as it is unavailable in the playground:

fn mac6_to_u64(addr: &[u8; 6]) -> u64 {
    let mut mac = [0; 8];
    mac[2..].copy_from_slice(addr);
    u64::from_be_bytes(mac)
}

pub fn mac6_from_u64(mac: u64) -> [u8; 6] {
    mac.to_be_bytes()[2..].try_into().unwrap()
}

Just a couple quick drive-by comments.

Indexing an array by usize is a builtin operation. Indexing by a range is not. It would need to become one for a compile-time error to be reasonable.

The type of a failed index-by-range must still be a slice, but the expanded expression could theoretically be something more complicated, à la potentially panicking addition expressions.
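
To illustrate the first point: range indexing on arrays and slices goes through the std::ops::Index trait, whose Output type is fixed per index type, so it cannot vary with the values of the bounds. A small sketch (the function names are hypothetical):

```rust
use std::ops::Index;

/// Range indexing written with the usual sugar.
fn via_sugar(bytes: &[u8; 4]) -> &[u8] {
    &bytes[1..3]
}

/// The same operation spelled as a call to the `Index` trait.
/// `Output` is `[u8]` for every `Range<usize>`, whatever its bounds.
fn via_trait(bytes: &[u8; 4]) -> &[u8] {
    Index::index(bytes, 1usize..3usize)
}
```

Both functions return the same two-element slice; nothing in the signature of Index::index can express that its length is 2.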


Pattern matching can extract subarrays.

pub fn mac6_from_u64(mac: u64) -> [u8; 6] {
    let [_, _, mac @ ..] = mac.to_be_bytes();
    mac
}
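
The other direction can be written with an array pattern as well. A sketch, again on a plain [u8; 6] standing in for MacAddr6 (an assumption, as in the earlier reply):

```rust
pub fn mac6_to_u64(addr: &[u8; 6]) -> u64 {
    // Destructure the six bytes, then prepend two zero bytes.
    let [a, b, c, d, e, f] = *addr;
    u64::from_be_bytes([0, 0, a, b, c, d, e, f])
}

pub fn mac6_from_u64(mac: u64) -> [u8; 6] {
    // Drop the two most significant bytes and keep the remaining six.
    let [_, _, mac @ ..] = mac.to_be_bytes();
    mac
}
```

Both patterns are irrefutable because the array lengths are part of the types, so there is no unwrap and no run-time length check.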

It sounds like you might be interested in the relatively new last_chunk method: https://doc.rust-lang.org/std/primitive.slice.html#method.last_chunk

Rust 1.77 added a whole bunch of new APIs specifically to make it easier to get arrays from slices; see the release announcement, "Announcing Rust 1.77.0", on the Rust Blog.

So if you want the last 6 elements, for example, that's foo.last_chunk::<6>(), rather than foo.get(foo.len()-6..).
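
Applied to the MAC conversion, that could look like the sketch below, assuming Rust 1.77 or later and a plain [u8; 6] in place of MacAddr6:

```rust
pub fn mac6_from_u64(mac: u64) -> [u8; 6] {
    // An 8-byte array always has a 6-byte tail, so the unwrap cannot fail.
    *mac.to_be_bytes().last_chunk::<6>().unwrap()
}

pub fn mac6_to_u64(addr: &[u8; 6]) -> u64 {
    let mut bytes = [0u8; 8];
    // Overwrite the last six bytes in place; the first two stay zero.
    *bytes.last_chunk_mut::<6>().unwrap() = *addr;
    u64::from_be_bytes(bytes)
}
```

The chunk length is a const generic, so the result is a real [u8; 6]; only the "is the slice long enough" check remains at run time.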

I think making slicing return different types depending on the const-ness of the length is a really bad idea. Ideally we'd just have an inherent method, something like fn array<const START: usize, const END: usize>(&self) -> Option<[&T; { END - START }]>. So long as the relevant parts of GCE (generic_const_exprs) were made usable, std could use it with no problem (it's allowed to use unstable features freely). Though perhaps there'd be an issue with using GCE in the public interface?
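
For comparison, here is a free-function sketch of that idea that compiles on stable Rust (1.77 or later, for first_chunk). It takes a start and a length rather than a start and an end, so no arithmetic on const parameters appears in the return type, and it returns a reference to an array rather than an array of references. The name subarray and its signature are assumptions for illustration, not an existing std API:

```rust
/// Hypothetical helper: the `LEN` elements starting at `START`, as an array
/// reference, or `None` if the slice is too short.
fn subarray<T, const START: usize, const LEN: usize>(slice: &[T]) -> Option<&[T; LEN]> {
    // `get` rejects a start past the end; `first_chunk` rejects a short tail.
    slice.get(START..)?.first_chunk::<LEN>()
}
```

For example, subarray::<u8, 2, 6>(&mac.to_be_bytes()) gives the six low bytes of a u64 named mac as Some(&[u8; 6]).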
