Announcing: `index-ext`

crates.io docs.rs lib.rs

This crate makes it more ergonomic to use arbitrary integer types as indices. This is especially important for programs where the index types are dictated by an external standard. Another reason could be platform or performance requirements that make usize the wrong choice. With the types and trait provided here, indexing just works for integer types both smaller and larger than usize, and even for types of different signedness. A failed conversion is simply treated as an invalid index, which coincides with the mathematical interpretation of all numbers involved.

use index_ext::Int;
let buffer = [0; 256];
assert_eq!(buffer[Int(255_u8)], 0);
assert_eq!(buffer[Int(255_i32)], 0);
assert_eq!(buffer.get_int(-1_i8), None);
assert_eq!(buffer.get_int(u128::MAX), None);
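
To illustrate the "failed conversion is an invalid index" rule, here is a minimal sketch using only the standard library. The helper name `get_with` is hypothetical and is not part of index-ext; the crate's actual implementation may well differ.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not part of index-ext): look up `index` of any
/// integer type, treating a failed conversion to `usize` as out of bounds.
fn get_with<T, I>(slice: &[T], index: I) -> Option<&T>
where
    usize: TryFrom<I>,
{
    // A negative or too-large value cannot name any element, so the
    // conversion error maps directly to `None`.
    let index = usize::try_from(index).ok()?;
    // The regular bounds check still applies after the conversion.
    slice.get(index)
}
```

With this, a call such as `get_with(&buffer[..], -1_i8)` yields `None` through the same path as any other out-of-bounds index, with no special case for signedness.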

Nightly features

On Rust nightly, depending on #![feature(const_generics)], there is also support for indices that return an array instead of a slice. This has come up a few times as a pain point: it is not possible to directly assign buf[..3] to an array of length 3. index-ext solves this by providing an index struct that has a type parameter for the length, RangeTo<const N: usize>. Amazingly, not only does this work, but type inference is also smart enough to relieve us of the need to write down the length at all in many usage scenarios.

use index_ext::array::RangeTo;
let rgba = [0; 4];
// Assigning to type deduces length parameter.
let rgb: [u8; 3] = rgba[RangeTo];
// As does usage of an array pattern!
let [r, g, b] = &rgba[RangeTo];

Without this crate, the same code is currently rather unergonomic:

use std::convert::{TryFrom, TryInto};

let rgba = [0; 4];
let rgb: [u8; 3] = rgba[..3].try_into().unwrap();
let [r, g, b] = <[u8; 3]>::try_from(&rgba[..3]).unwrap();
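
As a rough sketch of how a length parameter can be deduced from the expected type, the standard-library conversion above can be wrapped in a const-generic function. The name `first_n` is hypothetical and not part of index-ext, and this assumes a compiler with stable const generics; unlike the indexing form, it returns an `Option` instead of panicking.

```rust
use std::convert::TryInto;

/// Hypothetical helper (not part of index-ext): borrow the first `N`
/// elements of a slice as an array reference, or `None` if it is too short.
fn first_n<T, const N: usize>(slice: &[T]) -> Option<&[T; N]> {
    // `get(..N)` performs the bounds check; the conversion to `&[T; N]`
    // then cannot fail because the length is exactly `N`.
    slice.get(..N)?.try_into().ok()
}
```

Writing `let rgb: &[u8; 3] = first_n(&rgba[..]).unwrap();` lets the annotation on the left choose `N = 3`, so the length is never spelled out at the call site.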

Real world example

This can also address some validation concerns for network data. For example, let's look at TCP sequence numbers, represented as an i32. The payload from incoming packets needs to be written to a buffer whose first byte has a base sequence number corresponding to the first byte we have not yet handed to the consuming socket of that TCP stream. We'd like to ignore packets that fall outside the allocated buffer space, as well as retransmitted packets for bytes that have already been consumed. This would usually require first a fallible conversion to usize and then an additional check of the index. The extra ceremony involved makes this code brittle in a few regards:

  • Dealing with different signedness correctly is complicated. You shouldn't be incentivized to ditch signedness, which effectively solves the wrapping sequence number semantics of TCP, for the sake of easier indexing.
  • It is all too easy to write an incorrect index as usize cast instead.
  • Naming the platform dependent usize risks writing code that is more platform dependent than necessary.
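
For comparison, the manual two-step version described above might look roughly like the following sketch. The function `place_payload` and its signature are assumptions made for illustration; they come neither from index-ext nor from any real TCP stack.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: copy `data` into `buffer` starting at the signed
/// `offset` relative to the buffer's base sequence number. Returns the
/// number of bytes written, or `None` if the offset lies outside the buffer.
fn place_payload(buffer: &mut [u8], offset: i32, data: &[u8]) -> Option<usize> {
    // Step 1: fallible conversion. A negative offset (already consumed,
    // retransmitted data) fails here. Writing `offset as usize` instead
    // would silently turn -1 into usize::MAX.
    let start = usize::try_from(offset).ok()?;
    // Step 2: separate bounds check against the allocated buffer space.
    let destination = buffer.get_mut(start..)?;
    // Insert as much of the payload as fits.
    let len = data.len().min(destination.len());
    destination[..len].copy_from_slice(&data[..len]);
    Some(len)
}
```

Both checks have to be present and in the right order, and the intermediate `usize` must be named explicitly; that is the ceremony the bullet points refer to.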

Whereas with index_ext this just works:

use std::cmp;
use index_ext::Int;
let base: i32 = get_current_base();
let offset: i32 = packet.seq().wrapping_sub(base);
// Return if packet is outside current buffer space.
let destination = buffer.get_int_mut(offset..)?;

// Insert all new data for which we have enough space.
let data: &[u8] = packet.data();
let len = cmp::min(data.len(), destination.len());
destination[..len].copy_from_slice(&data[..len]);

Thanks. This is much cleaner than the simple, limited-utility macro_rules macro that I created as my first macro in Rust when I tired of the many ugly "as usize" casts in my first serious code.


@ZiCog, this looks like it would help some of that code you were messing with in "'as' considered harmful?".


Thanks for finding that thread for me; the crate was indeed very much inspired by its discussion. I'm also going to page @fintelia, who also had a fight with this inconvenience. There have also been a few threads on internals. In my personal judgement, most proposals for adding integer conversion traits were motivated in no small part by wanting to use the results as indices on platforms with large enough pointers; the main example was usually some form of enabling usize: From<u32>.


Thanks for the heads up on this. Yes, it looks like it could be very useful in such situations.

As it happens, the code I was "messing" with that caused me to ask "'as' considered harmful?" had most of its "as" casts removed by other means and has been running in production since.

But "index-ext" is something to bear in mind for the future.

