Emulating array, slice, Vec in a library type

Hello,

I’d like to ask for advice on what for me is a tricky design decision. Here is my current design, followed by some questions.

I’m working on a set of interrelated types that mirror the standard library’s array (static allocation), Vec (dynamic allocation), and slice (reference) – but for a very specific use case:

I’m dealing with sequences of “indices”, where each index itself is a sequence of “atoms”:

#[derive(Copy, Clone, PartialEq, Eq, Debug)]
pub struct Atom {
    // ...
}

Indices are represented as a sequence of elements

pub union Element {
    pub atom: Atom,
    pub len: usize,
}

The first element specifies the length of the index (number of atoms) and is followed by the announced number of elements storing the atoms.

Now a sequence of such indices can be represented by simply keeping them one after another in a single range of memory. A custom DST is used for this:

pub struct IndexSeq<T>(T)
where
    T: ?Sized,
    T: AsRef<[Element]>;

impl<T> IndexSeq<T>
where
    T: AsRef<[Element]>,
{
    /// Safety: the provided elements must consist of an arbitrary number of blocks
    /// of the following form:
    /// One `Element { len: N }` followed by N times `Element { atom: ... }`.
    pub unsafe fn new_unchecked(elements: T) -> Self {
        Self(elements)
    }
}

(Users of the library will use a macro that assures proper structure before calling new_unchecked.)

We now have:

  • IndexSeq<[Element; N]> mirrors array,
  • IndexSeq<Vec<Element>> mirrors Vec,
  • IndexSeq<[Element]> mirrors slice,

In particular, one can write non-generic functions that take a slice

pub fn foo(indices: &IndexSeq<[Element]>) {
    todo!();
}

and can be given not only a slice, but also an “index array” foo(&arr) or an “index vector” foo(&vec).

For this to work, I had to implement the trait Deref:

use std::ops::Deref;

impl<T> Deref for IndexSeq<T>
where
    T: AsRef<[Element]>,
{
    type Target = IndexSeq<[Element]>;

    fn deref(&self) -> &Self::Target {
        let data = self.0.as_ref();
        // Safety: It is an invariant that the payload of an IndexSeq can be
        // referenced as a slice of elements that is valid.  Here, an IndexSeq
        // is created whose payload is such a valid slice.
        unsafe { &*(data as *const [Element] as *const IndexSeq<[Element]>) }
    }
}

The above works. (I have only shown the bare minimum that compiles.) But I have some doubts:

  • Is this whole construction (using a custom DST with impl Deref) reasonable? Alternatively, I could have followed a design used by some Rust multi-dimensional array libraries described here:
    mdarray/DESIGN.md at main · fre-hu/mdarray · GitHub. But in my case, a regular wide pointer is enough to represent a slice, so the above design seems nicer.

  • I have to write unsafe code when implementing actual operations on index sequences (not shown here), like iterating over the indices in a sequence. There is no way around this if I want to use a union. But the unsafe block in fn deref above is different. In a way it should be possible to replace it by a simple IndexSeq(data) because the data is known to be valid, only that this leads to infinite recursion. Is there some way to avoid unsafe here?

  • Is it a good idea to implement Deref as I did in terms of AsRef? I tried to replace the bound T: AsRef<[Element]> by T: Deref<Target = [Element]>, but since Rust slices do not implement Deref this does not work.

Many thanks to those who followed up to here! :slight_smile:

There isn’t a way to avoid manual use of unsafe – or of a crate (e.g. offering a macro) that encapsulates it for you – for wrapping a type in a transparent wrapper behind indirection. On that note, your code should definitely add something like #[repr(transparent)] to struct IndexSeq in order to make this pointer cast completely sound; without it, there isn’t any sufficiently strong guarantee about the layout, and the Rust compiler would be free e.g. to decide on adding some padding bytes or similar.

Trying to answer my third question myself.

Implementing Deref for a type means that the type is a smart pointer. So the above impl Deref means:

IndexSeq<T> is a smart pointer (equivalent to &IndexSeq<[Element]>) precisely when the type &T can be converted efficiently to &[Element] (=T: AsRef<[Element]>).