Emulating array, slice, Vec in a library type

Hello,

I’d like to ask for advice on what for me is a tricky design decision. Here is my current design, followed by some questions.

I’m working on a set of interrelated types that mirror the standard library’s array (static allocation), Vec (dynamic allocation), and slice (reference) – but for a very specific use case:

I’m dealing with sequences of “indices”, where each index itself is a sequence of “atoms”:

#[derive(Copy, Clone, PartialEq, Eq, Debug)]
pub struct Atom {
    // ...
}

Indices are represented as a sequence of elements

pub union Element {
    pub atom: Atom,
    pub len: usize,
}

The first element specifies the length of the index (number of atoms) and is followed by the announced number of elements storing the atoms.

Now a sequence of such indices can be represented by simply keeping them one after another in a single range of memory. A custom DST is used for this:

pub struct IndexSeq<T>(T)
where
    T: ?Sized,
    T: AsRef<[Element]>;

impl<T> IndexSeq<T>
where
    T: AsRef<[Element]>,
{
    /// Safety: the provided elements must consist of an arbitrary number of blocks
    /// of the following form:
    /// One `Element { len: N }` followed by N times `Element { atom: ... }`.
    pub unsafe fn new_unchecked(elements: T) -> Self {
        Self(elements)
    }
}

(Users of the library will use a macro that assures proper structure before calling new_unchecked.)

We now have:

  • IndexSeq<[Element; N]> mirrors array,
  • IndexSeq<Vec<Element>> mirrors Vec,
  • IndexSeq<[Element]> mirrors slice,

In particular, one can write non-generic functions that take a slice

pub fn foo(indices: &IndexSeq<[Element]>) {
    todo!();
}

and can be given not only a slice, but also an “index array” foo(&arr) or an “index vector” foo(&vec).

For this to work, I had to implement the trait Deref:

use std::ops::Deref;

impl<T> Deref for IndexSeq<T>
where
    T: AsRef<[Element]>,
{
    type Target = IndexSeq<[Element]>;

    fn deref(&self) -> &Self::Target {
        let data = self.0.as_ref();
        // Safety: It is an invariant that the payload of an IndexSeq can be
        // referenced as a slice of elements that is valid.  Here, an IndexSeq
        // is created whose payload is such a valid slice.
        unsafe { &*(data as *const [Element] as *const IndexSeq<[Element]>) }
    }
}

The above works. (I have only shown the bare minimum that compiles.) But I have some doubts:

  • Is this whole construction (using a custom DST with impl Deref) reasonable? Alternatively, I could have followed a design used by some Rust multi-dimensional array libraries described here:
    mdarray/DESIGN.md at main · fre-hu/mdarray · GitHub. But in my case, a regular wide pointer is enough to represent a slice, so the above design seems nicer.

  • I have to write unsafe code when implementing actual operations on index sequences (not shown here), like iterating over the indices in a sequence. There is no way around this if I want to use a union. But the unsafe block in fn deref above is different. In a way it should be possible to replace it by a simple IndexSeq(data) because the data is known to be valid, only that this leads to infinite recursion. Is there some way to avoid unsafe here?

  • Is it a good idea to implement Deref as I did in terms of AsRef? I tried to replace the bound T: AsRef<[Element]> by T: Deref<Target = [Element]>, but since Rust slices do not implement Deref this does not work.

Many thanks to those who followed up to here! :slight_smile:

There isn’t a way to avoid manual use of unsafe – or of a crate (e.g. offering a macro) that encapsulates it for you – for wrapping a type in a transparent wrapper behind indirection. On that note, your code should definitely add something like #[repr(transparent)] to struct IndexSeq in order to make this pointer cast completely sound; without it, there isn’t any sufficiently strong guarantee about the layout, and the Rust compiler would be free e.g. to decide on adding some padding bytes or similar.

1 Like

Trying to answer my third question myself.

Implementing Deref for a type means that the type is a smart pointer. So the above impl Deref means:

IndexSeq<T> is a smart pointer (equivalent to &IndexSeq<[Element]>) precisely when the type &T can be converted efficiently to &[Element] (=T: AsRef<[Element]>).