Collection with Minimum size

I am currently building a library (GitHub - phillord/horned-owl) where I have the need to express "at least two, but any number" of items.

Currently, my struct is specified like so:

DisjointUnion(Vec<Class>)

That is, any number of classes. This lets me construct data that is illegal according to the spec, which says "at least two".

The best solution I have so far is to change the type to:

DisjointUnion(Class, Class, Vec<Class>)

(that is two classes and the rest, rather like Java does variadic calls).

or alternatively create a new Vec wrapper that does all in one:

pub struct VecTwoPlus<A>(A, A, Vec<A>);
DisjointUnion(VecTwoPlus<Class>)

I can't see any way in Rust to make VecTwoPlus generic over the number, but I don't think this is a problem since in this case the number is always two.
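To make the idea concrete, here is a minimal sketch of what such a wrapper might look like. The method names (new, len, push, iter) are assumptions chosen for illustration, not anything taken from horned-owl.

```rust
/// Hypothetical wrapper: two mandatory elements plus any number of extras.
#[derive(Debug, Clone, PartialEq)]
pub struct VecTwoPlus<A>(A, A, Vec<A>);

impl<A> VecTwoPlus<A> {
    /// Both mandatory elements are required, so a value with fewer
    /// than two items cannot be constructed at all.
    pub fn new(first: A, second: A, rest: Vec<A>) -> Self {
        VecTwoPlus(first, second, rest)
    }

    /// Total number of elements; never less than 2.
    pub fn len(&self) -> usize {
        2 + self.2.len()
    }

    /// Appends to the "rest" part; the two mandatory elements are untouched.
    pub fn push(&mut self, item: A) {
        self.2.push(item);
    }

    /// Iterates over all elements in order: first, second, then the rest.
    pub fn iter(&self) -> impl Iterator<Item = &A> + '_ {
        std::iter::once(&self.0)
            .chain(std::iter::once(&self.1))
            .chain(self.2.iter())
    }
}
```

With this shape there is deliberately no pop or remove that could touch the first two fields, so the "at least two" rule holds by construction.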

Am I missing something simpler? Or has anyone done it already?

With const generics (still in progress) you'll be able to write something like:

struct VecNPlus<A, const N: usize>([A; N], Vec<A>);

Keep in mind that none of these solutions will let you treat your elements as a contiguous &[A] slice of memory, since your first elements are in a different location. An option that does slice is SmallVec.
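As a rough sketch of the slicing limitation: const generics have since been stabilised (Rust 1.51), so the struct above compiles on current stable. The methods below (new, len, iter, to_vec) are assumed names for illustration. Because the array and the Vec live in different places, the cheapest way to see all elements is to chain two iterators, and getting one contiguous block means copying.

```rust
/// Sketch: N mandatory elements in an array plus a Vec of extras.
pub struct VecNPlus<A, const N: usize>([A; N], Vec<A>);

impl<A, const N: usize> VecNPlus<A, N> {
    pub fn new(head: [A; N], tail: Vec<A>) -> Self {
        VecNPlus(head, tail)
    }

    /// Total number of elements; never less than N.
    pub fn len(&self) -> usize {
        N + self.1.len()
    }

    /// There is no single &[A] covering both parts, so chain the
    /// array's iterator with the Vec's iterator instead.
    pub fn iter(&self) -> impl Iterator<Item = &A> + '_ {
        self.0.iter().chain(self.1.iter())
    }
}

impl<A: Clone, const N: usize> VecNPlus<A, N> {
    /// A contiguous view requires cloning everything into a fresh Vec.
    pub fn to_vec(&self) -> Vec<A> {
        self.iter().cloned().collect()
    }
}
```

If slicing matters more than the type-level guarantee, a single-buffer representation is the better fit.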

It depends on how you want to use it, but if you're in control of creating/modifying the Vecs, the simplest solution would be to have a type

pub struct VecTwoPlus<T>(Vec<T>);

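impl<T> VecTwoPlus<T> {
    /// Returns None unless the vec holds at least two elements.
    pub fn new(vec: Vec<T>) -> Option<Self> {
        if vec.len() >= 2 {
            Some(Self(vec))
        } else {
            None
        }
    }

    /// Safety: the caller must guarantee that vec.len() >= 2.
    pub unsafe fn new_unchecked(vec: Vec<T>) -> Self {
        Self(vec)
    }
}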

If these are the only ways to get one of these vecs, and the user can't pop elements or something like that, then you get the ability to ensure that there are two elements as well as the benefit of the data just being a plain old Vec inside, meaning you can slice it.
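One way this could be fleshed out, as a sketch only: expose the contents read-only through Deref to a slice, and make any shrinking operation refuse to go below two. The push, pop, and into_inner methods here are assumptions for illustration, not part of the suggestion above.

```rust
use std::ops::Deref;

/// Newtype over a plain Vec whose length is kept at two or more.
pub struct VecTwoPlus<T>(Vec<T>);

impl<T> VecTwoPlus<T> {
    /// Returns None unless the vec holds at least two elements.
    pub fn new(vec: Vec<T>) -> Option<Self> {
        if vec.len() >= 2 {
            Some(Self(vec))
        } else {
            None
        }
    }

    /// Growing can never break the invariant.
    pub fn push(&mut self, item: T) {
        self.0.push(item);
    }

    /// Removes the last element only if at least two would remain.
    pub fn pop(&mut self) -> Option<T> {
        if self.0.len() > 2 {
            self.0.pop()
        } else {
            None
        }
    }

    /// Gives the plain Vec back, dropping the guarantee.
    pub fn into_inner(self) -> Vec<T> {
        self.0
    }
}

/// Read-only slice access. There is deliberately no DerefMut to Vec,
/// so callers cannot truncate or clear the contents behind our back.
impl<T> Deref for VecTwoPlus<T> {
    type Target = [T];

    fn deref(&self) -> &[T] {
        &self.0
    }
}
```

Because the target is [T], slice methods such as len, iter, and first work directly on the wrapper, while anything that changes the length has to go through the checked methods.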

There's a trade-off between forcing the minimum size at the type level, or just maintaining it at the API level. The compiler will check you on the former, but the latter is probably simpler to write and use.

If I could do this:

struct VecNPlus<A, const N: usize>([A; N], Vec<A>);

Then, do I need to do this at all? Why not just drop the Vec?

The compiler will check you on the former, but the latter is probably simpler to write and use.

Yes, I know. I am debating this point at the moment. It's an open question whether this is too much of a PITA to bother with. I can enforce this at render/parse time, which will pick up any errors, although it may pick them up in a different place from where they are caused. I think I will try a test implementation of VecTwoPlus, and then decide.

Each VecNPlus<A, N> with a different N is a distinct type. So if you want a single type that means "at least two elements", you still need some place to store the extras.

If you want a type that's always exactly N elements, that's just an array -- [A; N].

Ah, yes, good point.
