Struct that contains two equal-size vectors

Hi all, I want to implement a data structure that contains two vectors and ensures that both vectors have the same length. Here is the pseudocode:

struct Word {
  len: usize,
  alphabets: [Alphabet; len],
  diacritics: [Diacritic; len] 
}

How can I implement this type of data structure in Rust?

I think you either need to do so with unsafe code, or you will need a runtime check that the two lengths are equal. Or you could use one vector of tuples, which is what I'd prefer, if you don't need to be able to take slices of the individual arrays.

struct Word {
  both: Vec<(Alphabet, Diacritic)>,
}

This would give better cache locality, and be pretty simple and safe. It just isn't what you asked for.
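To make the trade-off concrete, here is a minimal sketch of the tuple version. The `Alphabet` and `Diacritic` definitions are placeholders (the thread never defines them), and the method names are made up for illustration.

```rust
// Placeholder element types; the real ones are not shown in the thread.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Alphabet(pub char);
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Diacritic(pub char);

pub struct Word {
    both: Vec<(Alphabet, Diacritic)>,
}

impl Word {
    pub fn new(both: Vec<(Alphabet, Diacritic)>) -> Word {
        Word { both }
    }

    /// One length for both sequences, so they cannot get out of sync.
    pub fn len(&self) -> usize {
        self.both.len()
    }

    /// Iterates over the letters only. A contiguous `&[Alphabet]` is not
    /// available here, because each letter is interleaved with its diacritic.
    pub fn alphabets(&self) -> impl Iterator<Item = Alphabet> + '_ {
        self.both.iter().map(|&(a, _)| a)
    }

    /// Iterates over the diacritics only, with the same limitation.
    pub fn diacritics(&self) -> impl Iterator<Item = Diacritic> + '_ {
        self.both.iter().map(|&(_, d)| d)
    }
}
```

The equal-length invariant holds by construction, but the individual sequences are only reachable through iterators, not slices.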

Note that if your public functions maintain this invariant, you only need to perform this check once when the structure is initialized.
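A rough sketch of that idea, assuming placeholder `Alphabet` and `Diacritic` types and a constructor that returns `Option` (both are my choices, not something from the original question):

```rust
// Placeholder element types; the real ones are not shown in the thread.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Alphabet(pub char);
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Diacritic(pub char);

#[derive(Debug)]
pub struct Word {
    // Private fields: code outside this module cannot resize one vector
    // without the other.
    alphabets: Vec<Alphabet>,
    diacritics: Vec<Diacritic>,
}

impl Word {
    /// The only place where the two lengths are compared at runtime.
    pub fn new(alphabets: Vec<Alphabet>, diacritics: Vec<Diacritic>) -> Option<Word> {
        if alphabets.len() == diacritics.len() {
            Some(Word { alphabets, diacritics })
        } else {
            None
        }
    }

    /// Pushes to both vectors together, so the invariant is preserved.
    pub fn push(&mut self, a: Alphabet, d: Diacritic) {
        self.alphabets.push(a);
        self.diacritics.push(d);
    }

    pub fn len(&self) -> usize {
        self.alphabets.len()
    }

    /// Read-only slices are safe to hand out; they cannot change the length.
    pub fn alphabets(&self) -> &[Alphabet] {
        &self.alphabets
    }

    pub fn diacritics(&self) -> &[Diacritic] {
        &self.diacritics
    }
}
```

Every method that changes the length has to touch both vectors; as long as that holds, no further checks are needed after `new`.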

Only if they actually need to iterate in tuples.

Perhaps an unsatisfying answer, but I'd just stick with the two vectors until that's actually a proven performance bottleneck. Don't optimize too early.

Yeah, just go for two vectors.

If your aim is to do only a single heap allocation and store the length only once, then Box<[(Alphabet, Diacritic)]> is the closest you can get in safe Rust.
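For illustration, a small sketch of the boxed-slice variant. The element types and the `from_pairs` name are placeholders I made up; `Vec::into_boxed_slice` is the standard way to get from a `Vec` to a `Box<[T]>`.

```rust
// Placeholder element types; the real ones are not shown in the thread.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Alphabet(pub char);
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Diacritic(pub char);

pub struct Word {
    // A boxed slice is a pointer plus a length: no separate capacity field.
    both: Box<[(Alphabet, Diacritic)]>,
}

impl Word {
    /// Builds the word from a Vec; `into_boxed_slice` drops excess capacity.
    pub fn from_pairs(pairs: Vec<(Alphabet, Diacritic)>) -> Word {
        Word { both: pairs.into_boxed_slice() }
    }

    pub fn len(&self) -> usize {
        self.both.len()
    }

    pub fn pairs(&self) -> &[(Alphabet, Diacritic)] {
        &self.both
    }
}
```

The length is fixed after construction, which is closer to the pseudocode in the question than a growable `Vec`.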

Rust arrays can't be used for this, since their length is always a constant hardcoded at compile time.

If you need this exact memory layout with both arrays inline, you will have to do it very low-level, same as in C: allocate a raw chunk of memory (std::alloc), cast pointers, calculate offsets and alignments by hand, and write the data (ptr::write) yourself.
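The offset and alignment part of that can at least be done in safe code with `std::alloc::Layout`. The sketch below only computes the layout of "array of `A` followed by array of `D`"; it does not allocate or write anything. The function name `two_array_layout` is hypothetical.

```rust
use std::alloc::{Layout, LayoutError};

/// Computes the layout of `[A; len]` immediately followed by `[D; len]`,
/// plus the byte offset at which the second array starts.
/// This is only the bookkeeping step; allocating and writing into the
/// memory would still require unsafe code.
pub fn two_array_layout<A, D>(len: usize) -> Result<(Layout, usize), LayoutError> {
    let first = Layout::array::<A>(len)?;
    let second = Layout::array::<D>(len)?;
    // `extend` inserts whatever padding the second array's alignment needs.
    let (combined, second_offset) = first.extend(second)?;
    Ok((combined.pad_to_align(), second_offset))
}
```

For example, with `u8` and `u32` elements and a length of 3, the second array cannot start at byte 3; it has to be pushed to byte 4 to satisfy the alignment of `u32`.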

There is an option I haven't seen mentioned yet, and that is to use the soon-to-be-stabilized¹ min-const-generics feature.

Essentially, you give the container type a const parameter of type usize and use it to set the length of both internal array fields, i.e. something like this:

struct Word<const LEN: usize> {
  alphabets: [Alphabet; LEN],
  diacritics: [Diacritic; LEN] 
}

¹ "Soon" here means in about a month's time.
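A sketch of how the const-generic version might be used, assuming placeholder `Alphabet` and `Diacritic` types and a made-up `pairs` helper. To the best of my knowledge this compiles on stable Rust from 1.51 onward.

```rust
// Placeholder element types; the real ones are not shown in the thread.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Alphabet(pub char);
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Diacritic(pub char);

pub struct Word<const LEN: usize> {
    alphabets: [Alphabet; LEN],
    diacritics: [Diacritic; LEN],
}

impl<const LEN: usize> Word<LEN> {
    /// Both arrays share the same `LEN`, so passing arrays of different
    /// lengths is a compile-time type error; no runtime check is needed.
    pub fn new(alphabets: [Alphabet; LEN], diacritics: [Diacritic; LEN]) -> Self {
        Word { alphabets, diacritics }
    }

    /// The length is part of the type, so no field is needed to store it.
    pub fn len(&self) -> usize {
        LEN
    }

    /// Walks both arrays in lockstep.
    pub fn pairs(&self) -> impl Iterator<Item = (Alphabet, Diacritic)> + '_ {
        self.alphabets.iter().copied().zip(self.diacritics.iter().copied())
    }
}
```

The catch is that `LEN` has to be known at compile time for each word, so this fits fixed-size data rather than words built from runtime input.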

But then the len variable becomes useless, and the length must be known at compile time.

From the OP:

Hi all, I want to implement a data structure that contains two vectors and ensures that both vectors have the same length.

My solution seems to fit the requirements just fine. In addition, the array length doesn't become useless. Instead it statically guarantees the invariant of both arrays being equal in length, and gives the user of the struct control over the actual value of that length.

They meant the len field. You kept it in your example, but it should be removed, because it has been replaced by the constant LEN.

As the other comment pointed out, I meant the len field.

Except OP mentioned "vector", which usually has a dynamic length. Your solution might still work for OP, we don't know, but it doesn't completely fit the requirements.

I've removed the len field from the example.

The OP themselves put 2 arrays rather than Vecs in their struct, so I simply went with that.
