Pushing an enum containing a struct to one Vec and a reference to the struct to another

I have a situation that looks something like this:

struct Duration
{
    // --snip--
}

enum StaffObjectType
{
    Duration(Duration),
    // --snip --
}

struct StaffObject
{
    object_type: StaffObjectType,
    // --snip --
}

fn add_duration_object(objects_in_system_slice: &mut Vec<&Duration>,
    staff_contents: &mut Vec<StaffObject>)
{
    let duration = Duration{/*--snip--*/};
    objects_in_system_slice.push(&duration);
    staff_contents.push(StaffObject{object_type: StaffObjectType::Duration(duration)/*--snip--*/});        
}

I need add_duration_object to create an instance of Duration, push a StaffObject that contains that instance to staff_contents, and push a reference to the instance to objects_in_system_slice. The attempt above produces an error that duration goes out of scope at the end of the function, and so doesn’t live long enough to validly push it to objects_in_system_slice. I thought to try it because I thought giving staff_contents ownership of duration would extend the lifetime of duration, and that the push to objects_in_system_slice would retroactively enjoy this extension too. Is this an unsound notion, or is it just that Rust isn’t able to figure out that it’s safe?

Switching the order of the pushes is the other approach that comes to mind, so that I’d take the reference to the Duration from its position inside staff_contents, and it would already be clear that it lives beyond the scope of the function. The trouble with this is getting at it once it’s wrapped in a StaffObject. First of all, it seems that I’d have to use an if let to extract the Duration from the object_type field, which doesn’t seem like it should be necessary given that I already know what variant the field contains. And if I resign myself to this, I can’t figure out how to set up the if let in a way that doesn’t do any illegal moves.

There are three key issues that are going to make this difficult:

  1. You can’t take a reference to something and then move it. This would invalidate the reference.
  2. Pushing a value onto a vector moves the value.
  3. Pushing a value onto a vector may move any value already stored in the vector because of reallocation.

There are a few possible solutions that may or may not work for your particular situation.

  1. Take a reference to the value after it’s been pushed onto the vector:
fn two_vecs<'a>(vals: &'a mut Vec<Duration>, refs: &mut Vec<&'a Duration>) {
    let d = Duration::default();
    
    vals.push(d);
    refs.push(vals.last().expect("just pushed"));
}

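Note, however, that because of the way the lifetimes are coupled here, vals stays mutably borrowed for as long as the references in refs are in use. The caller can’t push to vals again, or otherwise access it directly, once you’ve taken the reference to the value inside.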

  2. You can use an Arena from the typed_arena crate. An Arena is a special type of data structure that doesn’t reallocate or deallocate any of its values until it is dropped. However, it does not support iterating over its contents through a shared reference, so you’ll have to store the references somewhere:
fn arena<'a>(vals: &'a Arena<Duration>, refs: &mut Vec<&'a Duration>) {
    let d = Duration::default();
    
    let r = vals.alloc(d);
    refs.push(r);
}
  3. You can use a reference-counted allocation for your Duration using std::rc::Rc (or std::sync::Arc if you need it to be thread-safe) such that both vectors can share ownership of the value:
fn rc(vec1: &mut Vec<Rc<Duration>>, vec2: &mut Vec<Rc<Duration>>) {
    let d = Rc::new(Duration::default());
    
    vec1.push(d.clone()); // cheap clone
    vec2.push(d);
}
  4. Alternatively, you could just clone your Duration directly and save the indirection. If your Duration doesn’t take up a lot of space this could be much more efficient:
fn clone(vec1: &mut Vec<Duration>, vec2: &mut Vec<Duration>) {
    let d = Duration::default();
    
    vec1.push(d.clone());
    vec2.push(d);
}
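
Here is a rough sketch of how option 1 might look with the StaffObject wrapper from the question. The `ticks` field and the `Barline` variant are made up for illustration, since the real contents were snipped. The point is that matching on a reference (`&...object_type`) binds `d` as a `&Duration`, so nothing gets moved out of the vector:

```rust
struct Duration {
    ticks: u32, // hypothetical field standing in for the snipped contents
}

enum StaffObjectType {
    Duration(Duration),
    Barline, // hypothetical second variant
}

struct StaffObject {
    object_type: StaffObjectType,
}

fn add_duration_object<'a>(
    objects_in_system_slice: &mut Vec<&'a Duration>,
    staff_contents: &'a mut Vec<StaffObject>,
) {
    let duration = Duration { ticks: 4 };

    // Move the value into its final home first...
    staff_contents.push(StaffObject {
        object_type: StaffObjectType::Duration(duration),
    });

    // ...then borrow it from there. Matching on a reference only borrows
    // the payload, so there is no illegal move.
    match &staff_contents.last().expect("just pushed").object_type {
        StaffObjectType::Duration(d) => objects_in_system_slice.push(d),
        _ => unreachable!("a Duration variant was just pushed"),
    }
}

fn main() {
    let mut staff_contents = Vec::new();
    let mut objects_in_system_slice = Vec::new();
    add_duration_object(&mut objects_in_system_slice, &mut staff_contents);
    println!("{}", objects_in_system_slice[0].ticks);
}
```

The same caveat applies: `staff_contents` stays mutably borrowed while `objects_in_system_slice` is alive, so a second call to `add_duration_object` with the same two vectors will not compile.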

One of the things I need is to be able to mutate elements of staff_contents later, and have objects_in_system_slice give access to the current state of those elements. That requirement rules out cloning, right?

Yes, but it also rules out shared references. Just to be clear, you want to mutate the Duration value? In order to have a value that is shared and that can also be mutated you'll need to use interior mutability. Specifically, probably Rc<RefCell<Duration>> (or Arc<Mutex<Duration>> if you need to be thread-safe).
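
A minimal sketch of what that could look like, assuming a made-up `ticks` field on Duration and a hypothetical `add_shared` helper. Both vectors hold an `Rc` to the same `RefCell`, so a mutation made through one is visible through the other:

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug, Default)]
struct Duration {
    ticks: u32, // hypothetical field standing in for the snipped contents
}

// Hypothetical alias, just to keep the signatures readable.
type SharedDuration = Rc<RefCell<Duration>>;

fn add_shared(staff_contents: &mut Vec<SharedDuration>, system_slice: &mut Vec<SharedDuration>) {
    let d = Rc::new(RefCell::new(Duration::default()));

    // Cloning the Rc only bumps the reference count; the Duration is not copied.
    staff_contents.push(Rc::clone(&d));
    system_slice.push(d);
}

fn main() {
    let mut staff_contents = Vec::new();
    let mut system_slice = Vec::new();
    add_shared(&mut staff_contents, &mut system_slice);

    // Mutate through one vector...
    staff_contents[0].borrow_mut().ticks = 8;

    // ...and read the current state through the other.
    println!("{:?}", system_slice[0].borrow());
}
```

The borrow rules are still enforced, just at runtime: calling `borrow_mut()` while another borrow of the same cell is active panics instead of failing to compile.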

Good to know. I’ll look into that.

To give a little more background, all StaffObjects can be understood as lying somewhere on a plane. The program stores a Vec<Staff>, where a Staff acts as a horizontal line across the plane, and contains a Vec<StaffObject> staff_contents field that represents the StaffObjects that lie on that line. However, it’s also sometimes necessary to iterate through StaffObjects by shared x coordinate rather than shared y coordinate. This could be done by iterating through the staff_contents field of each Staff and stopping if and when a StaffObject with the desired x coordinate is reached, but it seemed more convenient to maintain a horizontal line representation and a vertical line representation of the data simultaneously so that it can be accessed through whichever is most applicable to the task at hand. The idea, then, is to have a Vec<SystemSlice> where a SystemSlice represents a vertical line through the plane, and refers in some way to which StaffObjects belong to that line. It’s necessary to be able to mutate StaffObjects via at least one of these two representations. Does this indeed sound like a job for RefCell?

Yeah, Rc<RefCell<_>> would work well in that situation. Alternatively, your “vertical lines” could just map to indices that you can look up in the vector of “horizontal lines”. Depending on your use case, keeping those indices in sync might be easy or hard. If it’s hard, you could store all your StaffObjects in some other vector and have both “horizontal” and “vertical” lines hold indices into that storage vector. This last solution is somewhat like an Entity Component System (ECS).
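
To make the storage-vector idea concrete, here is one way it might be laid out. Everything here (the `Score` struct, its field names, the `x` field on StaffObject) is invented for the sketch and is not from your code:

```rust
#[derive(Debug)]
struct StaffObject {
    x: u32, // hypothetical field
}

struct Score {
    objects: Vec<StaffObject>,      // the single owner of every StaffObject
    staves: Vec<Vec<usize>>,        // "horizontal lines": indices into `objects`
    system_slices: Vec<Vec<usize>>, // "vertical lines": indices into `objects`
}

impl Score {
    fn new(staff_count: usize, slice_count: usize) -> Self {
        Score {
            objects: Vec::new(),
            staves: vec![Vec::new(); staff_count],
            system_slices: vec![Vec::new(); slice_count],
        }
    }

    // Stores the object once and registers its index in both views.
    fn add(&mut self, staff: usize, slice: usize, object: StaffObject) -> usize {
        let index = self.objects.len();
        self.objects.push(object);
        self.staves[staff].push(index);
        self.system_slices[slice].push(index);
        index
    }
}

fn main() {
    let mut score = Score::new(2, 3);
    score.add(0, 1, StaffObject { x: 10 });
    score.add(1, 1, StaffObject { x: 10 });

    // Mutate through the horizontal view...
    let first = score.staves[0][0];
    score.objects[first].x = 12;

    // ...and read the current state through the vertical view.
    for &i in &score.system_slices[1] {
        println!("{:?}", score.objects[i]);
    }
}
```

Because the views only hold `usize` values, there are no lifetimes to juggle and `objects` can keep growing. The price is that removing an object from `objects` would shift the indices, so this sketch only ever appends.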


The one reason I gravitated away from just storing indices is that the vertical line case is only relevant to StaffObjects whose object_type is the Duration variant - Staffs will contain StaffObjects with various variants in the object_type field, but SystemSlices will include only ones of type Duration. If I stored indices to StaffObjects in the SystemSlices, I would have to do an if let to access the fields of the Duration even though I already know that’s the variant it’s going to be. I figured storing references to the Durations themselves would let me get away without the redundant step. Of course, the actual runtime cost of the if let is presumably trivial, so I recognize that if avoiding it necessarily complicates the program significantly, it’s not worth it. I just wasn’t sure whether complication would be inevitable.

I would probably pay the cost of an “irrefutable” if let in some cases to avoid the cost of Rc<RefCell<>> in all cases.
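
For what it's worth, the `if let` can be hidden behind a small accessor so it only appears in one place. This is just a sketch: `as_duration`, `as_duration_mut`, `double_durations`, the `ticks` field and the `Barline` variant are all hypothetical names:

```rust
struct Duration {
    ticks: u32, // hypothetical field
}

enum StaffObjectType {
    Duration(Duration),
    Barline, // hypothetical second variant
}

struct StaffObject {
    object_type: StaffObjectType,
}

impl StaffObject {
    // Borrows the payload if this object is a Duration.
    fn as_duration(&self) -> Option<&Duration> {
        match &self.object_type {
            StaffObjectType::Duration(d) => Some(d),
            _ => None,
        }
    }

    // Same, but for mutation.
    fn as_duration_mut(&mut self) -> Option<&mut Duration> {
        match &mut self.object_type {
            StaffObjectType::Duration(d) => Some(d),
            _ => None,
        }
    }
}

// A SystemSlice here is just a list of indices into staff_contents.
fn double_durations(staff_contents: &mut [StaffObject], slice: &[usize]) {
    for &i in slice {
        // The "redundant" check: always true if the indices are kept in sync.
        if let Some(d) = staff_contents[i].as_duration_mut() {
            d.ticks *= 2;
        }
    }
}

fn main() {
    let mut staff_contents = vec![
        StaffObject { object_type: StaffObjectType::Duration(Duration { ticks: 4 }) },
        StaffObject { object_type: StaffObjectType::Barline },
        StaffObject { object_type: StaffObjectType::Duration(Duration { ticks: 2 }) },
    ];
    double_durations(&mut staff_contents, &[0, 2]);
    for object in &staff_contents {
        println!("{:?}", object.as_duration().map(|d| d.ticks));
    }
}
```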

That is basically exactly the kind of thing where ECS/Entity-Component-Systems excel!

"Duration" would be a component. Either StaffObject or Staff is your Entity.

It'll probably take quite a bit of re-architecting, but if you have multiple objects that hold different sub-objects in varying combinations, you have two options: pointer spaghetti, or ECS. And Rust basically rules out the pointer spaghetti...

So…instead of having StaffObject be a struct or enum that contains fields for all the kinds of data that are relevant to an object, it would just be an integer or something, with the guarantee that no two different objects will get a StaffObject of the same value. For each kind of data an object might have, there is a data structure (the component) where each entry consists of an instance of the data and the StaffObject value to which the instance belongs. Is that in the ballpark of what the architecture would look like?

Yup, that’s the rough idea.
Staff can be a newtype (sort-of alias) of a usize or a trivial wrapper struct with only a usize field.
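The “Entity” is nothing else but an index into your component storages, each of which is a Vec or HashMap containing one kind of “Component”. (Strictly speaking, the “Systems” in ECS are the functions that run over those storages.)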
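
A hand-rolled sketch of that idea, without any ECS crate. The `World` struct, the `Position` component and all field names are assumptions made for illustration:

```rust
use std::collections::HashMap;

// The entity is only an ID; it owns no data itself.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Entity(usize);

#[derive(Debug, PartialEq)]
struct Duration {
    ticks: u32, // hypothetical field
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Position {
    staff: usize, // which horizontal line (hypothetical)
    x: u32,       // which vertical line (hypothetical)
}

// One storage per component type, keyed by entity.
#[derive(Default)]
struct World {
    next_id: usize,
    positions: HashMap<Entity, Position>,
    durations: HashMap<Entity, Duration>,
}

impl World {
    // Hands out a fresh ID; every entity in this sketch gets a Position.
    fn spawn(&mut self, position: Position) -> Entity {
        let entity = Entity(self.next_id);
        self.next_id += 1;
        self.positions.insert(entity, position);
        entity
    }

    // The "vertical line" query: entities that have a Duration and sit at `x`.
    fn durations_at_x(&self, x: u32) -> Vec<Entity> {
        let mut found: Vec<Entity> = self
            .durations
            .keys()
            .copied()
            .filter(|e| self.positions.get(e).map_or(false, |p| p.x == x))
            .collect();
        found.sort_by_key(|e| e.0); // HashMap iteration order is unspecified
        found
    }
}

fn main() {
    let mut world = World::default();
    let note_a = world.spawn(Position { staff: 0, x: 10 });
    world.durations.insert(note_a, Duration { ticks: 4 });
    let barline = world.spawn(Position { staff: 0, x: 10 }); // no Duration component
    let note_b = world.spawn(Position { staff: 1, x: 10 });
    world.durations.insert(note_b, Duration { ticks: 2 });

    // Mutating a component needs no if let on an enum: only entities that
    // have a Duration appear in the durations storage at all.
    if let Some(d) = world.durations.get_mut(&note_a) {
        d.ticks = 8;
    }

    println!("{:?}", world.durations_at_x(10));
    println!("{:?}", world.durations.get(&barline));
}
```

A real ECS crate would use denser storages and faster joins than this linear scan, but the shape of the data is the same.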

I personally liked Kyren’s introduction (Long-Form blog of her RustConf keynote), which introduces the concept from first principles/first-annoyances-with-other-architectures:

https://kyren.github.io/2018/09/14/rustconf-talk.html

She/Chucklefish studio use it for their game, but the pattern applies everywhere you have flexibly composed “objects”.

ECS is the data-oriented alternative to object-oriented programming.
OOP uses a sea of objects in which everything holds a pointer to half the other things; this is more general, but every access needs pointer-chasing.
ECS reaches the same goal by storing the composable units in sparse matrices.

P.S. The standard solution for ECS in Rust that everyone seems to center on is SPECS (the specs crate).

