Structs with inner struct having reference to outer struct's other member

Hi. I would like some help with the following situation. I now see that it is a case of a self-referential struct, but I don't see how to solve it.

I have several kinds of operations and some operations can call other operations. These operations are grouped in an enum Ops. Each operation (enum variant) can take some data by reference to operate on. I am intentionally not passing ownership of the data as several operations work on the same data and I want to avoid cloning it.
Following is a small example where Foo and Bar are different operations and Bar uses Foo and passes to it some data that it owns (in init).

enum Ops<'a> {
  Foo(Foo<'a>),
  Bar(Bar<'a>),
}

struct Foo<'a> {
  pub slice: Option<&'a [usize]>
}

struct Bar<'a> {
  pub f: Option<Foo<'a>>,
  pub v: Vec<usize>
}

impl<'a> Foo<'a> {
  fn new() -> Self {
      Self {slice: None}
  }
  
  fn init(&mut self, slice: &'a [usize]) {
      self.slice = Some(slice);
  }
  
  fn finalize(self) {
      // finish doing Foo
  }
}

impl<'a> Bar<'a> {
  fn new() -> Self {
      Self {v: vec![], f: None}
  }

  fn init(&'a mut self) {
      self.v = vec![1,2,3];
      let mut f = Foo::new();
      f.init(&self.v);
      self.f = Some(f);
  }
 
  fn finalize(self) {
      // finish doing Foo
      // then finish doing Bar
  }
}

fn main() {
  let mut ops = Vec::<Ops>::new();
  let mut bar = Bar::new();
  bar.init();
  ops.push(Ops::Bar(bar));
}

I get the following error:

error[E0505]: cannot move out of `bar` because it is borrowed

bar.init();
   |   ---------- borrow of `bar` occurs here
 ops.push(Ops::Bar(bar));
   |                     ^^^
   |                     |
   |                     move out of `bar` occurs here
   |                     borrow later used here

The error makes sense as bar.f keeps the address of variable bar.v and that address is invalidated when bar itself is moved in the enum variant. Is there a way to fix this?

Another thing I don't get is that compilation still fails even when I don't create Foo inside Bar::init, as shown below.

impl<'a> Bar<'a> {
  ....

  fn init(&'a mut self) {
    self.v = vec![1,2,3];
  }
}

But if I remove the lifetime specifier from self in init, as below, it works:

impl<'a> Bar<'a> {
  ....

  fn init(&mut self) {
    self.v = vec![1,2,3];
  }
}

Update:

  1. The operations are stateful and multi-step, so Bar not owning Foo is not an option. For example, step 1 (init) of Bar needs Foo with its own step 1 (init) done, and in step 2 (finalize) of Bar, Bar's Foo will also do its step 2 (finalize).
  2. Each operation can invoke other operations by calling init and finalize.

There are two kinds of idiomatic solutions, and the best one depends on how you are calling these operations:

  • The operations borrow their data, and regenerate sub-operations on demand from owned data.
  • The operations store no data themselves; instead, the data is passed to them when they are called.

Here's an example of how you could do the former:

enum Ops<'a> {
    Foo(Foo<'a>),
    Bar(Bar),
}

struct Foo<'a> {
    pub slice: Option<&'a [usize]>,
}

struct Bar {
    pub have_f: bool,
    pub v: Vec<usize>,
}

impl<'a> Foo<'a> {
    fn new() -> Self {
        Self { slice: None }
    }

    fn init(&mut self, slice: &'a [usize]) {
        self.slice = Some(slice);
    }
}

impl Bar {
    fn new() -> Self {
        Self {
            v: vec![],
            have_f: false,
        }
    }

    fn init(&mut self) {
        self.v = vec![1, 2, 3];
        self.have_f = true;
    }

    fn get_f(&self) -> Option<Foo<'_>> {
        if !self.have_f {
            return None;
        }
        let mut f = Foo::new();
        f.init(&self.v);
        Some(f)
    }
}

fn main() {
    let mut ops = Vec::<Ops>::new();
    let mut bar = Bar::new();
    bar.init();
    ops.push(Ops::Bar(bar));
}

Here, we don't store f in Bar. Instead, we have a method get_f(), which generates a Foo object with the same lifetime as the &self borrow. However, if constructing a Foo is expensive, then this solution wouldn't be ideal. I can't give any more advice without more information.
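For the second option in the list above (operations store no borrowed data and receive it when called), here is a minimal sketch. It assumes Foo only needs the slice during init and finalize; the `sum` field and the returned numbers are made up for illustration and are not from the original post.

```rust
// Foo keeps only its own state; the slice is handed in on every call.
#[derive(Default)]
struct Foo {
    sum: usize, // hypothetical state computed in step 1
}

impl Foo {
    // Step 1: works on whatever slice the caller passes.
    fn init(&mut self, slice: &[usize]) {
        self.sum = slice.iter().sum();
    }

    // Step 2: the caller passes the data again.
    fn finalize(self, slice: &[usize]) -> usize {
        self.sum * slice.len()
    }
}

#[derive(Default)]
struct Bar {
    v: Vec<usize>,
    f: Foo,
}

impl Bar {
    fn init(&mut self) {
        self.v = vec![1, 2, 3];
        // Disjoint field borrows: `f` mutably, `v` immutably.
        self.f.init(&self.v);
    }

    fn finalize(self) -> usize {
        // Finish Foo first, then Bar.
        let Bar { v, f } = self;
        f.finalize(&v)
    }
}

#[allow(dead_code)]
enum Ops {
    Foo(Foo),
    Bar(Bar),
}

fn run() -> usize {
    let mut ops = Vec::<Ops>::new();
    let mut bar = Bar::default();
    bar.init();
    ops.push(Ops::Bar(bar)); // nothing borrows `bar`, so the move is fine
    match ops.pop() {
        Some(Ops::Bar(bar)) => bar.finalize(),
        _ => 0,
    }
}

fn main() {
    println!("{}", run());
}
```

Since no struct carries a lifetime parameter here, Bar can own Foo and still be moved into the vector.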

impl<'a> Bar<'a> {
  fn init(&'a mut self) {

init takes a &'a mut Bar<'a>. Lifetimes behind a &mut cannot be changed (e.g. automatically shortened at the call site). In combination with specifying that the outer borrow must have the same length (&'a mut), this causes the struct to be mutably (exclusively) borrowed for the rest of its lifetime. After calling init, the only way to interact with the struct again would be via some sort of return value of init.

So in main:

  let mut bar = Bar::new();
  bar.init();
  ops.push(Ops::Bar(bar));

When you try to move bar into Ops::Bar, you get a borrow check error -- bar is still exclusively borrowed.

Using &mut self instead of &'a mut self on init allows the outer borrow to be arbitrarily shorter than the lifetime of the struct, so this problem goes away. (It also prevents you from creating the self-referential struct, which is fine since you're trying to get away from that.)
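A reduced sketch of the difference, assuming a simplified Bar whose `f` field is just an optional slice standing in for `Option<Foo<'a>>`. The `build` function is a hypothetical helper, not part of the original code.

```rust
struct Bar<'a> {
    // Stands in for the `Option<Foo<'a>>` field of the original post.
    f: Option<&'a [usize]>,
    v: Vec<usize>,
}

impl<'a> Bar<'a> {
    fn new() -> Self {
        Self { f: None, v: vec![] }
    }

    // `&mut self` gets its own, short lifetime: the exclusive borrow only
    // has to last for the duration of the call.
    fn init(&mut self) {
        self.v = vec![1, 2, 3];
    }

    // Declaring `fn init(&'a mut self)` instead would demand a borrow of
    // type `&'a mut Bar<'a>`, one that lasts as long as the struct's own
    // lifetime parameter. The move in `build` below would then be rejected
    // with E0505.
}

fn build<'a>() -> Vec<Bar<'a>> {
    let mut ops = Vec::new();
    let mut bar = Bar::new();
    bar.init(); // the borrow ends when the call returns
    ops.push(bar); // so moving `bar` is allowed
    ops
}

fn main() {
    let ops = build();
    println!("{:?} {}", ops[0].v, ops[0].f.is_none());
}
```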


Thanks. However, this does not work for me as the operations are multi-step like for doing step 1 of Bar, it needs Foo and its step 1 done and in step 2 of Bar, Bar's Foo will also do its step 2.

I will edit my post with this info.

I suggest you make another reply with the changes instead of editing the original so that new people coming to the thread can get the context of the discussion.
If you prefer to edit the original then I suggest not modifying what's already written and instead add the changes on top (or bottom) of the message.

I changed the original post to indicate the updates.

You could separate out the owning structs from the borrowing hierarchy.
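A rough sketch of what that separation could look like. The `BarData` owner type and its `bar` method are names invented here for illustration, and the Options are dropped for brevity.

```rust
// Owns the data and lives outside the borrowing hierarchy.
struct BarData {
    v: Vec<usize>,
}

struct Foo<'a> {
    slice: &'a [usize],
}

struct Bar<'a> {
    f: Foo<'a>,
}

#[allow(dead_code)]
enum Ops<'a> {
    Foo(Foo<'a>),
    Bar(Bar<'a>),
}

impl BarData {
    fn new() -> Self {
        Self { v: vec![1, 2, 3] }
    }

    // Builds the borrowing operation from the owner.
    fn bar(&self) -> Bar<'_> {
        Bar { f: Foo { slice: &self.v } }
    }
}

impl<'a> Foo<'a> {
    fn finalize(self) -> usize {
        self.slice.iter().sum()
    }
}

impl<'a> Bar<'a> {
    fn finalize(self) -> usize {
        // Finish Foo, then Bar.
        self.f.finalize()
    }
}

fn run() -> usize {
    let data = BarData::new(); // the owner outlives everything borrowing from it
    let mut ops = Vec::new();
    ops.push(Ops::Bar(data.bar())); // `Bar` only borrows `data`, so it can move
    ops.into_iter()
        .map(|op| match op {
            Ops::Foo(f) => f.finalize(),
            Ops::Bar(b) => b.finalize(),
        })
        .sum()
}

fn main() {
    println!("{}", run());
}
```

The borrow now points from Bar into a separate value, so moving Bar around never invalidates it.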

I see what you're saying but in my case the data needed by Foo is created during Bar's init. Also, I have other Bar like operations that invoke Foo (and others) so creating a separate Root for each will be tedious. As a last resort, I could iterate over all operations in the ops vector and figure out what all data is needed and allocate that before making another pass over ops where I actually call init but I would like to avoid that.

This looks like a fairly bad case of the XY problem, and I have trouble transforming the original code in any way that does not trivialize it. You are using initialization and finalization as examples of reasons why you would need multiple stages of functions running on an object hierarchy; yet, complex initialization is a huge code smell in Rust.

In Rust, ideally, objects should only be constructed once the data exists to construct them. That is to say, initialization in Rust should occur in a single stage (when the value is created) except in exceptional circumstances (and any such case should be regarded with suspicion).

When separate stages of initialization are unavoidable, a common idiomatic solution is to divide it into separate types that each have one stage of initialization. Quinedot's Root sort of does this.
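As a sketch of the one-type-per-stage idea, with type names (`BarPending`, `BarStarted`, `BarFinished`) invented here for illustration: each stage consumes the previous value, so every value is fully valid from the moment it exists.

```rust
// Stage 0: nothing exists yet except the intent to run Bar.
struct BarPending;

// Stage 1: produced by `init`; no `Option` fields are needed.
struct BarStarted {
    v: Vec<usize>,
}

// Stage 2: produced by `finalize`.
struct BarFinished {
    total: usize,
}

impl BarPending {
    fn init(self) -> BarStarted {
        BarStarted { v: vec![1, 2, 3] }
    }
}

impl BarStarted {
    fn finalize(self) -> BarFinished {
        BarFinished {
            total: self.v.iter().sum(),
        }
    }
}

fn run() -> usize {
    BarPending.init().finalize().total
}

fn main() {
    println!("{}", run());
}
```

With this shape, calling finalize before init is a compile error rather than a runtime check.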

If data for one thing is created during the initialization of another, and you really don't want to expose this step, then one possible technique is to invert the control flow. Rather than returning a value:

impl Thing {
    pub fn new() -> Thing { ... }
}

// usage
let thing = Thing::new();
let result = do_stuff_to(&thing);

...you instead make a temporary and pass it into a callback.

impl Thing {
    pub fn scope<B>(callback: impl FnOnce(&Thing) -> B) -> B {
        let thing = { ... };
        callback(&thing)
        // the temporary `thing` gets dropped at the end, but that's okay
        // because the callback's return value (`B`) is not allowed to borrow from it
    }
}

// usage
let result = Thing::scope(|thing| {
    do_stuff_to(thing)
});

The nice thing about this is that it even gives an obvious place to put your "finalization".
Here's an example doing this with your types. I removed the Options because they no longer seemed to serve a role (it looked like they were there to enable two-phase initialization which is no longer necessary); but there's nothing preventing them from being added back.
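A minimal sketch of the scope strategy with Foo and Bar follows. It is an assumption of how such code could look, not necessarily the exact code this post refers to; the `sum` method and `run` helper are made up for illustration.

```rust
struct Foo<'a> {
    slice: &'a [usize],
}

struct Bar<'a> {
    f: Foo<'a>,
}

impl<'a> Foo<'a> {
    fn sum(&self) -> usize {
        self.slice.iter().sum()
    }
}

impl Bar<'_> {
    // The callback must accept a `Bar` with any lifetime, so its result `B`
    // cannot hold on to anything borrowed from the locals below.
    fn scope<B>(callback: impl FnOnce(&Bar<'_>) -> B) -> B {
        let v = vec![1, 2, 3]; // data created during Bar's setup
        let bar = Bar { f: Foo { slice: &v } }; // built in one step
        let result = callback(&bar);
        // "Finalization" of Foo and then Bar would go here,
        // before `bar` and `v` are dropped.
        result
    }
}

fn run() -> usize {
    Bar::scope(|bar| bar.f.sum())
}

fn main() {
    println!("{}", run());
}
```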

However, you will need to consider why the original code had a Vec<Ops>. The fn scope strategy is intended for creating a fixed DAG of structures. If, on the other hand, you are dynamically constructing a vector of Ops, then this strategy introduces the risk of stack overflow, since each additional operation nests one more callback.


More simply, you can also try using reference counting to build tree- and graph-like structures, but you will need to deal with problems of interior mutability (e.g. using runtime borrow-checking a la RefCell) and care must be taken about creating ref-cycles.
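A small sketch of the reference-counting route, assuming the shared data is read-only after it is created (so no RefCell is needed here). Swapping the slice reference for `Rc<[usize]>` is a choice made for this sketch; the return values are invented for illustration.

```rust
use std::rc::Rc;

struct Foo {
    slice: Option<Rc<[usize]>>,
}

struct Bar {
    f: Option<Foo>,
    v: Rc<[usize]>,
}

#[allow(dead_code)]
enum Ops {
    Foo(Foo),
    Bar(Bar),
}

impl Foo {
    fn new() -> Self {
        Self { slice: None }
    }

    fn init(&mut self, slice: Rc<[usize]>) {
        self.slice = Some(slice);
    }

    fn finalize(self) -> usize {
        self.slice.map_or(0, |s| s.iter().sum::<usize>())
    }
}

impl Bar {
    fn new() -> Self {
        Self { f: None, v: Rc::from(Vec::<usize>::new()) }
    }

    fn init(&mut self) {
        self.v = Rc::from(vec![1usize, 2, 3]);
        let mut f = Foo::new();
        f.init(Rc::clone(&self.v)); // bumps the count; the data is not copied
        self.f = Some(f);
    }

    fn finalize(self) -> usize {
        // Finish Foo, then Bar.
        let foo_result = self.f.map_or(0, Foo::finalize);
        foo_result + self.v.len()
    }
}

fn run() -> usize {
    let mut ops = Vec::<Ops>::new();
    let mut bar = Bar::new();
    bar.init();
    ops.push(Ops::Bar(bar)); // no lifetimes involved, so the move is fine
    match ops.pop() {
        Some(Ops::Bar(bar)) => bar.finalize(),
        _ => 0,
    }
}

fn main() {
    println!("{}", run());
}
```

If the operations had to mutate the shared data, the `Rc<[usize]>` would need to become something like `Rc<RefCell<Vec<usize>>>`, with the runtime borrow checks mentioned above.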

