How do you create appropriate references in safe Rust for this case?

Let's assume I've got an aggregator struct, which has a vector of references to objects ("workers"), the methods of which need to be called on demand. The catch is: these workers also need to have a reference to the aggregator, which they must be able to call once the work is done. For instance:

struct A<'w> {
  v: Vec<&'w W<'w>>
}

impl<'w> A<'w> {
  fn do_some_work(&self) {
    for w in &self.v {
      w.work();
    }
  }
  fn finish(&self) {
    println!("done!");
  }
}

struct W<'a> {
  data: i32,
  agg: &'a A<'a>
}

impl<'a> W<'a> {
  fn work(&self) {
    println!("working with {}", self.data);
    self.agg.finish();
  }
}

This seems intuitively correct, but turns out to be impossible (?) to implement in reality:

fn main() {
  let mut a = A { v: Vec::new() };
  let w = W { data: 101, agg: &a };
  a.v.push(&w); // nope: cannot mutate `a` while `w` holds an immutable borrow of it
}

What am I doing wrong? Which constructs can be used to make such things possible? Can it be done without using any Cell-like types for the aggregator (such as Cell<Vec<&W>>)? How would you do it?

Having each worker hold a back-reference to the aggregator that holds it is mostly an anti-pattern in Rust.

I'd personally refactor the code as:

struct Aggregator {
    workers: Vec<Worker>,
}

impl Aggregator {
    fn do_some_work(&self) {
        for worker in &self.workers {
            worker.work(self);
        }
    }

    fn finish(&self) { … }
}

struct Worker {
    data: i32,
    /* no back-ref to the aggregator */
}

impl Worker {
    fn work(&self, agg: &Aggregator) {
        println!("working with {}", self.data);
        agg.finish();
    }
}

If you also need to retain ownership of the workers independently of the aggregator, a simple solution is shared ownership of the workers, which is most easily achieved with the ref-counting smart pointers of the standard library:

+ type WorkerRef = Arc<Worker>;

  struct Aggregator {
-     workers: Vec<Worker>,
+     workers: Vec<WorkerRef>,
  }
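
A minimal runnable sketch of that shared-ownership variant, for illustration only. It assumes the work(&self, agg: &Aggregator) signature from the refactor above and borrows the printing body of finish from the original question; the main function is just a hypothetical usage:

```rust
use std::sync::Arc;

struct Worker {
    data: i32,
}

impl Worker {
    fn work(&self, agg: &Aggregator) {
        println!("working with {}", self.data);
        agg.finish();
    }
}

struct Aggregator {
    // Shared ownership: the aggregator holds one handle per worker,
    // and other code may hold further handles to the same workers.
    workers: Vec<Arc<Worker>>,
}

impl Aggregator {
    fn do_some_work(&self) {
        for worker in &self.workers {
            worker.work(self);
        }
    }

    fn finish(&self) {
        println!("done!");
    }
}

fn main() {
    let worker = Arc::new(Worker { data: 101 });
    // Arc::clone only bumps the reference count; the Worker is not copied.
    let agg = Aggregator { workers: vec![Arc::clone(&worker)] };
    agg.do_some_work();
    // The original handle is still usable independently of the aggregator.
    println!("still own worker {}", worker.data);
}
```

Since nothing here crosses threads, Rc would work just as well as Arc in this sketch.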

If you really need the back-reference to the aggregator, then you'd also need to use ref-counting, but carefully, so as not to create a cycle of ownership, which would leak memory: the back-reference should then be a weak ref-counted smart pointer, such as ::std::sync::Weak, which can be obtained by doing Arc::downgrade(reference) instead of Arc::clone(reference).
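
One way this could look, as a sketch rather than the definitive design: here I assume the aggregator owns its workers directly and is itself built with Arc::new_cyclic, which hands out a Weak to the value under construction, so no Cell-like type is needed. The constructor new and the usize return value of do_some_work are my own additions for illustration:

```rust
use std::sync::{Arc, Weak};

struct Aggregator {
    workers: Vec<Worker>,
}

struct Worker {
    data: i32,
    // Weak back-reference: it does not keep the aggregator alive,
    // so there is no ownership cycle and nothing leaks.
    agg: Weak<Aggregator>,
}

impl Aggregator {
    // Hypothetical constructor: builds the aggregator and its workers together.
    fn new(data: &[i32]) -> Arc<Aggregator> {
        Arc::new_cyclic(|weak| Aggregator {
            workers: data
                .iter()
                .map(|&d| Worker { data: d, agg: Weak::clone(weak) })
                .collect(),
        })
    }

    // Returns how many workers managed to report back.
    fn do_some_work(&self) -> usize {
        self.workers.iter().filter(|w| w.work()).count()
    }

    fn finish(&self) {
        println!("done!");
    }
}

impl Worker {
    // Returns true if the aggregator was still alive and got notified.
    fn work(&self) -> bool {
        println!("working with {}", self.data);
        match self.agg.upgrade() {
            Some(agg) => {
                agg.finish();
                true
            }
            None => false,
        }
    }
}

fn main() {
    let agg = Aggregator::new(&[101, 102]);
    let notified = agg.do_some_work();
    println!("{notified} workers reported back");
}
```

The price of the Weak is the upgrade() call on every use: the worker has to handle the case where the aggregator is already gone.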

As you can see, ref-counting is no panacea, and things can become a bit hairy with .clone()s everywhere when it is overused, but used in moderation it can be quite powerful and convenient. The fully scalable approach to these things would be to use generational indices into some entity (a struct of arrays) which is the one actually owning all the data: such indices are Copy, and have no lifetimes attached to them. See this great video about this topic:
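
To give a rough idea of the index-based approach, here is a deliberately tiny sketch. All the names (WorkerPool, WorkerId, Slot) are hypothetical, and a real implementation would keep a free list instead of scanning for an empty slot:

```rust
// A handle is just two integers: freely copyable, no lifetime attached.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct WorkerId {
    index: usize,
    generation: u32,
}

struct Worker {
    data: i32,
}

struct Slot {
    generation: u32,
    worker: Option<Worker>,
}

// The pool is the single owner of every worker.
struct WorkerPool {
    slots: Vec<Slot>,
}

impl WorkerPool {
    fn new() -> Self {
        WorkerPool { slots: Vec::new() }
    }

    fn insert(&mut self, worker: Worker) -> WorkerId {
        let free = self.slots.iter().position(|s| s.worker.is_none());
        match free {
            Some(index) => {
                // Reusing a slot bumps its generation, which invalidates old ids.
                let slot = &mut self.slots[index];
                slot.generation += 1;
                slot.worker = Some(worker);
                WorkerId { index, generation: slot.generation }
            }
            None => {
                self.slots.push(Slot { generation: 0, worker: Some(worker) });
                WorkerId { index: self.slots.len() - 1, generation: 0 }
            }
        }
    }

    fn remove(&mut self, id: WorkerId) -> Option<Worker> {
        let slot = self.slots.get_mut(id.index)?;
        if slot.generation == id.generation { slot.worker.take() } else { None }
    }

    // A stale id (wrong generation) yields None instead of the wrong worker.
    fn get(&self, id: WorkerId) -> Option<&Worker> {
        let slot = self.slots.get(id.index)?;
        if slot.generation == id.generation { slot.worker.as_ref() } else { None }
    }
}

fn main() {
    let mut pool = WorkerPool::new();
    let first = pool.insert(Worker { data: 101 });
    pool.remove(first);
    let second = pool.insert(Worker { data: 202 });
    // `first` and `second` share a slot, but only `second` still resolves.
    println!("{:?} {:?}", pool.get(first).map(|w| w.data), pool.get(second).map(|w| w.data));
}
```

A worker that needs to refer to another worker, or to anything else in the pool, would store a WorkerId rather than a reference, and whoever drives the work passes the pool in.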

Finally, if you really want to go with the self-borrowing approach, then I recommend you use:

which will be a powerful but complex use of (higher-order) lifetimes to express the actual semantics of self-borrowing.

This looks like a self-referential struct, which is generally not possible to construct in Rust.

At times like this I really wish some things were just a little bit more straightforward. Many thanks, regardless.

I hear you. This kind of thing is why Rust can be so daunting. It's one thing to read about lifetimes and ownership; it's another to implement the concepts.

What can help is recognizing that Rust's borrowing rules push us to have cleaner models of ownership in our code. Learning how to adapt our code to make Rust happy doesn't just get our code to run. It also teaches us how to architect our software better in general. We can apply these lessons to other languages.

In this program you're having difficulties for two basic reasons:

  1. Ownership is unclear due to the overuse of references. The aggregator refers to workers, workers refer to the aggregator... who owns either of them? Yandros addressed this really well. Your life is easier if you figure out who owns what. The simplest thing is for the aggregator to own the workers, not just know about them. That means the references go away and, with them, half of the lifetime problems.

  2. Aggregators and workers are tightly coupled to each other. Do they both mutually need to know about the other? If you take it as a given then you end up going down the Arc/Weak/etc. rabbit hole where you try to maintain a bidirectional coupling.

Yandros shows that you can get rid of the permanent &Aggregator reference by having the aggregator pass &self when it calls Worker::work. I'd suggest taking it one step further and making the link purely unidirectional. If work took a generic callback function instead of &Aggregator, then it wouldn't need to know about the aggregator at all. It could be given a task by anybody.

impl Aggregator {
    fn do_some_work(&self) {
        for worker in &self.workers {
            worker.work(|| self.finish());
        }
    }

    fn finish(&self) { … }
}

impl Worker {
    fn work(&self, finished: impl FnOnce()) {
        println!("working with {}", self.data);
        finished();
    }
}

This decoupling would be a good idea in other languages, even when they don't have a problem with mutually-referencing types.
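
To illustrate the "given a task by anybody" point, here is a small sketch where the caller is not an aggregator at all. The free function run_all and its counter are hypothetical, made up for this example:

```rust
struct Worker {
    data: i32,
}

impl Worker {
    // The worker only knows it must call `finished` once it is done.
    fn work(&self, finished: impl FnOnce()) {
        println!("working with {}", self.data);
        finished();
    }
}

// Hypothetical caller with no Aggregator in sight: it counts completions.
fn run_all(workers: &[Worker]) -> usize {
    let mut finished_count = 0;
    for worker in workers {
        // A fresh closure per call; each one mutably borrows the counter briefly.
        worker.work(|| finished_count += 1);
    }
    finished_count
}

fn main() {
    let workers = [Worker { data: 101 }, Worker { data: 102 }];
    println!("{} workers finished", run_all(&workers));
}
```

The same Worker type works unchanged with the Aggregator version above, since || self.finish() is just another closure.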

I know for me it helps tamp down my frustration level if I reframe things from "Rust is such a pain in the neck" to "Rust is teaching me a lesson. It'll make me a better programmer if I figure out why it doesn't like what I'm doing." It doesn't always help, but it's at least what I try to remind myself when I notice myself lapsing into "God, I don't care!" mode.
