Maintaining two orderings of the same data

dannygale · December 6, 2020, 9:15pm

I'm fairly new to rust and have hit this conceptual wall with the compiler a couple of times. I'm sure this question and variants of it have been asked a hundred times, but I can't find the magic search terms. The short story is that I need two containers in a struct that keep references to the same set of data, but in different order. I can't figure out how to make the borrow checker happy.

I have this struct:

struct Node {
  size: usize,
  addr: usize
}

I need to keep these organized by both size and by addr. For the container, I started with what I thought would be simplest, a Vec. I'm pretty sure I the content of the Vec needs to be Rc<RefCell<Node>> because I need to reference the same Node from more than one Vec -- is this understanding right?

use std::rc::Rc;
use std::cell::RefCell;

struct VecList {
    by_addr: Vec<Rc<RefCell<Node>>>,
    by_size: Vec<Rc<RefCell<Node>>>
}

impl VecList {
    fn new(size: usize) -> Self {
        let mut s = Self {
            by_size: Vec::new(),
            by_addr: Vec::new()
        };
        let new_node = Rc::new(RefCell::new(Node { size, addr: 0 } ));
        s.by_size.push(Rc::clone(&new_node));
        s.by_addr.push(Rc::clone(&new_node));
        s
    }

    fn do_a_thing(&mut self, sz: usize) {
        // find the node to do the thing to
        let i = self.by_size.binary_search_by(
            |node| node.borrow().size.cmp(&sz))
            .unwrap_or_else(|e| e);
        
        let rcnode = &mut self.by_size[i];
        let mut node = rcnode.borrow_mut();
        
        // do the thing
        node.size -= sz;
        node.addr += sz;
        
        if node.size == 0 {
            self.remove_node(rcnode);
        }
        
    }
    
    fn remove_node(&mut self, _node: &Rc<RefCell<Node>>) {
        // remove the node from both by_addr and by_size, letting the rc go to 0 and the node to be dropped
        unimplemented!();
    }
}

(Playground)

Errors:

error[E0499]: cannot borrow `*self` as mutable more than once at a time
  --> src/lib.rs:41:13
   |
33 |         let rcnode = &mut self.by_size[i];
   |                           ------------ first mutable borrow occurs here
...
41 |             self.remove_node(rcnode);
   |             ^^^^             ------ first borrow later used here
   |             |
   |             second mutable borrow occurs here

I think I understand this error as: I have a &mut self in the do_a_thing method, then I try to take another &mut self when I look up rcnode. That basically makes sense.

What I can't figure out is how to avoid this as a design pattern. I've tried adjusting by_size and by_addr to be Boxed, but with the same error. This doesn't seem like it should be hard, which makes me think I'm missing a core rust concept here. Can anybody point me in the right direction?

RustyYato · December 6, 2020, 9:55pm

A few things to note here,

To answer your question directly, you are aliasing a unique pointer (&mut T), (to read more about why I'm calling this a unique pointer, see Rust: A Unique Perspective)

The way to fix this is by cloning the Rc and passing the clone to remove_node

You don't need Rc<RefCell<_>>

But reworking your code to not use Rc<RefCell<_>> is a bit more involved,

Here's the final product, playground

Let's go through this step by step

struct VecList {
    store: Vec<Node>,
    by_addr: Vec<usize>,
    by_size: Vec<usize>,
}

Here's the most important difference, instead of storing your shared Node in Rc, put them into a new Vec<_>, and only reference them by indices (usize here).

You can see this in action in VecList::do_a_thing

let store = self.store.as_mut_slice();
let node_index = self.by_size.binary_search_by_key(
    &sz,
    |&node_index| store[node_index].size
).unwrap_or_else(|e| e); // << probably not what you want to do

Here we first take a reference to the store slice, so that we can access it inside the closure, then we use binary_search_by_key, and use the size field on the corresponding Node in store (store[node_index].size). This allows you to "share" data without resorting to RefCell and friends, and by using the Rust compiler to verify your code.

We get an index out of binary_search_by_key, and we use that to do some work, just the same as before, but now we don't have RefCell::borrow_mut, because we don't have a RefCell

let node = &mut store[node_index];
node.do_thing();

Finally, we have remove_node, which now takes an index to the node we need to remove. You can then use this index to check which nodes to remove.

Now, on the note of removing elements, you can't just blindly remove items from store, because that would invalidate a lot of indices. The way around this is to use a Slab, like slab::Slab.

updated playground with slab

Slab allows you to remove items from it without invalidating existing indices, so that should make it a lot easier to implement remove_node

dannygale · December 6, 2020, 11:11pm

Thank you, this is really helpful. I guess I misunderstood the issue with the nested &mut self references -- it's not the fact that I'm using a &mut self call inside a method that is &mut self, but rather that I'm passing a &mut self reference into a method that's also &mut self?

A few follow-ups:

To answer your question directly, you are aliasing a unique pointer ( &mut T ) ... The way to fix this is by cloning the Rc and passing the clone to remove_node

I can't quite figure out how to make this work.

You don't need Rc<RefCell<_>>

Isn't this exactly what Rc<RefCell<_>> is for?

Now, on the note of removing elements, you can't just blindly remove items from store , because that would invalidate a lot of indices. The way around this is to use a Slab , like slab::Slab

This is the problem that led me to use references instead of indices in the first place. Am I just thinking too hard about this, and the solution is to pass indices around instead? It seems like this leads to an unnecessary layer of abstraction with the slab.

RustyYato · December 7, 2020, 1:10pm

Yes, exactly

First change the signature of remove_node to

                             // \/ no reference
fn remove_node(&mut self, _node: Rc<RefCell<Node>>) {
    // remove the node from both by_addr and by_size, letting the rc go to 0 and the node to be dropped
    unimplemented!();
}

Then you can do something like

let rcnode = self.by_size[i].clone(); // << clone
let mut node = rcnode.borrow_mut();

// do the thing
node.size -= sz;
node.addr += sz;

if node.size == 0 {
    self.remove_node(rcnode);
}

And you won't have any borrowing errors.

That's a good question, it's for shared ownership, but most of the time you can get away wit h a Vec<_> or Slab<_> and indices. There are some cases where this doesn't work (like if you need to share values between different types), but those are relatively rare.

Yeah, that's the downside of using Slab, no solution is perfect, but this is the best solution we have for now. This talk gives some other reasons to use this pattern.

system · March 7, 2021, 1:11pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Circular data structures in rust help	8	1722	April 10, 2020
Multiple mutable references from vector content... how?	6	1585	May 30, 2023
[How to] : Data Structure with inside link and mutability help	5	475	June 27, 2020
Implement m:n relationship between structs help	4	504	May 20, 2021
Lifetime & Borrowing inside a struct method help	3	465	July 4, 2021

Maintaining two orderings of the same data

Related topics