Modeling threaded messages: cannot borrow data in an `Arc` as mutable

Hi there,

I am writing on a software that deals with threaded messages, so basically I have a thread and each thread has messages, and each message may have children. I tried to model it like this:

use std::{
    collections::HashMap,
    sync::{Arc, Weak},
};

#[derive(Debug)]
pub struct Thread {
    pub messages: Vec<Arc<Message>>,
    pub root: Weak<Message>,
}

#[derive(Clone, Debug, sqlx::FromRow)]
pub struct Message {
    pub id: i64,
    pub parent_id: Option<i64>,

    #[sqlx(skip)]
    pub children: Vec<Weak<Message>>,
}

Now I want to write a function that builds the message tree from the flat list of messages:

impl Thread {
    pub fn build_tree(&mut self) {
        let mut index = HashMap::new();

        for message in &mut self.messages {
            index.insert(message.id, Arc::clone(message));

            if let Some(parent_id) = message.parent_id {
                if let Some(parent) = index.get_mut(&parent_id) {
                    parent.children.push(Arc::downgrade(message));
                }
            }
        }

        self.root = Arc::downgrade(&self.messages[0]);
    }
}

But I the compiler throws an error:

error[E0596]: cannot borrow data in an `Arc` as mutable
  --> src/threads.rs:23:21
   |
23 |                     parent.children.push(Arc::downgrade(message));
   |                     ^^^^^^^^^^^^^^^ cannot borrow as mutable
   |
   = help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `Arc<messages::Message>`

I sprinkeled mut annotations everywhere in the hope to fix it, but to be honest I don't really understand the problem. I guess I could solve this with a Arc<Mutex<Message>> instead of a Arc<Message>, but I'd like to avoid that: there will be no concurrent access to the data, so I don't think I need locking and it will just cost me performance.

How can I model this correctly?

Best regards,
CK

You can get mutable access to the value of an Arc through Arc::get_mut, if the reference counter is exactly 1 (only one strong reference to the data). I don't know if that is the case in your example.

Weak references will also prevent get_mut, because they can be upgraded at any time. So the parent pointer makes this infeasible here.


One option would be to use one of the cell types from the qcell crate in place of a Mutex, which provide more fine-grained compile-time access control than plain & and &mut.

Note sure, I read quickly, but I suggest

pub children: Mutex<Vec<Weak<Message>>>

I think you only need to lock children once in your function.

I would use vector indexes rather than Weak references to identify the children. I stay away from Rc and Arc (note that you really don't need Arc here since there is no access from multiple threads) unless I really need them, since this avoids problems with Rust's ownership rules. The same thing applies, even more, to the use of regular references (& and &mut) in data structures.

The code is also easier for me to follow because I don't have to think about breaking cycles with Weak or whether I cause a runtime error due to Arc/Rc runtime checks. Of course you can still cause runtime errors by using an incorrect index, so perhaps that's a wash.

A big benefit for me is that each element doesn't need to be boxed (by Rc or Arc) which reduces allocations. I'd use indexes for that reason alone.

Here's one way to do it with indexes. The main drawback is that an extra parameter is needed since functions that operate on messages will often need access to the vector of messages as well. In those cases you can make the self parameter either the message or the thread; I've made self the thread below.

use std::collections::HashMap;

#[derive(Debug)]
pub struct Thread {
    pub messages: Vec<Message>,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Message {
    pub id: i64,
    pub parent_id: Option<i64>,
    pub children: Vec<usize>,
}

impl Thread {
    pub fn build_tree(&mut self) {
        let mut id_to_idx = HashMap::new();

        for idx in 0..self.messages.len() {
            let Message { id, parent_id, .. } = self.messages[idx];
            id_to_idx.insert(id, idx);

            if let Some(parent_id) = parent_id {
                if let Some(&parent_idx) = id_to_idx.get(&parent_id) {
                    self.messages[parent_idx].children.push(idx);
                }
            }
        }
    }

    fn root(&self) -> &Message {
        self.message(0)
    }

    fn message(&self, idx: usize) -> &Message {
        &self.messages[idx]
    }

    fn children<'a>(
        &'a self,
        parent: &'a Message,
    ) -> impl Iterator<Item = &'a Message> + 'a {
        parent.children.iter().map(|&idx| self.message(idx))
   }
}
3 Likes

I need Arc because of async, Rc isn't Send.

Your approach is a pretty good idea, thank you very much. I will follow this route.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.