I’m working with a bit of code where I want to store references between objects and weak back-references. At first I had it working with Rc and RefCell types, but now I realize I’m A) sending these objects across threads in the GUI, and B) going to eventually want to operate on the logical Vec<T> in parallel.
Enter Arc and Mutex, which I’ve effectively swapped in for Rc and RefCell one-to-one.
Now I’m wondering if there’s a better way to do this while preserving the ability to mutate the Vec and also mutate the inner T objects. I’m sorta guessing I’m stuck with this, and if so, I’m left wondering whether there are patterns I can use to make this less of a mess of unwrap() and PoisonError handling code.
It's a bad idea to have lots of Mutexes without a specific plan for how you use them, because two threads could try to lock two mutexes in the opposite order and deadlock with each other. This is one of the ways in which concurrent programming is fundamentally different from sequential programming.
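For example, if one thread takes two locks in the order a then b while another takes them b then a, each can end up holding one lock while waiting forever for the other. A contrived sketch (nothing here is from your code):

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Two unrelated mutexes with no agreed-upon locking order.
    let a = Arc::new(Mutex::new(0u32));
    let b = Arc::new(Mutex::new(0u32));

    let (a1, b1) = (Arc::clone(&a), Arc::clone(&b));
    let t1 = thread::spawn(move || {
        let _ga = a1.lock().unwrap(); // thread 1 locks a first...
        let _gb = b1.lock().unwrap(); // ...then b
    });

    let (a2, b2) = (Arc::clone(&a), Arc::clone(&b));
    let t2 = thread::spawn(move || {
        let _gb = b2.lock().unwrap(); // thread 2 locks b first...
        let _ga = a2.lock().unwrap(); // ...then a: may deadlock against thread 1
    });

    t1.join().unwrap();
    t2.join().unwrap();
}

The usual cure is a globally consistent lock order, which is exactly the kind of plan that gets hard to maintain once Mutexes are sprinkled throughout the application.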
You should try to structure your program in a way which has fewer Mutexes (or at least, Mutexes more encapsulated inside some other abstraction distant from your application code) and fewer back-references requiring them. There are good reasons sometimes to have all of the things you mentioned, but if you have them everywhere you’re going to have a fragile program.
If the purpose of all this is to allow the user interface to manipulate objects, consider using channels instead — rather than having, say, a button callback that locks a mutex and performs a mutation, have it send a message on a channel that will be delivered to the owner of the application state, which can then use ordinary borrowing to reach into the right part of the state. (Some Rust GUI frameworks have messages/events built into their core so that you can do this without setting it up yourself.)
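A minimal sketch of that shape, assuming std::sync::mpsc and a made-up AppEvent enum; a real GUI framework would supply its own event loop, so the callback wiring here is only simulated:

use std::sync::mpsc;

// Messages describe what the UI wants done, not how to do it.
enum AppEvent {
    RenameRecord { index: usize, new_name: String },
    SaveAll,
}

struct Record {
    name: String,
}

struct AppState {
    records: Vec<Record>,
}

impl AppState {
    // The single owner of the state applies events with plain &mut borrows.
    fn handle(&mut self, event: AppEvent) {
        match event {
            AppEvent::RenameRecord { index, new_name } => {
                if let Some(record) = self.records.get_mut(index) {
                    record.name = new_name;
                }
            }
            AppEvent::SaveAll => { /* persist everything here */ }
        }
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();

    // A button callback only needs a cheap clone of the Sender.
    let on_click = {
        let tx = tx.clone();
        move || tx.send(AppEvent::RenameRecord { index: 0, new_name: "renamed".into() }).unwrap()
    };
    on_click();

    // The owner of the state drains the channel, e.g. once per event-loop turn.
    let mut state = AppState { records: vec![Record { name: "first".into() }] };
    while let Ok(event) = rx.try_recv() {
        state.handle(event);
    }
    assert_eq!(state.records[0].name, "renamed");
}

The callbacks only ever touch a Sender, so none of the state they reach for has to live behind Arc<Mutex<...>>.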
If the purpose of all this is to allow the user interface to manipulate objects, consider using channels instead
I already use a channel to communicate, but when I tried to do something like this:
self.rt.spawn(async {
    let lot = Record::new(...).save().await.unwrap();
})
I get an error like: NonNull<Vec<Rc<RefCell<Record>>>> cannot be sent between threads safely within {async block@src...
Without async I understand how I’d keep things local, but here I’m not sure.
At the end of the day, I’m trying to make this structure: a Lot which holds a collection of Record structures, each of which can independently access its owning Lot. The Lot has an encryption key which the Records need to encrypt themselves. Each object is backed by a table in an SQLite database, so it’s nice to have them be somewhat independent, though now I’m thinking all this work trying to make the Record able to act on its own may not be worth it, as much as I’d like it to.
Perhaps I just go for Lot::create_record(&self) and Lot::update_record(&self, record: Record) instead of Lot::new_record(&self).save(). Then in the GUI I’ll have records assigned to each row, and when they change, the containing Lot will take in the structure and call update_record on it.
Avoiding Rc/Arc types and interior mutability would be nice, but having independence for my objects would really be nice too.
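Roughly what I’m picturing, just as a sketch (the fields and the id scheme are made up, and I’ve used &mut self since the point would be to drop the interior mutability):

struct Lot {
    key: Vec<u8>, // stand-in for the encryption key
    records: Vec<Record>,
}

#[derive(Clone)]
struct Record {
    id: u64,
    body: String,
}

impl Lot {
    // The Lot owns the key, so it does the encrypting and the INSERT.
    fn create_record(&mut self, body: String) -> u64 {
        let id = self.records.len() as u64 + 1; // placeholder id scheme
        // encrypt `body` with self.key and INSERT into SQLite here
        self.records.push(Record { id, body });
        id
    }

    // The GUI hands back a whole Record; the Lot finds it and replaces it.
    fn update_record(&mut self, updated: Record) {
        if let Some(existing) = self.records.iter_mut().find(|r| r.id == updated.id) {
            // re-encrypt with self.key and UPDATE the row here
            *existing = updated;
        }
    }
}

fn main() {
    let mut lot = Lot { key: vec![0; 32], records: Vec::new() };
    lot.create_record("hello".into());
    let mut edited = lot.records[0].clone();
    edited.body = "edited".into();
    lot.update_record(edited);
    assert_eq!(lot.records[0].body, "edited");
}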
Designing around the assumption that you can mutate any object from any thread gets messy, in addition to having obvious overhead.
Alternatives unfortunately require rethinking the architecture, and there aren't any simple, obvious answers, especially for GUIs.
For example, you could have queues for modifications to be applied. Split processing into read-only multi-threaded gathering of things to change, and a separate update stage that applies the changes. That’s how ECS in games works. That’s how egui kinda works.
The queue could be implemented as an mpsc channel.
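For instance, a minimal two-phase sketch with plain std threads and std::sync::mpsc (the Update struct and the "double every multiple of 7" rule are made up):

use std::sync::{mpsc, Arc};
use std::thread;

// A change decided during the read-only pass, to be applied later.
struct Update {
    index: usize,
    new_value: i64,
}

fn main() {
    let data: Arc<Vec<i64>> = Arc::new((0..100).collect());
    let (tx, rx) = mpsc::channel();

    // Phase 1: read-only, multi-threaded gathering of things to change.
    let mut workers = Vec::new();
    for chunk in vec![0..50usize, 50..100] {
        let tx = tx.clone();
        let data = Arc::clone(&data);
        workers.push(thread::spawn(move || {
            for i in chunk {
                if data[i] % 7 == 0 {
                    tx.send(Update { index: i, new_value: data[i] * 2 }).unwrap();
                }
            }
        }));
    }
    drop(tx); // drop the original Sender so rx sees the channel close once the workers finish
    for worker in workers {
        worker.join().unwrap();
    }

    // Phase 2: a single owner applies the queued changes with plain &mut access.
    let mut data = Arc::try_unwrap(data).expect("workers have dropped their Arcs by now");
    for update in rx {
        data[update.index] = update.new_value;
    }
    println!("{:?}", &data[..10]);
}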
Or, if you can, split the data into multiple independent islands, so that threads can be given exclusive mutable access to part of the data.
Or sometimes you can flip the problem inside out, and change data processing from scatter to gather. In GPU programming this is often necessary. Reading sequentially and updating multiple arbitrary locations requires locking, but writing sequentially while reading from multiple locations can be parallelized without locking.
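For a concrete sense of the gather shape (a sketch that assumes the rayon crate, which isn't in your project; the 3-element average is just a stand-in computation):

use rayon::prelude::*;

// Gather: each output element reads whatever inputs it needs,
// but writes only its own slot, so no locking is required.
fn gather_step(input: &[f32]) -> Vec<f32> {
    (0..input.len())
        .into_par_iter()
        .map(|i| {
            let left = if i > 0 { input[i - 1] } else { 0.0 };
            let right = if i + 1 < input.len() { input[i + 1] } else { 0.0 };
            (left + input[i] + right) / 3.0
        })
        .collect()
}

fn main() {
    let input: Vec<f32> = (0..8).map(|x| x as f32).collect();
    println!("{:?}", gather_step(&input));
}

The scatter version of the same computation would have each input element writing into several output slots, which is where the locking (or atomics) comes in.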
I guess I can handle updates in basically a single thread, since it’s all just reactions to user events like updating a string and pressing save, which leads me back to avoiding Rc and RefCell altogether… but it just feels wrong to have to search through a records Vec for the correct record to update when I could have shared that record and updated it in place. I don’t expect there to be any real racing access patterns, but famous last words.
I don't understand how what you are showing relates to what I suggested. Where's the channel in your code? How does this code relate to the GUI?
(You can use a single-threaded async executor to avoid the Send requirement, but even then, if you borrow a RefCell across an await in a spawned task then you're likely to end up with a run-time borrow conflict — same problem as lots of Mutexes in a different shape.)
Sorry for not being clearer: I posted that to show what motivated me to use Mutex, not because there’s any concurrency I’m actually trying to achieve. Though it would be nice to be able to decrypt each record in parallel, that can be done without any reference cycles.
I think fundamentally, regardless of the Send requirement, back-references just might not be worth it for this design, as much as I would like to be able to say record.lot.key().
I’m pretty sure I could make it work without deadlocking, since I have pretty direct control over when I’d lock the Vec and when I’d lock the Record structures, but it’s cumbersome and somewhat unnecessary.
or at least, Mutexes more encapsulated inside some other abstraction distant from your application code
I’m curious how that might look?
What I’d absolutely love, and have wanted for years, is a way to load a relational structure from a DB with FOREIGN KEY associations mapping to Rc<RefCell> or Arc<Mutex>-like references in Rust.
Ah. Let me clarify: I also recommend not using a RefCell-everywhere architecture. The trouble will come in the form of panics instead of deadlocks, and be more deterministic, but there are still lots of ways to get into trouble, and you also find yourself limited by not being able to borrow as flexibly.
But if you do not need any concurrency, why are you calling spawn()?
You would write some struct that owns a Mutex (or RefCell). Instead of Arc<Mutex<MyStruct>>, you have
struct MyStruct {
    state: Mutex<MyStructInner>,
    // ... "immutable" fields go here, if any
}

struct MyStructInner {
    // ... fields that you want to mutate go here
}
and then — this is key to making any difference — you never return a MutexGuard (or Ref) from any method on MyStruct, and you never call a callback of any sort, or any function that takes any other lock, while the Mutex (or RefCell) is locked/borrowed. These conditions are sufficient to ensure that the mutex/cell is “just an implementation detail” and cannot cause a deadlock or panic.
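For instance, filling in the sketch with some made-up fields (name and counter), the methods lock internally, do one small thing, and hand back plain values rather than guards:

use std::sync::Mutex;

struct MyStruct {
    name: String, // "immutable" field, readable without any locking
    state: Mutex<MyStructInner>,
}

struct MyStructInner {
    counter: u64,
}

impl MyStruct {
    fn new(name: String) -> Self {
        MyStruct { name, state: Mutex::new(MyStructInner { counter: 0 }) }
    }

    // Lock, mutate, unlock before returning. No guard escapes, and no
    // caller-supplied code runs while the lock is held.
    fn increment(&self) -> u64 {
        let mut inner = self.state.lock().unwrap();
        inner.counter += 1;
        inner.counter // a copy of the value, not the guard
    }

    fn counter(&self) -> u64 {
        self.state.lock().unwrap().counter
    }

    fn name(&self) -> &str {
        &self.name // plain borrow; the Mutex is never involved
    }
}

fn main() {
    let s = MyStruct::new("example".into());
    s.increment();
    println!("{} = {}", s.name(), s.counter());
}

From the outside, MyStruct is just an ordinary thread-safe object; callers never see the Mutex, so they can't hold it at the wrong time or lock two of them in conflicting orders.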