I'm working on a garbage collector and want to make sure that all of my assumptions with regards to the rust typesystem are correct. Basically, I have the following struct and global:
Any thread can access mutation_buffer_tx. They will only access methods that take an immutable reference to self.
Only one thread at a time will ever access mutation_buffer_rx. The code that accesses the receiver has exclusive access to it.
With those two stipulations holding, is the following code correct?
/// # Safety:
/// This function can be called by any thread at any time, but it will _never_ be called by a thread
/// holding an mutable reference to mutation_buffer_tx
pub fn send_mutation(mutation: Mutation) {
// SAFETY: send takes an immutable reference and is atomic
MUTATION_BUFFER
.get_or_init(MutationBuffer::default)
.mutation_buffer_tx
.send(mutation)
.unwrap();
}
/// # Safety:
/// This function will only ever be called by one thread at a time. Said thread
/// has exclusive access to mutation_buffer_rx.
async fn read_mutation_buffer() -> Vec<Mutation> {
const MAX_MUTATIONS: usize = 10_000;
let mut mutations: Vec<_> = Vec::with_capacity(MAX_MUTATIONS);
// SAFETY: This function has _exclusive access_ to the receive buffer.
unsafe {
(*MUTATION_BUFFER
.get_or_init(MutationBuffer::default)
.mutation_buffer_rx
.get())
.recv_many(&mut mutations, MAX_MUTATIONS)
.await;
}
mutations
}
Besides concerns relating to ensuring those invariants described are held, are there any obvious pitfalls to this code? I think that everything is correct, but I'm not an expert on UnsafeCell so would love to have any feedback.
An async task doesn't get to run on its own thread. The async runtime may constantly switch around which task is on which thread. Therefore, I assume that when you say "thread", you actually meant "task".
If the conditions in the comments hold, then the code will not cause undefined behavior. However, safe functions (functions that aren't marked as unsafe) should never cause undefined behavior no matter what you throw at it. Therefore, the read_mutation_buffer function should be marked as unsafe.
Although... how do you plan to ensure in the first place that only one task calls read_mutation_buffer at a time? That seems very error-prone.
Fully agree with this comment. In the vast majority of cases, you’d be better served by mutation_buffer_rx: Mutex<UnboundedReceiver<Mutation>>: Uncontested locks don’t add much overhead at all, and it adds a safety net in case your external synchronization code, which is always a bit tricky to get right, has flaws.
I understand that. In this instance I do mean thread as that's more salient to the safety discussion. It doesn't matter which thread, or if the next thread is the same thread, only that the function is only being called by one thread at any time which has exclusive access.
Good point, didn't think about that. A lot of the functions in this module are private and only callable from within that module so I didn't really think about that, but you are absolutely correct.
The only path that reaches read_mutation_buffer is through run_garbage_collector, and that task lives for the lifetime of the program.
I absolutely agree, but this is a garbage collector (overhead is important), and I think that as of right now this synchronization (or lack thereof) is pretty bullet-proof (I could be wrong!)
In that case, I’d probably use something like Mutex<Option<Receiver>> and take() it once at the top of run_garbage_collector. That ensures the unique access even if something tries to call it outside of the OnceLock and should remove the need for both manual unsafe and any locking at all once the startup phase has completed.