Can I remove this `Box` without causing UB?

I am (for my own practice) attempting to write an async mutex with queuing, with the queue being structured as an intrusive singly-linked list where each task allocates the space for its entry in the queue.

My first attempt looked something like this:

struct QueuingMutex {
    queue_head: AtomicPtr<QueueEntry>,
    // and other fields, including the value locked by the mutex
}

struct QueueEntry {
    next_entry: Option<NonNull<QueueEntry>>,
    // and other fields, including a waker
}

struct AcquireLockTask<'a> {
    queue_entry: UnsafeCell<MaybeUninit<QueueEntry>>,
    // and other fields, including an `&'a QueuingMutex` at the mutex we're queued for
}
impl Future for AcquireLockTask { .. }

The (main) problem with this implementation was that polling the future requires making a Pin<&mut AcquireLockTask>, which causes the queue_entry field to invalidate any NonNulls pointing inside, leading to UB when anyone else tries to walk through the list (e.g. when the task that currently has the lock releases it and tries to wake the next task in the queue). As best I can tell from reading Miri's output, this invalidation happens immediately upon construction of the &mut AcquireLockTask and can't be avoided (even if I never access queue_entry through the &mut).

I have found that if I change the code to look like this:

struct AcquireLockTask {
    queue_entry: Box<UnsafeCell<MaybeUninit<QueueEntry>>>,
    //           ^ note the new `Box` wrapper here
    // and the same other fields as before
}

then my code immediately becomes sound (at least, Miri calls it sound), as an &mut Box<UnsafeCell<..>> doesn't invalidate existing pointers to the inner field. However, I'm now paying for an extra heap allocation every time a task has to wait for the lock, and I'd like to avoid heap allocations in any case, if possible.

Is there a way I can get the ownership semantics of the Box<UnsafeCell<..>> (where an &mut to the struct doesn't invalidate pointers to what's inside a UnsafeCell struct field that isn't accessed through the &mut) without also having to perform a heap allocation? Or is my current approach doomed to always be UB if I don't heap-allocate?

1 Like

Yes I am aware that this also causes UB if mixed with mem::forgeting a stack-allocated AcquireLockTask. I am choosing to ignore that possibility, since this is just a side project for my own learning.

Also, I have other synchronization primitives I didn't share that avoid data races between different threads. I chose not to share them since they aren't relevant to this question, as the UB happens even in single-threaded code.

If AcquireLockTask is !Unpin then this should be prevented (it's a bit hacky though, in the future UnsafeAlias should properly handle this).

Generally the way to do this is to perform and address-dependent action inside poll. This way the pin guarantee prevents it from being leaked.