Safety of a weird concurrent structure

Hey!

In my current project, I need to allow multiple (potentially a lot of) readers to access a value (of arbitrary type) while a single writer can write to it at any time. I've got a weird constraint, however: those readers are not allowed to perform any writes to the value, including acquiring a lock, incrementing an epoch counter, etc. The accessed structure is mapped in their address space as read only memory.

I've been thinking of ways to achieve this and I've come to this answer:

struct TryMutex<T> {
    value: UnsafeCell<MaybeUninit<T>>, // might be in a weird state, don't assume init
    epoch: AtomicUsize, // even when available, incremented on start and end of writes
}

impl<'a, T> TryMutex<T> {
    /// # Safety
    ///
    /// Can only be called from one thread. There can only be one writer.
    ///
    /// This limitation can probably go away by replacing the first fetch add
    /// with a compare_exchange.
    unsafe fn with_lock<F, R>(&self, f: F) -> R
    where
        F: FnOnce(&mut T) -> R, // T cannot escape the function!
    {
        // Bump the epoch, ensuring that readers are notified that the value is changing.
        // AcqRel ensures that operations after this point don't get re-ordered before
        // (Acq) and that the execution of `f` will be visible to those who use Acquire
        // ordering (Rel).
        self.epoch.fetch_add(1,  AcqRel);
        
        // SAFETY:
        //  We aquired the "lock"!
        let ret = f(unsafe { (&mut *self.value.get()).assume_init_mut() });
        
        // Release ensures that the previous operations won't be reordered after the
        // store. We don't need to worry about the load because we already know that
        // we're the only one writing (since we have the lock).
        self.epoch.fetch_add(1, Release);
        
        ret
    }

    fn try_get(&self) -> Option<T>
    where
        T: Copy, // T must be Copy, we can't allow long lived references
    {
        let epoch = self.epoch.load(Acquire);
        if epoch % 2 == 1 { return None; } // currently being written to

        // SAFETY:
        //  Value is "maybe uninit", so we're not assuming anything just yet.
        let ret: MaybeUninit<T> = unsafe { *self.value.get() };

        if self.epoch.load(Acquire) == epoch {
            // SAFETY:
            //  Value is indeed init, the epoch has not changed.
            //
            //  Race is still possible if the counter incremented `usize::MAX`
            //  times during the read. Unlikely enough.
            Some(unsafe { ret.assume_init() })
        } else {
            None // we lost the race :(
        }
    }
}

I've got two questions:

  1. Does this exist already? If so, how is this called?
  2. If not, would this be safe? Did I get the atomic orderings right?

Ok that's four questions, but you got the idea x)

Thanks!

No, this is not sound. Consider the following execution:

thread1(try_get)   thread2(with_lock)
epoch.load()
                   epoch.fetch_add(1)
*value.get()       &mut *value.get()   <-- data race!

It doesn't matter what comes after this. Once you have a data race, you have UB.

1 Like

The assume_init_mut calls is unsound is the value could be uninit before calling f

f could panic, in which case the value of epoch would become inconsistent.

Now, the hard part, this is technically a data race and a violation of Rust's aliasing guarantees. I'm not sure what the ideal way to fix this would be, but my guess is that you likely want to only ever write and read the value using an atomic memcpy.

1 Like

This seems sketchy especially in the light of UnsafeCell, which is specifically a shared mutability tool; with UnsafeCell, it is exactly the point that you can write through a shared reference.

Another thing that sticks out to me is that you are using MaybeUninit, but there's nothing in the code that initializes it – but admittedly, there's nothing that would suggest that it's ever uninitialized, either. (This, however, only adds to the overall weirdness of the code.)

fn new(val: T) -> TryMutex<T> {
    TryMutex {
        value: UnsafeCell::new(MaybeUninit::new(val)),
        epoch: AtomicUsize::new(0),
    }
}

The value is supposed to always be initialized. The MaybeUninit<T> was mostly a way for try_get not to observe the inner value until it knew for sure that no datarace occured. It's a value that might be in be in the middle of an operation an we can't "assume_init" that.

Reading your responses, I realise that I was basing my code on an assumption: "having a data race inside a MaybeUninit<T> is fine, as the value isn't even guarenteed to mean anything in the first place." The idea was to check after the race whether the thing that we read had raced or not, thus "assume_init"ing it. To "resolve" the race afterwards.


Is there some kind of mechanism to make this behavior defined? I feel like the idea works on paper, but I don't have a way to express it in code.

You can do something along these lines:

  1. Store two copies of the value.
  2. All modifications are made to one of the copies.
  3. After modifying the value, you copy it over to the other value with an atomic memcpy.
  4. When reading the value, you also use an atomic memcpy.

This way, the data race disappears because both threads use an atomic memcpy.

All that remains is to fix your orderings. You need a release operation to occur after reading, because you need the acquire fetch_add in the next with_lock call to synchronize with something that comes after your read. You can do that by using a compare_exchange(epoch, epoch) with at least a release ordering on success. A fetch_add(0) operation would also work.

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.