Simple lock-free immutable concurrent data structure

I needed something to allow me to track progress of concurrent work loads, and so I conceived of this data structure:

pub struct Oncewrite<T> {
    data: UnsafeCell<Vec<Option<T>>>,
    length: AtomicU64,
}
unsafe impl<T: Sync> Sync for Oncewrite<T> {}

impl<T: Default + Clone> Oncewrite<T> {
    pub fn new(size: usize) -> Self {
        Oncewrite {
            data: UnsafeCell::new(vec![None; size]),
            length: AtomicU64::new(0 as u64),
        }
    }
    pub fn iter(&self) -> impl Iterator<Item = &T> {
        let raw = self.data.get();
        unsafe { (*raw).iter().filter_map(|x| x.as_ref()) }
    }
    pub fn insert(&self, index: usize, value: T) -> anyhow::Result<()> {
        let raw = self.data.get();

        unsafe {
            if (*raw).len() <= index {
                return Err(anyhow::anyhow!("Index out of bounds"));
            };
            if (*raw).get_unchecked(index).is_some() {
                return Err(anyhow::anyhow!("Value already set"));
            };
            (*raw).get_unchecked_mut(index).replace(value);
            self.length.fetch_add(1, Ordering::Relaxed);
        }

        Ok(())
    }
    pub fn get(&self, index: usize) -> anyhow::Result<&T> {
        let raw = self.data.get();
        unsafe {
            if (*raw).len() <= index {
                return Err(anyhow::anyhow!("Index out of bounds"));
            };
            if (*raw).get_unchecked(index).is_none() {
                return Err(anyhow::anyhow!("Value not set"));
            };
            Ok((*raw).get_unchecked(index).as_ref().unwrap())
        }
    }
    pub fn get_length(&self) -> usize {
        self.length.load(Ordering::Relaxed) as usize
    }
}

I've done some basic testing and it should work and be sound. Any feedback? Is there already a well-known solution that already does this?

Looks like with this implementation, concurrent calls to insert with the same index would result in a data race?

2 Likes

Also needs T: Sync + Send for OnceWrite<T>: Sync.

Maybe just try an implementation without unsafe code using Vec<OnceCell<T>>? OnceCell in once_cell::sync - Rust

3 Likes

This is the primary constraint of the data structure--you should only ever insert unique indices. I don't think there is an idiomatic way to express this constraint. As you can see I've made an attempt to block modifying an index after it has already been written to, but you are correct that there would be a potential data race since we cant (to my knowledge) guarantee atomicity of the insertion operation without locks (which is the whole point of this). Do you know if Oncecell is lock-free?

Safe functions should be sound even if they're used incorrectly. insert should be marked unsafe since it's only conditionally sound.

2 Likes

OnceCell only blocks during initialization. Otherwise, it's just a single atomic load.

2 Likes

And it should only block rarely: for this data structure, I believe the only blocking case would be if you want to .get a value while another thread has already started but not yet finished the .insert in the same location / at the same index. And even then, the .insert is such a fast operation that the blocking would be very short, or possibly often even just a few spins of yield operations without parking the thread.

Also, the blocking only happens in cases that would have been data races (i. e. UB) with the Option-based implementation, so I don't see much of a disadvantage at all.

1 Like