"Write-once" static Option

I'm trying to create an Option-like type that can be used in a "write-once" way in a static variable. The gist of what I want is this:

pub struct UsizeHolder(Option<usize>);

impl UsizeHolder {
    pub fn set(&mut self, val: usize) -> Result<(), (usize, usize)> {
        self.0.map_or_else(
            || {
                self.0.replace(val);
                Ok(())
            },
            |old_val| Err((old_val, val)),
        )
    }
    pub fn get(&self) -> Option<usize> {
        self.0
    }
}

But to use this safely in a static, I need a Mutex, and a Mutex can be created only in lazy_static. Is that the best way, or can I do something more efficient with manual use of atomics?


If you just want something that works, consider using the once_cell crate.

The TL;DR overview: use static mut, UnsafeCell, or some other low-level shared-mutability primitive, then wrap it in a safe API using atomics.
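
For the "something that works" route, here is a minimal sketch. It assumes a toolchain with std::sync::OnceLock (Rust 1.70 or later), the standard library's counterpart of once_cell::sync::OnceCell, rather than the crate itself. The names VAL, set_once and get_val are made up for illustration and do not come from the thread.

```rust
use std::sync::OnceLock;

// Hypothetical static for illustration. `OnceLock::new` is a `const fn`,
// so no lazy_static is needed here.
static VAL: OnceLock<usize> = OnceLock::new();

/// Hypothetical helper mirroring the `set` from the question:
/// on failure it returns `(old_val, rejected_val)`.
pub fn set_once(val: usize) -> Result<(), (usize, usize)> {
    VAL.set(val).map_err(|rejected| {
        // `set` only fails once a value is stored, so `get` yields `Some` here.
        let old = *VAL.get().expect("set failed, so a value is present");
        (old, rejected)
    })
}

/// Hypothetical helper: returns a copy of the stored value, if any.
pub fn get_val() -> Option<usize> {
    VAL.get().copied()
}
```

The first call to set_once stores the value; every later call is rejected and reports both the stored and the rejected value, as in the original Option-based sketch.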


Here you should have all the info needed: http://docs.rs/once_cell

The basic idea is that you need some kind of flag (an atomic one, for a static that is not thread_local!) which ensures that the write happens only once.
The value being written is technically shared by virtue of living in a static, so mutating it requires Interior Mutability / Shared Mutability. Since you are manually using a flag to avoid concurrency issues, you need it in its raw variant: hence UnsafeCell and unsafe.

Here is a sketch:

use ::std::{
    cell::UnsafeCell,
    mem::MaybeUninit,
    sync::atomic::{
        Ordering,
    },
};

#[allow(bad_style)]
mod InitState {
    #[repr(u8)]
    enum States {
        Uninit,
        Initializing,
        Init,
    }
    pub(in super) type Repr = u8;
    pub(in super) type AtomicRepr = ::std::sync::atomic::AtomicU8;

    pub(in super) const UNINIT: u8 = States::Uninit as _;
    pub(in super) const INITIALIZING: u8 = States::Initializing as _;
    pub(in super) const INIT: u8 = States::Init as _;
}

pub
struct Holder<T> {
    init_state: InitState::AtomicRepr,
    value: UnsafeCell<MaybeUninit<T>>,
}

impl<T> Holder<T> {
    pub
    const
    fn uninit () -> Self
    {
        Self {
            init_state: InitState::AtomicRepr::new(InitState::UNINIT),
            value: UnsafeCell::new(MaybeUninit::uninit()),
        }
    }

    /// Returns `true` iff the initialization was successful
    pub
    fn init (self: &'_ Self, value: T)
      -> bool
    {
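        // `compare_and_swap` is deprecated; `compare_exchange` takes separate
        // success and failure orderings and returns a `Result`.
        if  self.init_state
                .compare_exchange(
                    InitState::UNINIT,
                    InitState::INITIALIZING,
                    Ordering::Acquire,
                    Ordering::Relaxed,
                )
                .is_err()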
        {
            return false;
        }
        unsafe {
            // Safety: the UNINIT -> INITIALIZING transition succeeds for at most
            // one thread, so no other thread can be writing or reading here
            self.value
                .get()
                .write(MaybeUninit::new(value));
        }
        self.init_state
            .store(InitState::INIT, Ordering::Release)
        ;
        true
    }

    pub
    fn get (self: &'_ Self) -> Option<&'_ T>
    {
        if  self.init_state
                .load(Ordering::Acquire)
            == InitState::INIT
        {
            Some(unsafe {
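                // Safety: the Acquire load of INIT synchronizes with the Release
                // store in `init`, so the read happens *after* the write.
                // (`get_ref` was the unstable name of `assume_init_ref`.)
                (&*self.value.get())
                    .assume_init_ref()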
            })
        } else {
            None
        }
    }
}

/// # Safety
///
///   - The concurrent API is synchronized through atomics
unsafe impl<T : Send + Sync> Sync for Holder<T> {}
unsafe impl<T : Send> Send for Holder<T> {}

Thanks, it seems the three-state atomic flag was the missing piece (I was trying to put it together using AtomicBool, which is of course the wrong path). However, AFAIK in my case it's possible to go without unsafe (except for impl Sync): there's no requirement that the memory really be uninitialized at the beginning, and Cell doesn't seem to have any overhead over UnsafeCell if the type in question is Copy (in my case, it's simply usize). So that's the variant I'm going for:

use ::std::{cell::Cell, sync::atomic::Ordering};

#[allow(non_snake_case)]
mod InitState {
    #[repr(u8)]
    enum States {
        Uninit,
        Initializing,
        Init,
    }
    pub(super) type AtomicRepr = ::std::sync::atomic::AtomicU8;

    pub(super) const UNINIT: u8 = States::Uninit as _;
    pub(super) const INITIALIZING: u8 = States::Initializing as _;
    pub(super) const INIT: u8 = States::Init as _;
}

pub struct UsizeHolder {
    init_state: InitState::AtomicRepr,
    value: Cell<usize>,
}
impl UsizeHolder {
    pub const fn new() -> Self {
        UsizeHolder {
            init_state: InitState::AtomicRepr::new(InitState::UNINIT),
            value: Cell::new(0),
        }
    }
    pub fn set(&self, value: usize) -> Result<(), ()> {
        // `compare_and_swap` is deprecated; use `compare_exchange` instead.
        if self
            .init_state
            .compare_exchange(
                InitState::UNINIT,
                InitState::INITIALIZING,
                Ordering::Acquire,
                Ordering::Relaxed,
            )
            .is_err()
        {
            return Err(());
        }
        self.value.set(value);
        self.init_state.store(InitState::INIT, Ordering::Release);
        Ok(())
    }
    pub fn get(&self) -> Option<usize> {
        if self.init_state.load(Ordering::Acquire) == InitState::INIT {
            Some(self.value.get())
        } else {
            None
        }
    }
}
unsafe impl Sync for UsizeHolder {}

// Simple test for correctness
fn main() {
    static VAL: UsizeHolder = UsizeHolder::new();

    assert!(VAL.get() == None);
    assert!(VAL.set(1).is_ok());
    assert!(VAL.get() == Some(1));
    assert!(VAL.set(2).is_err());
    assert!(VAL.get() == Some(1));
}


You can avoid all unsafe if you use AtomicUsize with relaxed loads instead of Cell.
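
A minimal sketch of that suggestion, assuming the same three-state flag as in the code above (the constant names and the field layout are illustrative, not taken from any library). Because both fields are atomics, the struct is Sync automatically and no unsafe impl is needed:

```rust
use std::sync::atomic::{AtomicU8, AtomicUsize, Ordering};

// Illustrative state constants; any three distinct values would do.
const UNINIT: u8 = 0;
const INITIALIZING: u8 = 1;
const INIT: u8 = 2;

pub struct UsizeHolder {
    state: AtomicU8,
    value: AtomicUsize,
}

impl UsizeHolder {
    pub const fn new() -> Self {
        UsizeHolder {
            state: AtomicU8::new(UNINIT),
            value: AtomicUsize::new(0),
        }
    }

    pub fn set(&self, value: usize) -> Result<(), ()> {
        // Only the thread that wins UNINIT -> INITIALIZING may write.
        if self
            .state
            .compare_exchange(UNINIT, INITIALIZING, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            return Err(());
        }
        // Relaxed is enough here: the Release store below publishes it.
        self.value.store(value, Ordering::Relaxed);
        self.state.store(INIT, Ordering::Release);
        Ok(())
    }

    pub fn get(&self) -> Option<usize> {
        // The Acquire load pairs with the Release store in `set`.
        if self.state.load(Ordering::Acquire) == INIT {
            Some(self.value.load(Ordering::Relaxed))
        } else {
            None
        }
    }
}
```

The Acquire/Release pair on the flag carries the synchronization, which is why the accesses to the value itself can stay Relaxed.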


"Except impl Sync" is a HUGE exception, actually. All the soundness of Cell, a convenience wrapper around UnsafeCell, is based on it being !Sync. So by asserting that a wrapper containing it is Sync, you are basically sweeping raw UnsafeCell shenanigans under the carpet. It's not wrong, of course, but it makes the unsafe-ty too sneaky for my taste, since it doesn't scream "careful with the code of the whole module" as much as using UnsafeCell does. I prefer to be overzealous with this kind of thing.

  • TL;DR: there is no such thing as "no unsafe except ..."

  • @matklad's suggestion of AtomicUsize with Relaxed loads, while limiting the value to usize-sized values (which seems to be your use case), does indeed manage to get rid of all the unsafety.


Your Cell suggestion, on the other hand, is quite nice when you are not multi-threaded: you can then use it within a thread_local! static, and get rid of atomics altogether:

use ::core::cell::Cell;

pub
struct UsizeHolder {
    value: Cell< Option<usize> >,
}

impl UsizeHolder {
    pub
    const
    fn new () -> Self
    {
        UsizeHolder {
            value: Cell::new(None),
        }
    }

    pub
    fn set (self: &'_ Self, value: usize) -> Result<(), ()>
    {
        if self.value.get().is_some() {
            return Err(());
        }
        self.value.set(Some(value));
        Ok(())
    }

    pub
    fn get (self: &'_ Self) -> Option<usize>
    {
        self.value.get()
    }
}

// Simple test for correctness
fn main ()
{
    thread_local! {
        static VAL: UsizeHolder = UsizeHolder::new();
    }
    VAL.with(|it| {
        assert_eq!(it.get(), None);
        assert!(it.set(1).is_ok());
        assert_eq!(it.get(), Some(1));
        assert!(it.set(2).is_err());
        assert_eq!(it.get(), Some(1));
    });
}