Ideas to reduce lock contention?

I'm looking for suggestions on how to address lock contention in adding items to a HashSet protected by a Mutex. For context see this issue.

The idea is to precompute the hash before taking the lock, so we can hold the lock for a shorter period of time in cases where the data may be large and take a while to hash. The problem is that to compute the hash we need access to the RandomState stored in the HashSet that is protected by the lock. I know I can create a HashSet with a provided RandomState, which I could store outside the Mutex, but I'm not sure where I could store it that wouldn't end up being protected by a lock that just slows things down again.

Ideally I'd want an atomic RandomState, so I could access it safely and efficiently, with a slow path that initializes the RandomState if it hasn't yet been initialized. I suppose in the context I'm looking at maybe I could write to a global RandomState (using unsafe) while holding the Mutex the first time I create the HashSet?

Ideas are welcome, or statements that it would definitely be safe to use unsafe to write to a global variable while holding a separate Mutex, and then to read it with no lock.

If you're using a specific RandomState, it's immutable. You only need a &RandomState to hash things (https://doc.rust-lang.org/std/hash/trait.BuildHasher.html#method.hash_one) so I'd expect you can just share it un-mutexed.
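To make that concrete, here is a rough sketch of hashing outside the lock with only the standard library. The names `Prehashed`, `PassThrough`, `BuildPassThrough` and `SharedSet` are made up for illustration, and the pass-through hasher is just one way to get a precomputed hash into a std HashSet (which has no stable API for inserting with a caller-supplied hash). It assumes Rust 1.71 or newer for `hash_one`.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashSet;
use std::hash::{BuildHasher, Hash, Hasher};
use std::sync::Mutex;

/// Hypothetical wrapper: a value together with its precomputed hash.
pub struct Prehashed<T> {
    hash: u64,
    value: T,
}

impl<T: PartialEq> PartialEq for Prehashed<T> {
    fn eq(&self, other: &Self) -> bool {
        // Comparing the cheap hash first short-circuits most mismatches.
        self.hash == other.hash && self.value == other.value
    }
}
impl<T: Eq> Eq for Prehashed<T> {}

impl<T> Hash for Prehashed<T> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Feed only the precomputed hash; the value is not re-hashed.
        state.write_u64(self.hash);
    }
}

/// Hypothetical hasher that returns the u64 it was given.
pub struct PassThrough(u64);

impl Hasher for PassThrough {
    fn write(&mut self, bytes: &[u8]) {
        // Fallback for non-u64 input; not used by `Prehashed`.
        for &b in bytes {
            self.0 = self.0.rotate_left(8) ^ u64::from(b);
        }
    }
    fn write_u64(&mut self, n: u64) {
        self.0 = n;
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

#[derive(Clone, Default)]
pub struct BuildPassThrough;

impl BuildHasher for BuildPassThrough {
    type Hasher = PassThrough;
    fn build_hasher(&self) -> PassThrough {
        PassThrough(0)
    }
}

/// The RandomState lives next to the Mutex, not inside it.
pub struct SharedSet<T> {
    state: RandomState,
    set: Mutex<HashSet<Prehashed<T>, BuildPassThrough>>,
}

impl<T: Hash + Eq> SharedSet<T> {
    pub fn new() -> Self {
        SharedSet {
            state: RandomState::new(),
            set: Mutex::new(HashSet::with_hasher(BuildPassThrough)),
        }
    }

    /// Returns true if the value was not already present.
    pub fn insert(&self, value: T) -> bool {
        // The potentially slow hashing happens before the lock is taken.
        let hash = self.state.hash_one(&value);
        self.set.lock().unwrap().insert(Prehashed { hash, value })
    }

    pub fn len(&self) -> usize {
        self.set.lock().unwrap().len()
    }
}
```

The lock is then held only for the table probe and any equality comparisons, not for hashing. Whether that helps in practice depends on how expensive hashing is relative to everything else done under the lock.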


I guess part of my question is how to share it un-mutexed. I suppose I just create a static variable and then unsafely initialize it when I first need to? But I don't see any const way to create a RandomState, so I guess I can't have a static variable of this type... I could have a static raw pointer to a RandomState, I suppose, or a static AtomicPtr to a RandomState. Either way reading the RandomState requires unsafe. Is there a better approach?

You could use the once_cell crate.
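As a sketch of the lazy-initialization approach without any unsafe: the example below uses `std::sync::OnceLock`, the standard library counterpart of once_cell's `sync::OnceCell`, rather than the crate itself, so it needs no dependency. The function names `global_rand` and `prehash` are invented for illustration, and it assumes Rust 1.71 or newer.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash};
use std::sync::OnceLock;

// `OnceLock::new` is a const fn, so this works in a static even though
// `RandomState` itself has no const constructor.
static GLOBAL_RAND: OnceLock<RandomState> = OnceLock::new();

/// Returns the shared RandomState, creating it on first use.
/// After initialization, callers only take the fast read path.
pub fn global_rand() -> &'static RandomState {
    GLOBAL_RAND.get_or_init(RandomState::new)
}

/// Hashes a value with the shared state; no Mutex is involved.
pub fn prehash<T: Hash + ?Sized>(value: &T) -> u64 {
    global_rand().hash_one(value)
}
```

A HashSet that should agree with these hashes could be built with `HashSet::with_hasher(global_rand().clone())`, since a cloned RandomState keeps the same keys.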

Since you mention unsafely initializing it, here's how you can do that:

use std::cell::UnsafeCell;
use std::collections::hash_map::RandomState;
use std::mem::MaybeUninit;

struct GlobalRandWrapper(UnsafeCell<MaybeUninit<RandomState>>);
unsafe impl Sync for GlobalRandWrapper {}

static GLOBAL_RAND: GlobalRandWrapper = GlobalRandWrapper(UnsafeCell::new(MaybeUninit::uninit()));

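// Safety: This function may be called at most once, and it must not run
// concurrently with any call to `get_global_rand` (that would be a data race).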
pub unsafe fn set_global_rand(value: RandomState) {
    let ptr = GLOBAL_RAND.0.get();
    *ptr = MaybeUninit::new(value);
}

// Safety: This function may only be called after `set_global_rand` has returned,
// with synchronization that makes that write visible to the calling thread.
pub unsafe fn get_global_rand() -> &'static RandomState {
    let ptr = GLOBAL_RAND.0.get() as *const RandomState;
    &*ptr
}