Hmm. I always get super wary when multiple atomic locations are involved, since interleaving-based intuition breaks down. But a kind of seqlock could work in this case…
Entirely untested, and not soundly unique, but it'd look something like:
#![cfg(target_pointer_width = "32")]
use std::sync::atomic::{AtomicU32, Ordering::*};

static HI: AtomicU32 = AtomicU32::new(0);
static LO: AtomicU32 = AtomicU32::new(0);

// invariant: maximum of (2**31 - 1) threads
// (32 bit std makes this assumption for us)
const MAX: u32 = i32::MAX as u32;

pub fn next() -> (u32, u32) {
    let hi = HI.load(Acquire);
    let lo = LO.fetch_add(1, AcqRel);
    if lo <= MAX {
        (hi, lo)
    } else {
        next_slow(hi, lo)
    }
}

#[cold]
fn next_slow(hi: u32, lo: u32) -> (u32, u32) {
    if hi == u32::MAX {
        eprintln!("id space exhausted");
        std::process::abort();
    }
    let Ok(_) = HI.compare_exchange(hi, hi + 1, Release, Relaxed) else {
        return next(); // lost the race; retry
    };
    let Ok(_) = LO.compare_exchange(lo, 0, Release, Relaxed) else {
        return next(); // lost the race; retry
    };
    (hi, lo)
}
Unfortunately, this is vulnerable to an ABA issue common to seqlock constructs: if our HI/LO write pair gets split by other threads incrementing LO all the way around, it's possible to reset LO twice on the same HI value. That can't be fixed without some kind of lock, whether spin, futex, or mutex. Still untested, but I believe the following should be soundly unique:
const LOCKED: u32 = u32::MAX;

pub fn next() -> (u32, u32) {
    let mut hi = HI.load(Relaxed);
    while hi == LOCKED {
        // stand-in for a futex wait: block while HI still reads LOCKED
        atomic_wait_if(&HI, LOCKED);
        hi = HI.load(Acquire);
    }
    let lo = LO.fetch_add(1, Relaxed);
    if lo <= MAX {
        (hi, lo)
    } else {
        next_slow(hi, lo)
    }
}

#[cold]
fn next_slow(hi: u32, lo: u32) -> (u32, u32) {
    let new_hi = hi + 1;
    if new_hi == LOCKED {
        eprintln!("id space exhausted");
        std::process::abort();
    }
    let Ok(_) = HI.compare_exchange(hi, LOCKED, Relaxed, Relaxed) else {
        return next(); // lost the race; retry
    };
    LO.store(0, Relaxed);
    HI.store(new_hi, Release);
    // stand-in for a futex wake of everyone parked in atomic_wait_if
    atomic_wake_all(&HI);
    (hi, lo)
}
All that explored, however, Mutex<u64> is good enough for std and ThreadId. Just use Mutex<u64>.
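For completeness, the Mutex version is tiny. A minimal sketch, with the caveat that the names and the overflow handling here are my own, and std's actual ThreadId internals may differ:

```rust
use std::sync::Mutex;

// The whole generator: one lock, one counter.
// (Mutex::new in a static works on Rust 1.63+.)
static COUNTER: Mutex<u64> = Mutex::new(0);

pub fn next_id() -> u64 {
    let mut guard = COUNTER.lock().unwrap();
    // checked_add: a u64 counter can't realistically overflow,
    // but make exhaustion explicit rather than wrapping.
    let id = guard.checked_add(1).expect("id space exhausted");
    *guard = id;
    id
}
```

No orderings to reason about, no slow path, and uniqueness is trivially guaranteed by the lock.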