How to make this thread safe?

Edit: cache Box<u8> instead of u8.

I'm trying to simulate a CPU cache in memory. This is my attempt:

use std::cell::UnsafeCell;
use std::collections::HashMap;

pub struct Cache {
    inner: UnsafeCell<HashMap<u8, Box<u8>>>
}

impl Cache {
    pub fn get(&self, key: u8) -> &u8 {
        // SAFETY: this &mut _ is only used to insert missing value
        unsafe { &mut *self.inner.get() }.entry(key).or_insert(Box::new(0))
    }
}

(Playground)

If the cache misses, try to insert (load) a value.

This is safe if there is only one thread. The problem is that I want to make Cache Sync.

In my real code, an AtomicU8 flag is used, much like RefCell's borrow flag, to ensure no one can break Rust's aliasing rules. However, loom detects a problem with Cache::get:

If multiple threads find the key missing, all of them will try to insert a value.

How can I avoid this concurrency bug?

I tried spin-waiting on a may_loading: AtomicBool flag, but that doesn't work under loom, which is not what I want.

I also thought about Condvar, but it needs a Mutex to work with.

If I just replace the UnsafeCell with an RwLock, then I cannot return a &u8, because the reference is borrowed from a lock guard that is limited to the scope of get.

Here is the real code:

For every UnsafeCell you think you need, there's a better interior mutability abstraction that you should use instead.

In your case, that's a Mutex.

No it isn't. It's not even safe with only one thread:

The or_insert call may reallocate the HashMap to grow it, which invalidates prior &u8 references you've created.

fn main() {
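    // Note: this demonstrates the pre-edit version, where the map was
    // HashMap<u8, u8> and `get` returned a &u8 into the map itself.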
    let c = Cache {
        inner: UnsafeCell::new(HashMap::new()),
    };

    let refs: Vec<_> = (0..10).map(|i| c.get(i)).collect();
    dbg!(refs);
}

example output:

[src/main.rs:20:5] refs = [
    48,
    0,
    86,
    86,
    86,
    48,
    48,
    0,
    0,
    0,
]

The non-zero values come from use-after-free accesses, and Miri also reports this, of course, as

error: Undefined Behavior: constructing invalid value: encountered a dangling reference (use-after-free)


As for prior art, see crates such as elsa for collections with a comparable API (i.e. an insert method that takes &self). That crate ensures correctness by enforcing double indirection; e.g. to retain a reference here, you'd need to use something like FrozenMap<u8, Box<u8>>.
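For illustration, a minimal sketch of that idea, assuming elsa's FrozenMap API (get and insert both take &self and return references to the boxed value; double-check against the crate docs, and note this type is for the single-threaded case — IIRC there's also an elsa::sync module):

use elsa::FrozenMap;

pub struct Cache {
    inner: FrozenMap<u8, Box<u8>>,
}

impl Cache {
    pub fn get(&self, key: u8) -> &u8 {
        // Both `get` and `insert` take `&self`; the Box keeps the
        // returned &u8 valid even when the map itself reallocates.
        match self.inner.get(&key) {
            Some(v) => v,
            None => self.inner.insert(key, Box::new(0)),
        }
    }
}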

pub struct Cache {
    inner: UnsafeCell<HashMap<u8, Box<u8>>>
}

impl Cache {
    pub fn get(&self, key: u8) -> &Box<u8> {
        unsafe { &mut *self.inner.get() }.entry(key).or_insert(Box::new(0))
    }
}

Oh, let's change it to Box<u8> then :sob:

pub fn get(&self, key: u8) -> &Box<u8>

The point was that get would still return &u8, so that the reference's address no longer gets invalidated :wink:

I tried something similar with RwLock... But I cannot return a &u8 from it, since the guard I get goes out of scope, and so the reference borrowed from the guard doesn't live long enough.

Congratulations, Rust just saved you from a concurrent mutation bug!

(You'll need to return the guard. There's no other way. That's the point.)
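A sketch of what returning the guard can look like, assuming the parking_lot crate (whose guards, unlike std's, can be downgraded and mapped to a single value):

use std::collections::HashMap;
use parking_lot::{MappedRwLockReadGuard, RwLock, RwLockReadGuard, RwLockWriteGuard};

pub struct Cache {
    inner: RwLock<HashMap<u8, Box<u8>>>,
}

impl Cache {
    // The caller keeps the map read-locked for as long as it holds
    // the returned guard.
    pub fn get(&self, key: u8) -> MappedRwLockReadGuard<'_, u8> {
        let mut map = self.inner.write();
        map.entry(key).or_insert_with(|| Box::new(0));
        let map = RwLockWriteGuard::downgrade(map);
        RwLockReadGuard::map(map, |m| &**m.get(&key).unwrap())
    }
}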

I used to do that :laughing:

But I just want to remove all the locks...

I suppose you could combine the Box and insert-only-API trick, using unsafe code, with an RwLock that acquires short-lived access to the map anyway: read-only for reads, read-write for inserts.
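A rough sketch of that combination; it's unsafe and only sound under the assumptions spelled out in the comments, so treat it as an illustration, not vetted code:

use std::collections::HashMap;
use std::sync::RwLock;

pub struct Cache {
    inner: RwLock<HashMap<u8, Box<u8>>>,
}

impl Cache {
    pub fn get(&self, key: u8) -> &u8 {
        // Fast path: short-lived read lock.
        if let Some(v) = self.inner.read().unwrap().get(&key) {
            // SAFETY: values are only ever inserted, never removed or
            // overwritten, and each Box has a stable heap address, so
            // the pointee lives as long as `&self`, beyond the guard.
            return unsafe { &*(&**v as *const u8) };
        }
        // Slow path: short-lived write lock; `entry` handles the race
        // where another thread inserted between the two locks.
        let mut map = self.inner.write().unwrap();
        let v = map.entry(key).or_insert_with(|| Box::new(0));
        // SAFETY: same argument as above.
        unsafe { &*(&**v as *const u8) }
    }
}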

The easy, safe alternative (with slightly more overhead for reference counting) would probably be something like RwLock<HashMap<u8, Arc<u8>>>, where you'd just clone the Arc to get a longer-lived "reference" to the target.
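A sketch of that safe variant (names are mine):

use std::collections::HashMap;
use std::sync::{Arc, RwLock};

pub struct Cache {
    inner: RwLock<HashMap<u8, Arc<u8>>>,
}

impl Cache {
    // Returning an owned Arc sidesteps the guard-lifetime problem:
    // the clone keeps the value alive after the lock is released.
    pub fn get(&self, key: u8) -> Arc<u8> {
        if let Some(v) = self.inner.read().unwrap().get(&key) {
            return Arc::clone(v);
        }
        let mut map = self.inner.write().unwrap();
        Arc::clone(map.entry(key).or_insert_with(|| Arc::new(0)))
    }
}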


Well, to eliminate all locking while staying concurrent, you'd probably need to look for other implementations of concurrent hash maps instead. No wrapper around std's HashMap can do that, since any mutation needs a lock, right?

I haven't really looked deeply into the matter, but crates should definitely exist. Perhaps something like this (found by just googling). Such APIs usually involve guard objects, and you'll need to read the documentation carefully to learn of any more or less fine-grained locking that may remain if you're planning to hold onto such guard objects for a long time.
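For a feel for that style of API, here's dashmap as one example of such a crate (just one I know of, not necessarily the one linked above):

use dashmap::DashMap;

fn main() {
    // DashMap shards the map and locks per shard; `entry` returns a
    // guard object that keeps the shard write-locked while it's held.
    let cache: DashMap<u8, u8> = DashMap::new();
    let value = *cache.entry(1).or_insert(0); // guard dropped here
    assert_eq!(value, 0);
}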

For append-only structures, this crate, e.g., also has something; such data structures don't need a guard object. (But glancing at your code, that take_or_default doesn't look particularly append-only to me.)


Mine just simulates a CPU cache.

When reading:

  1. cache miss: load from disk
  2. cache hit: return the cached value

When writing:

  1. cache miss: write to disk, then load from disk
  2. cache hit: write to the cache and mark it dirty

After everything, write back to disk if marked dirty (rough sketch below).
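A single-threaded sketch of that protocol, just to pin down the logic (all names here are made up for illustration):

use std::collections::HashMap;

struct Line {
    value: u8,
    dirty: bool,
}

struct Sim {
    cache: HashMap<u8, Line>,
    disk: HashMap<u8, u8>,
}

impl Sim {
    fn read(&mut self, addr: u8) -> u8 {
        let disk = &self.disk;
        self.cache
            .entry(addr)
            // miss: load from "disk"
            .or_insert_with(|| Line {
                value: disk.get(&addr).copied().unwrap_or(0),
                dirty: false,
            })
            .value
    }

    fn write(&mut self, addr: u8, value: u8) {
        match self.cache.get_mut(&addr) {
            // hit: write to the cache and mark it dirty
            Some(line) => {
                line.value = value;
                line.dirty = true;
            }
            // miss: write to disk, then load from disk
            None => {
                self.disk.insert(addr, value);
                self.cache.insert(addr, Line { value, dirty: false });
            }
        }
    }

    fn flush(&mut self) {
        // after everything: write back to disk if marked dirty
        for (addr, line) in &mut self.cache {
            if line.dirty {
                self.disk.insert(*addr, line.value);
                line.dirty = false;
            }
        }
    }
}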

That crate seems to have a lock too.

In my use case, loading multiple times may not be a big problem, though. I just want to learn how to test it with loom, and there seems to be no way to do that unless I can wipe out the race condition...

This was just supposed to be an example for "has no guard objects" :wink:

Not for "is completely lock-free".

It is not clear to me whether your goal is "high performance: use atomics, avoid locks, even short-lived locks in cases that are unlikely to happen" or just "no logical locks held between guard objects, so I don't run into deadlocks or long periods of unwanted mutual exclusion". If it's just the latter, then implementations based on std's HashMap can be fine, and the OnceMap crate I've linked can probably be fine, too.


As for what the correct data structure for "simulating CPU caches" would be, and whether an append-only data structure can work: I haven't given that much thought yet.


After many days of work, I finished this. However, it's full of locks :face_holding_back_tears:

All of this is because I misunderstood atomic types. Both the documentation and the Rustonomicon say that Ordering::Release means changes will be seen by Acquire loads, and that Acquire will see Release stores, without emphasizing that this should be limited to the same thread. So I just believed atomic types would magically synchronize everything, and simply misused them as a lock alternative. That's why I thought I could make this without any lock :laughing:.

No, that would make atomics completely useless – that sort of guarantee already exists in straight sequential code. The whole point of atomics is that they do provide guarantees across threads.

Oh, I mean this is not the right use case.

use std::thread;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

fn main() {
    let b = Arc::new(AtomicBool::new(false));
    let b1 = b.clone();
    let jh1 = thread::spawn(move || {
        thread::sleep(std::time::Duration::from_secs(1));
        b1.store(true, Ordering::Release);
    });
    let b2 = b.clone();
    let jh2 = thread::spawn(move || {
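        // BUG (the misconception): `Acquire` does not make this thread
        // wait; if this load runs before the store above, it reads
        // `false` and the assert fails.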
        assert!(b2.load(Ordering::Acquire));
    });
    jh1.join().unwrap();
    jh2.join().unwrap();
}

I used to think this could make thread 2 wait for thread 1...
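For contrast, a version that actually makes thread 2 wait: it loops until the Acquire load observes the Release store (a spin loop just for illustration; a real program should use a blocking primitive):

use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::{hint, thread};

fn main() {
    let b = Arc::new(AtomicBool::new(false));
    let b1 = b.clone();
    let jh1 = thread::spawn(move || {
        thread::sleep(std::time::Duration::from_secs(1));
        b1.store(true, Ordering::Release);
    });
    let b2 = b.clone();
    let jh2 = thread::spawn(move || {
        // Spin until the store is observed; only then does Acquire
        // guarantee we see everything written before the Release.
        while !b2.load(Ordering::Acquire) {
            hint::spin_loop();
        }
    });
    jh1.join().unwrap();
    jh2.join().unwrap();
}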

Edit:
here is everything I learned about atomics.

So the example at the top reflects the completely wrong understanding of atomics that I had before.

