I am building a database-like system in which the nodes of a B+tree-like structure are managed by a 'page cache', currently a moka::Cache<usize, Arc<Node<T>>>.
(for the purposes of this post, a Node<T> can be considered to be just a large contiguous array, e.g. a [T; 4069]. T might itself need special drop glue, for instance when it is a Box<str>.)
Nodes are fetched from disk if they are not currently in the cache. The cache hands out arcs to make sure we do not copy the large blocks of memory around, since we're only using them for reading.
Threads hold on to one (or multiple) of these arcs until their iteration over (a subrange of) the contained values is done.
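Roughly, the setup looks like this (a simplified sketch; the field names and the load_from_disk helper are made up for illustration):

use std::sync::Arc;

use moka::sync::Cache;

/// One large contiguous block of values.
struct Node<T> {
    values: Box<[T]>,
}

struct PageCache<T> {
    cache: Cache<usize, Arc<Node<T>>>,
}

impl<T: Send + Sync + 'static> PageCache<T> {
    fn new(max_entries: u64) -> Self {
        PageCache { cache: Cache::new(max_entries) }
    }

    fn get(&self, page_id: usize) -> Arc<Node<T>> {
        if let Some(node) = self.cache.get(&page_id) {
            return node;
        }
        // Cache miss: load from disk and insert. The insert may evict the
        // least-recently-used entry, which is where the problem below starts.
        let node = Arc::new(load_from_disk(page_id));
        self.cache.insert(page_id, Arc::clone(&node));
        node
    }
}

fn load_from_disk<T>(_page_id: usize) -> Node<T> {
    unimplemented!("stand-in for the real disk read")
}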
The problem is: At some point, when the cache becomes too full, the least-recently-used Arc<Node<T>> will be removed from the cache. It is possible that one or more threads still hold on to this arc at this time.
This means that when the thread is finished with this arc, suddenly it is responsible for cleaning up the large contained structure and all elements within.
(A different but similar situation arises when a thread needs new data from disk and this data needs to be inserted into a full cache: that thread then also becomes responsible for cleaning up the least-recently-used value that gets thrown out.)
I would like to defer this destruction to a background thread (or to 'later' on the current thread when the current thread no longer is busy). crossbeam_epoch seems to provide a way to do this.
But I am unable to figure out how to use a std::sync::Arc (or maybe a triomphe::Arc; we do not need weak references) together with crossbeam_epoch to make this work.
How to do this?
Should you even be using an Arc if you're using crossbeam_epoch?
If not, what alternative type should be used as value type in the cache?
It looks to me like crossbeam_epoch requires you to use its Atomic type, which is a fair bit more complex to use than an Arc.
A simple fix that might be good enough for your case is to send the Arc you remove from the cache to a background thread that keeps a list of nodes that need to be dropped eventually. Periodically the thread can check the list for nodes that have a ref count of 1, and then drop those Arcs[1].
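Roughly like this (a sketch; the names and the sweep interval are arbitrary):

use std::sync::Arc;
use std::sync::mpsc::{channel, RecvTimeoutError, Sender};
use std::thread;
use std::time::Duration;

fn spawn_trash_thread<T: Send + Sync + 'static>() -> Sender<Arc<T>> {
    let (tx, rx) = channel::<Arc<T>>();
    thread::spawn(move || {
        let mut pending: Vec<Arc<T>> = Vec::new();
        loop {
            // Collect newly evicted Arcs, waking up periodically to sweep.
            match rx.recv_timeout(Duration::from_millis(100)) {
                Ok(arc) => pending.push(arc),
                Err(RecvTimeoutError::Timeout) => {}
                Err(RecvTimeoutError::Disconnected) => break,
            }
            // Keep the Arcs that readers still hold; drop the rest, so the
            // expensive cleanup of their contents happens on this thread.
            pending.retain(|arc| Arc::strong_count(arc) > 1);
        }
    });
    tx
}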
Ideally I think you'd have a custom Arc that sends a message to your background thread instead of dropping itself, so you don't have to waste time polling the list. You could use a wrapper type that sends the message in its Drop impl, but then your Node type needs to be in a separate allocation which may not be acceptable.
This can be done without any separate allocation, just an Option:
struct Handle<T>(Arc<Droption<Node<T>>>);
impl<T> Deref for Handle<T> { /* deref all the way to the Node, if you want */ }
struct Droption<T>(Option<T>);
impl<T> Drop for Droption<T> {
    fn drop(&mut self) {
        // Hand the value to the background thread instead of dropping it here.
        if let Some(value) = self.0.take() {
            let _ = DEFERRED_DROP_CHANNEL.send(value);
        }
    }
}
Whether that's acceptable will depend on the application, but at least it's only a move, and doesn't involve any of the transitive drop work like dropping the T = Box<str>s @Qqwy mentioned.
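For completeness, here is one way the DEFERRED_DROP_CHANNEL above could be set up. A static can't be generic, so this assumes one channel per concrete payload type (Page is just an illustrative stand-in for the real node type):

use std::sync::LazyLock; // or once_cell::sync::Lazy on older toolchains
use std::thread;

use crossbeam_channel::{unbounded, Sender};

type Page = Box<[Box<str>]>; // stand-in for the real payload type

static DEFERRED_DROP_CHANNEL: LazyLock<Sender<Page>> = LazyLock::new(|| {
    let (tx, rx) = unbounded::<Page>();
    thread::spawn(move || {
        // Every value received here goes out of scope on this thread,
        // keeping the expensive drop work off the senders' threads.
        for _page in rx {}
    });
    tx
});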
Thank you very much, @semicoleon and @kpreid! I was very much overthinking this.
I think I'll create my own Arc alternative which, when dropping the last instance, will not immediately clean up its internal allocation but instead pass its internal NonNull<T> to a background thread (either a 'real' thread using std::thread::spawn or a lightweight one, e.g. tokio::task::spawn_blocking), where its drop will then be called.
So like the Droption example but while making sure that the large object itself also does not need to be moved around.
(Assuming it's going to be called from multiple threads in the first place,) this code has a race condition, because checking the strong count in the if condition and dropping the Arc at the end of the function happen in sequence and not atomically.
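For reference, the function under discussion presumably looks roughly like this (a hypothetical reconstruction, not the original code; trash_tx is an assumed parameter for the channel to the trash thread):

use std::sync::Arc;
use std::sync::mpsc::Sender;

fn discard_arc<T>(val: Arc<T>, trash_tx: &Sender<Arc<T>>) {
    if Arc::strong_count(&val) == 1 {
        // We appear to hold the last reference: hand it to the trash thread
        // so the expensive drop happens there.
        let _ = trash_tx.send(val);
        return;
    }
    // Otherwise `val` is dropped here at the end of the function, which is
    // only cheap if it really wasn't the last reference -- that's the race.
}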
I'm not sure there's a way to avoid the race condition. If the value T were moved out of its Arc, the (unstable) Arc::into_inner could offer an approach without the race condition; unfortunately it is not yet in stable Rust.
I believe that, in the presence of weak pointers, it's also possible for the strong count to increase again after the if condition saw it being 1, which might result in unexpected behavior. So an approach that immediately and atomically deconstructs the Arc, as Arc::into_inner does, may be beneficial anyway.
I did mention the point about Weak in the post being replied to there, so I think they were assuming Weaks weren't involved. If there are only strong references and the count is one, is it possible for there to be a race condition there? The Arc is passed by value so I think not? At least not without other unsafe involved.
Ah, sorry for causing confusion about weak. I should have mentioned that I did not read all the discussion in this thread in detail; thanks for pointing out relevant information I missed.
The race condition is independent of Weak. Imagine two threads executing discard_arc on two clones of the same Arc in parallel (with no further references/clones existing). Then the expected behavior is that one of the threads will have the "last copy" of the Arc and thus send it to the trash thread. However, it's possible that
first, both threads check the strong_count, and both see 2
then both threads leave the if expression (whose condition evaluated to false), return from the function, and upon doing so drop their val: Arc<T> argument
since both threads drop their copy of the Arc, one of them will drop the actual last copy, and that thread will then drop the T itself instead of leaving that task to the trash thread.
That's an unfortunate scenario. It makes this code provide “best effort” background dropping with no real guarantees, which will be sufficient for some applications and unacceptable for others.
The simplest way to guarantee the drop is performed in the background thread is to send the Arc across the channel unconditionally. This will add significant communications overhead for some workloads, potentially negating the speedup gained from background dropping.
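In code, that just means dropping the count check from the reconstruction above (same assumed trash_tx parameter):

use std::sync::Arc;
use std::sync::mpsc::Sender;

fn discard_arc<T>(val: Arc<T>, trash_tx: &Sender<Arc<T>>) {
    // Unconditionally ship the Arc (a pointer-sized handle) to the trash
    // thread; whichever clone arrives last triggers the real drop over there.
    let _ = trash_tx.send(val);
}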
I missed the race condition that @steffahn pointed out, but otherwise I don't think Weaks pose too much of a problem here: if they are upgraded in a context where dropping the inner value would be problematic, the resulting Arc simply needs to be handed off to the TrashThread as well.
Generally, when I’ve used a scheme like this, it’s been in a situation where I had only one thread with real-time constraints (a graphics renderer, say) and all of the others provided non-real-time/blocking services for it, and only drops from the real-time thread need the extra attention provided here.
I ended up writing a little crate for this, backdrop. It can wrap any normal type T in a Backdrop<T, Strategy>, where Strategy is a compile-time-only marker type that decides what approach to take when dropping T.
Some examples (from running cargo run --example comparison --features=tokio, which repeatedly 'backdrops' a Box<[Box<str>; 5_000_000]> with various strategies):
none, took 99.785167ms
fake backdrop, took 91.171125ms
thread backdrop, took 184.708µs
trash thread backdrop, took 22.333µs
(single threaded) trash queue backdrop, took 21.542µs
(single threaded) trash queue backdrop (actually cleaning up later), took 87.6455ms
tokio task (multithread runner), took 33.875µs
tokio blocking task (multithread runner), took 55.875µs
tokio task (current thread runner), took 18.875µs
tokio blocking task (current thread runner), took 63µs
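To give an idea of the Strategy mechanism, this is roughly the shape of the pattern (an illustrative sketch; the crate's actual trait and type names may differ):

use std::marker::PhantomData;
use std::mem::ManuallyDrop;
use std::thread;

/// Decides how a value of type T gets disposed of.
trait BackdropStrategy<T> {
    fn execute(value: T);
}

/// Drop on the current thread, i.e. what would happen without the wrapper.
struct TrivialStrategy;
impl<T> BackdropStrategy<T> for TrivialStrategy {
    fn execute(value: T) {
        drop(value);
    }
}

/// Spawn a one-off thread that performs the drop.
struct ThreadStrategy;
impl<T: Send + 'static> BackdropStrategy<T> for ThreadStrategy {
    fn execute(value: T) {
        thread::spawn(move || drop(value));
    }
}

/// Wrapper that routes the drop of T through the chosen strategy.
struct Backdrop<T, S: BackdropStrategy<T>> {
    value: ManuallyDrop<T>,
    _strategy: PhantomData<S>,
}

impl<T, S: BackdropStrategy<T>> Backdrop<T, S> {
    fn new(value: T) -> Self {
        Backdrop { value: ManuallyDrop::new(value), _strategy: PhantomData }
    }
}

impl<T, S: BackdropStrategy<T>> Drop for Backdrop<T, S> {
    fn drop(&mut self) {
        // SAFETY: `value` is never used again after being taken out here.
        let value = unsafe { ManuallyDrop::take(&mut self.value) };
        S::execute(value);
    }
}

// Usage: let b: Backdrop<Vec<u8>, ThreadStrategy> = Backdrop::new(vec![0; 1_000_000]);
// drop(b); // the Vec is freed on a freshly spawned thread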
The one thing this crate does not handle is Arc. Constructing a Backdrop<Arc<T>> would mean the special code runs every time any of the Arc references goes out of scope. Constructing an Arc<Backdrop<T>> could work, but would require moving T on the stack, which might be impossible if it is large (or if it is a DST). Making an Arc<Backdrop<Box<T>>> works as expected, but incurs a double pointer indirection on every normal use.
So I'm working on an extra crate that adds a dedicated 'BackdropArc', probably a specialization of the Arc from triomphe.