Announcing rcu-clean


#1

I’d like to announce rcu-clean, a crate that provides smart pointers that use read-copy-update (RCU) to allow modification of data while it is still borrowed. rcu-clean provides RCU versions of the standard Box, Rc, and Arc pointer types. These pointers (BoxRcu, RcRcu, and ArcRcu) should behave identically to their std peers for read access, but allow interior mutability without an embedded RefCell, Mutex, or RwLock. Read access is faster than with an embedded Mutex or RwLock, and reads can overlap with writes.

The “clean” in the name comes from the clean() method, which must be called in order to free older versions of the data. Alternatively, you can never call clean() and just leak a bit of memory, which is reasonable if your data is only written to once.

BTW, the other nice feature of rcu-clean (compared with std interior mutability) is that there is no need to call borrow(), lock(), or read() (why must there be three different methods for the same task?). Instead, you can just use standard Deref coercion to read your data (as with a plain Box, etc.).
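For comparison, here is a small std-only sketch of the three read methods mentioned above, next to a plain Box read through Deref. It does not use rcu-clean at all; the function names `std_reads` and `plain_read` are made up for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex, RwLock};

/// Reads the length of the same data through each std interior-mutability
/// wrapper. Every wrapper needs its own explicit access method.
fn std_reads() -> (usize, usize, usize) {
    let single = Rc::new(RefCell::new(vec![1, 2, 3]));
    let locked = Arc::new(Mutex::new(vec![1, 2, 3]));
    let shared = Arc::new(RwLock::new(vec![1, 2, 3]));

    let a = single.borrow().len(); // RefCell: borrow()
    let b = locked.lock().unwrap().len(); // Mutex: lock()
    let c = shared.read().unwrap().len(); // RwLock: read()
    (a, b, c)
}

/// A plain Box needs no access method: auto-deref finds Vec::len directly.
/// This is the read ergonomics the post says the RCU pointers aim to keep.
fn plain_read() -> usize {
    let boxed = Box::new(vec![1, 2, 3]);
    boxed.len()
}
```

Calling `std_reads()` and `plain_read()` yields the same length from every path; only the call sites differ.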


#2

One hazard with implicit Deref is that each deref may see different data, right? This is a point in favor of an explicit read() method, since it makes clear at which points you’re borrowing the data, and that each borrow may see a new update.


#3

Yes, that is a risk. If you need a coherent view of an object over several uses, you would do best to borrow a reference to it (or implement a larger method that takes &self). And if that is common for a given object, then rcu-clean is probably not the pointer for you.
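To make the hazard and the workaround concrete, here is a minimal single-threaded sketch. `LeakyRcu` and its `update` method are hypothetical names invented for this example; this is not rcu-clean's implementation or API. The toy simply leaks every old version (the "never call clean" strategy), which keeps it free of unsafe code.

```rust
use std::cell::Cell;
use std::ops::Deref;

/// Hypothetical toy pointer: each update copies the value, modifies the copy,
/// and publishes it. Old versions are leaked, so references stay valid.
struct LeakyRcu<T: 'static> {
    current: Cell<&'static T>,
}

impl<T: Clone + 'static> LeakyRcu<T> {
    fn new(value: T) -> Self {
        let leaked: &'static T = Box::leak(Box::new(value));
        LeakyRcu { current: Cell::new(leaked) }
    }

    /// Read-copy-update: clone the current version, mutate the clone,
    /// then make the clone the current version.
    fn update(&self, f: impl FnOnce(&mut T)) {
        let mut copy = T::clone(self.current.get());
        f(&mut copy);
        let leaked: &'static T = Box::leak(Box::new(copy));
        self.current.set(leaked);
    }
}

impl<T: 'static> Deref for LeakyRcu<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // Every deref looks up whichever version is current right now.
        self.current.get()
    }
}

/// Returns (length via held reference, length before update, length after).
fn deref_hazard_demo() -> (usize, usize, usize) {
    let p = LeakyRcu::new(vec![1, 2, 3]);
    let snapshot: &Vec<i32> = &p; // one deref, held for a coherent view
    let before = p.len(); // implicit deref
    p.update(|v| v.push(4)); // allowed while `snapshot` is still borrowed
    let after = p.len(); // a fresh implicit deref sees the new version
    (snapshot.len(), before, after)
}
```

In this toy, the held reference keeps reporting the version it was taken from, while the second implicit deref reports the updated one. That is the trade-off discussed above: borrow once when you need consistency across several uses.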


#4

I haven’t read through the code properly yet, but do I understand correctly that with ArcRcu I somehow need to know when it is useful to call .clean()? If I call it too soon, the old value will not be freed yet, and the method won’t tell me whether it cleaned anything or not.


#5

It is always safe to call .clean(), but not always helpful. You need to call .clean() on all copies of the pointer in order for old data to actually be freed. I imagine this may be tedious. I expect that for simple data structures (e.g. one pointer per thread) you could simply always call clean in some slow path.

To be honest, I’m not sure how these pointers will work out in practice. In my one project that currently uses them, I’m simply leaking one copy of each data structure, since I only mutate once and don’t bother calling clean. The major benefit to me is that I can avoid calling .borrow() countless times in template code.