I have a struct which has a method .get(&self) which does some expensive computation to compute and return the output, but that output will not change on later re-invocations of this method. So ideally I want to cache the output and just read from the cache next time.
In order to cache the result in the struct I would need a mutable reference, i.e. fn get(&mut self) -> ..., but lets assume that is not an option for this use case.
An alternative is to use RwLock and Once inside the struct. That is doable but incurs all the RwLock overhead.
The other alternative is to use unsafe code below; I do not know if Once::call_once will make this code safe or not. Is this unsafe code safe? if not why not?
use std::sync::Once;
struct Foo {
token: String,
cache: String,
once: Once,
}
impl Foo {
fn new(token: String) -> Self {
Self {
token,
cache: String::default(),
once: Once::new(),
}
}
fn get(&self) -> &String {
self.once.call_once(|| {
// Do the expensive computation.
let token = self.token.clone() + &self.token;
// Cache the result.
unsafe {
let cache = &self.cache as *const String;
let cache = cache as *mut String;
*cache = token;
};
});
&self.cache
}
}
@Cerber-Ursi is there any alternative aside from using RwLock?
Also, seems like RwLock or Mutex implement this exact same functionality (i.e. immutable reference to mutable reference); How do they achieve this?
If you have a reference &T, then normally in Rust the compiler performs optimizations based on the knowledge that &T points to immutable data. Mutating that data, for example through an alias or by transmuting an &T into an &mut T, is considered undefined behavior. UnsafeCell<T> opts-out of the immutability guarantee for &T: a shared reference &UnsafeCell<T> may point to data that is being mutated. This is called “interior mutability”.
All other types that allow internal mutability, such as Cell<T> and RefCell<T>, internally use UnsafeCell to wrap their data.
UnsafeCell is unsafe to use anyway, the main reason why it isn’t Sync is so that when using it, you don’t accidentally mark a type as Sync when it shouldn’t be, similar to why raw pointers aren’t Sync either; if you ensure synchronization yourself (as e.g. in your example with the Once), you’re totally allowed to (unsafely) implement Sync for a type that’s using UnsafeCell internally; in fact that’s what things like Mutex do, too.
In fact, the existing (unstable) SyncOnceCell does exactly that: looking at its source you can see that it’s essentially just
@steffahnonce_cell::sync::OnceCell does the job but does it for sure have less overhead than a RwLock and std::sync::Once? The api looks like it has a lot more functionality which makes me worried about the overhead.
Also, no alternatives just using std which does not require a 3rd party crate?
once_cell is written by Rust team members and has been approved to be added to std. I appreciate concerns about depending on too many third-party crates, but I think this one should not be a problem.
If you want a nice read on the topic of caching values using only &self, last week's This Week in Rust featured this walkthrough that seems to address exactly your use case: Caches in Rust by @matklad
Practically, just use once_cell crate -- despite providing comprehensive API, it's rather lean at compile time and doesn't incur any extra runtime overhead. If you'd rather stick to std, the best thing is probably to work on bringing once_cell there!
If you want to do it yourself, the following minimal impl should work (absence of bugs is not guaranteed, it also a bit needlessly conservative wrt dropcheck):
use std::{cell::UnsafeCell, mem::MaybeUninit};
struct OnceCell<T> {
once: std::sync::Once,
cell: UnsafeCell<MaybeUninit<T>>,
}
impl<T> Default for OnceCell<T> {
fn default() -> Self {
Self {
once: std::sync::Once::new(),
cell: UnsafeCell::new(MaybeUninit::uninit()),
}
}
}
// NB: this does need `Send` bound.
unsafe impl<T: Sync + Send> Sync for OnceCell<T> {}
unsafe impl<T: Send> Send for OnceCell<T> {}
impl<T> OnceCell<T> {
pub fn get_or_init(&self, init: impl FnOnce() -> T) -> &T {
// Tricky: once is guaranteed to deadlock or panic on re-entrancy.
self.once.call_once(|| {
let val = init();
unsafe {
(*self.cell.get()).write(val);
}
});
unsafe { (*self.cell.get()).assume_init_ref() }
}
}
impl<T> Drop for OnceCell<T> {
fn drop(&mut self) {
if self.once.is_completed() {
unsafe { (*self.cell.get()).assume_init_drop() }
}
}
}