A high performance in-memory cache

Hey community,
Does the rust ecosystem already have something equivalent to Java's caffeine? In particular, I am looking for an in-memory cache with a TTL based eviction policy that has an automatic loader in case of a miss. Preferably the loader should run in its own thread pool.
I have searched the web but found nothing like it.

There are a few caching/memoization crates. You can expect most of them to be high-performance, because that's generally an easy case for Rust.

For loading on a thread pool I don't think you need to have any special support in the cache, as long as it's thread-safe and doing per-key locking. If you use a thread-pool to perform cacheable work, it should combine nicely.

1 Like

I have to admit that I am a little upset by the dogma that most code written in Rust is magically high performance. This is not only wrong, but it is also insulting by disregarding the effort it takes to design and implement such systems. It is disrespectful to users where it is implied that any problems they encounter must be their own incompetence because obviously it is a trivially easy task. In fact Rust makes design choices that can make it harder to write such code, e.g. a garbage collector can be a major benefit when writing a concurrent data structure.

@barkanido Unfortunately, of those crates only waitcache offers a good starting point. It correctly handles cache stampedes by combining DashMap with WaitCell to synchronously compute the value after the mapping was established. If not done so, e.g. as cached warns it does not, then this can cause outages as concurrent requests overload the backing resource on a cache miss. This is known as dog piling and has caused many large outages at scale. In your case, you could replace WaitCell with a Future to compute the value asynchronously.

For expiration, consider using tokio-timer for an O(1) hierarchical timer wheel. This implementation borrows tricks from Caffeine's, which is hierarchical with a fixed power-of-two sizing. Previously tokio used a hashed wheel with a configurable sizing, and was dismissive of these optimizations when discussing bringing them over. Shortly after, though, the author reconsidered and wrote an excellent implementation with a very nice API. This approach is elegant by offering fast, prompt expiration without degrading as the cache size increases. Most caches use a heap-like priority queue or rely on size eviction to lazily discard expired entries, which can significantly reduce the hit rate due to cache pollution.

Unfortunately there is no existing crate yet that brings everything together in a nice package. But I think the community does offer good primitives and libraries to draw from to write one yourself.

1 Like

I don't think it is fair to read that much into @kornel 's post – he did not assert that everything written in Rust is "magically" high-performance, only that it's generally a good fit for the use cases Rust was designed for. There's nothing insulting in that.

1 Like

I'm not sure there is such a dogma.

If I were to write an in-memory cache in Rust it would likely be as performant as if I'd written it in C, C++ or whatever. I'm pretty certain it would perform a lot better than if I wrote it in Python, Javascript and the like.

Of course my in-memory cache, in whatever language, would almost certainly perform terribly compared to a best in class implementation written by someone who knows all the issues, solutions and optimizations. As your post makes clear.


But I think the community does offer good primitives and libraries to draw from to write one yourself.

Certainly, I have excellent experience with Caffeine, and I consider it (or its semantics and ideas at least) as a reference implementation. Although cache eviction is a much harder problem and is currently out of my scope, solving it at the next phase might produce a useful library. Maybe I'll call it "Dopamin" :smiley:

In fact, Rust makes design choices that can make it harder to write such code.

I also agree here. These kind of data structures are inherently mutually accessed, and this takes quite some work to convince the compiler that you know what you are doing.

Thanks for the advice. I'll definitely look into this code.

1 Like