There’s been an ongoing debate in the Go world for a while about supporting “shardable” values. The fundamental observation here is that some data structures can benefit enormously from having core-local state. One example of this is a better RwLock that keeps one reader lock for every core, where a reader takes the lock for the core it’s running on (which should basically never be contended), and a writer takes all the locks. This significantly increases reader scalability at some cost to write lock acquisition time. Distributed counters are another example where significant multi-core performance gains can be achieved.
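To make the RwLock idea concrete, here’s a rough sketch of what such a lock could look like. This is not an existing crate’s API: ShardedRwLock, NEXT_SLOT, and SLOT are names made up for illustration. Since std has no way to ask for the current core, the sketch assumes a per-thread slot number as a stand-in for a core id.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard};

// ASSUMPTION: a stand-in for a real core id. Each thread is handed a fixed
// slot number the first time it asks, instead of the core it is running on.
static NEXT_SLOT: AtomicUsize = AtomicUsize::new(0);
thread_local! {
    static SLOT: usize = NEXT_SLOT.fetch_add(1, Ordering::Relaxed);
}

/// Hypothetical sharded reader-writer lock: one inner lock per shard.
pub struct ShardedRwLock {
    shards: Vec<RwLock<()>>,
}

impl ShardedRwLock {
    pub fn new(shards: usize) -> Self {
        assert!(shards > 0, "need at least one shard");
        ShardedRwLock {
            shards: (0..shards).map(|_| RwLock::new(())).collect(),
        }
    }

    /// A reader takes only the shard that belongs to its slot.
    pub fn read(&self) -> RwLockReadGuard<'_, ()> {
        let i = SLOT.with(|s| *s) % self.shards.len();
        self.shards[i].read().unwrap()
    }

    /// A writer takes every shard, always in index order, so that two
    /// writers cannot deadlock against each other.
    pub fn write(&self) -> Vec<RwLockWriteGuard<'_, ()>> {
        self.shards.iter().map(|s| s.write().unwrap()).collect()
    }
}
```

A real implementation would probably also pad each shard to its own cache line to avoid false sharing, and would guard actual data rather than `()`.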
The original petition to the Go authors was for an easy way to tell which core a thread is running on. This information can be retrieved through the CPUID or RDTSCP instructions, but some kind of indirection that allows getting it on all targets would be nice. The Go developers did not want to expose this kind of information through their runtime (see the first post in the linked proposal), and instead want to expose a notion of a sharded value. I won’t go into the details of the design here (see the link), but I think Rust could benefit from having a similar kind of primitive, probably in the form of thread::core_id() (with a corresponding thread::max_core_id()). We could then build an external crate on top of that which provides a sharded value similar to the proposed Go one.
I’d be curious to hear whether such a primitive would be useful to others with performance-critical applications?