What is the global allocator story in multi-threaded Rust?

On x86_64 Linux with the latest stable Rust, when running multiple threads:

  1. Does each thread have its own global allocator?
  2. Is there one shared global allocator?

Context: consider an application with lots of small Box<T> values, and therefore lots of allocating and deallocating. On a single thread, the allocator might not be a problem, but if there are 50+ threads all locking and unlocking the same allocator, the global allocator itself might become a bottleneck.

(Modern) allocators surely have a lot of complexity in them to make them performant, and I certainly have not spent enough time studying them, so take the following with a grain of salt:

I don't think a truly thread-local allocator makes a lot of sense, as communication between threads would then be limited to data that is not heap-allocated.
Imagine thread A handed some Box<T> to thread B: how should thread B free the memory if it is managed by the allocator of thread A?
Surely there could be an allocator implementation that manages multiple (mostly?) separate thread-local heaps, but I think those heaps would at least have to know each other's implementation details (so no mixing of different allocators on different threads).

Well, allocations are expected to be expensive, so you should usually strive to reduce the number of (de)allocations instead of worrying about whether locking is fast enough deep inside the allocator.
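
To make "reduce the number of (de)allocations" a bit more concrete, here is a minimal sketch. The Particle type and both function names are made up for illustration. The boxed version asks the allocator for one block per element (plus the vector's own buffer), while the unboxed version reserves a single buffer up front and stores the elements inline.

```rust
// Hypothetical element type, only here for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Particle {
    x: f32,
    y: f32,
}

// One heap allocation per element, plus one for the Vec of pointers.
fn boxed(n: usize) -> Vec<Box<Particle>> {
    (0..n)
        .map(|i| Box::new(Particle { x: i as f32, y: 0.0 }))
        .collect()
}

// A single up-front allocation that holds every element inline.
fn unboxed(n: usize) -> Vec<Particle> {
    let mut v: Vec<Particle> = Vec::with_capacity(n);
    let start = v.as_ptr();
    for i in 0..n {
        v.push(Particle { x: i as f32, y: 0.0 });
    }
    // Pushing within the reserved capacity does not reallocate,
    // so the buffer address is unchanged.
    debug_assert_eq!(start, v.as_ptr());
    v
}

fn main() {
    let a = boxed(1_000);
    let b = unboxed(1_000);
    // Same data either way; only the number of allocator calls differs.
    assert!(a.iter().zip(&b).all(|(b, p)| **b == *p));
    println!("{} boxed vs {} inline elements", a.len(), b.len());
}
```

Whether this kind of restructuring is possible depends on the application, of course; it only applies when the values do not each need their own stable heap address.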


No to (1), because then it wouldn't be global. And Box<T> is Send (assuming T is Send), so the global allocator has to be actually global.
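
A small sketch of what that means in practice, using only the standard library. It installs a wrapper around std::alloc::System as the one process-wide allocator, allocates a Box on one thread, and drops it on another. The Counting type, the MARKER_SIZE constant, and the function name are hypothetical; the marker size is just a trick to tell our own allocation apart from whatever the runtime allocates in the background.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Hypothetical, unusual size so our allocation stands out from runtime noise.
const MARKER_SIZE: usize = 12_345;

static MARKER_ALLOCS: AtomicUsize = AtomicUsize::new(0);
static MARKER_FREES: AtomicUsize = AtomicUsize::new(0);

// Thin wrapper around the system allocator that counts marker-sized requests.
struct Counting;

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if layout.size() == MARKER_SIZE {
            MARKER_ALLOCS.fetch_add(1, Ordering::Relaxed);
        }
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        if layout.size() == MARKER_SIZE {
            MARKER_FREES.fetch_add(1, Ordering::Relaxed);
        }
        unsafe { System.dealloc(ptr, layout) }
    }
}

// One instance for the whole process; every thread goes through it.
#[global_allocator]
static GLOBAL: Counting = Counting;

// Returns how many marker-sized (allocations, frees) this call caused.
fn alloc_on_a_free_on_b() -> (usize, usize) {
    let allocs_before = MARKER_ALLOCS.load(Ordering::Relaxed);
    let frees_before = MARKER_FREES.load(Ordering::Relaxed);

    // "Thread A" allocates the Box and hands it back.
    let boxed: Box<[u8; MARKER_SIZE]> =
        thread::spawn(|| Box::new([0u8; MARKER_SIZE])).join().unwrap();

    // "Thread B" takes ownership and frees it.
    thread::spawn(move || drop(boxed)).join().unwrap();

    (
        MARKER_ALLOCS.load(Ordering::Relaxed) - allocs_before,
        MARKER_FREES.load(Ordering::Relaxed) - frees_before,
    )
}

fn main() {
    // Both the allocation and the free were seen by the same allocator instance.
    assert_eq!(alloc_on_a_free_on_b(), (1, 1));
    println!("allocated on one thread, freed on another, one allocator");
}
```

Note that "one global allocator" only means one entry point. It says nothing about whether the implementation behind it takes a single lock.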

I suggest reading up on how modern allocators work. Here's one that I've seen mentioned a bunch: microsoft/mimalloc on GitHub, a compact general-purpose allocator with excellent performance (which you can use from Rust via the MiMalloc crate on Lib.rs).

From its README:

Not only do we shard the free list per mimalloc page, but for each page we have multiple free lists. In particular, there is one list for thread-local free operations, and another one for concurrent free operations. Free-ing from another thread can now be a single CAS without needing sophisticated coordination between threads.
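
To illustrate the idea in that quote, here is a toy sketch. It is not mimalloc's code: the Page type, its methods, and the use of block indices instead of raw pointers are all assumptions made to keep the example safe and short. Other threads push freed blocks onto a shared list with a compare-and-swap (one CAS when uncontended, retried on contention), and the owning thread later takes the whole list with a single atomic swap and treats it as its private free list.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Sentinel meaning "end of list".
const NIL: usize = usize::MAX;

// Hypothetical, heavily simplified "page": blocks are identified by index.
struct Page {
    // next[i] is the link of block i while it sits on the remote free list.
    next: Vec<AtomicUsize>,
    // Head of the list that non-owning threads push freed blocks onto.
    thread_free: AtomicUsize,
}

impl Page {
    fn new(blocks: usize) -> Self {
        Page {
            next: (0..blocks).map(|_| AtomicUsize::new(NIL)).collect(),
            thread_free: AtomicUsize::new(NIL),
        }
    }

    // Free from a non-owning thread: link the block in front of the current
    // head, then publish it with a CAS. The loop only retries on contention.
    fn free_remote(&self, block: usize) {
        let mut head = self.thread_free.load(Ordering::Relaxed);
        loop {
            self.next[block].store(head, Ordering::Relaxed);
            match self.thread_free.compare_exchange_weak(
                head,
                block,
                Ordering::Release,
                Ordering::Relaxed,
            ) {
                Ok(_) => return,
                Err(current) => head = current,
            }
        }
    }

    // Owner side: detach the entire remote list in one atomic swap and walk it.
    // Because the owner never pops single nodes, there is no ABA problem here.
    fn collect_remote(&self) -> Vec<usize> {
        let mut cur = self.thread_free.swap(NIL, Ordering::Acquire);
        let mut local_free = Vec::new();
        while cur != NIL {
            local_free.push(cur);
            cur = self.next[cur].load(Ordering::Relaxed);
        }
        local_free
    }
}

// Several threads each free a disjoint range of blocks; the owner collects them.
fn demo(threads: usize, per_thread: usize) -> Vec<usize> {
    let page = Page::new(threads * per_thread);
    thread::scope(|s| {
        for t in 0..threads {
            let page = &page;
            s.spawn(move || {
                for i in 0..per_thread {
                    page.free_remote(t * per_thread + i);
                }
            });
        }
    });
    let mut local_free = page.collect_remote();
    local_free.sort_unstable();
    local_free
}

fn main() {
    let freed = demo(4, 100);
    // Every block freed by another thread ends up on the owner's list exactly once.
    assert_eq!(freed, (0..400).collect::<Vec<_>>());
    println!("collected {} remotely freed blocks", freed.len());
}
```

The point of the split is that the owner's own alloc/free path never touches an atomic, and cross-thread frees never take a lock.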
