I want to implement a sort of thread-local, global allocator (if that makes sense), where the index is the thread ID and 99% of the time the same thread accesses the same allocator. The only time I need to lock is when I free memory, and I only do that in bulk.
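To make the pattern concrete, here is a minimal sketch of what I mean (all names are mine, and the `Vec<u8>` stands in for a real allocator arena). Each thread hashes its `ThreadId` to a fixed slot, so in the common case a thread keeps re-locking the same uncontended mutex, and only the bulk free touches other threads' slots:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

const SLOTS: usize = 16;

struct ShardedPool {
    // One lock per shard; a placeholder Vec<u8> instead of a real arena.
    shards: Vec<Mutex<Vec<u8>>>,
}

impl ShardedPool {
    fn new() -> Self {
        ShardedPool {
            shards: (0..SLOTS).map(|_| Mutex::new(Vec::new())).collect(),
        }
    }

    // ThreadId has no stable integer form, so hash it to pick a slot.
    fn shard_index(&self) -> usize {
        let mut h = DefaultHasher::new();
        std::thread::current().id().hash(&mut h);
        (h.finish() as usize) % SLOTS
    }

    // Fast path: the same thread always hits the same shard, so the
    // lock is normally uncontended.
    fn alloc(&self, n: usize) {
        let mut arena = self.shards[self.shard_index()].lock().unwrap();
        arena.extend(std::iter::repeat(0u8).take(n));
    }

    // Bulk free: the only point where cross-thread locking happens.
    fn free_all(&self) {
        for shard in &self.shards {
            shard.lock().unwrap().clear();
        }
    }
}
```

This is just a sketch under my assumptions about the access pattern, not a production allocator.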
I wasn’t sure how fast or slow mutexes are when the same thread locks them over and over, so I wrote a small benchmark. I was hoping the overhead would be minimal.
Summing mutex-wrapped integers is roughly 32× slower than summing `Option`-wrapped integers. Honestly, the overhead is probably negligible in my case, since I don’t allocate that often, but I am still wondering if I can do better, especially because I have run into this pattern many times before.
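For reference, this is roughly the benchmark I mean (my reconstruction, not the exact code; timings will vary by machine and build profile, and a release build is needed for meaningful numbers):

```rust
use std::sync::Mutex;
use std::time::Instant;

// Sum a vector of Mutex<u64> — each element costs a lock/unlock.
fn sum_mutexes(v: &[Mutex<u64>]) -> u64 {
    v.iter().map(|m| *m.lock().unwrap()).sum()
}

// Sum a vector of Option<u64> — just a branch per element, no locking.
fn sum_options(v: &[Option<u64>]) -> u64 {
    v.iter().map(|o| o.unwrap()).sum()
}

fn main() {
    const N: u64 = 1_000_000;
    let mutexes: Vec<Mutex<u64>> = (0..N).map(Mutex::new).collect();
    let options: Vec<Option<u64>> = (0..N).map(Some).collect();

    let t = Instant::now();
    let a = sum_mutexes(&mutexes);
    let t_mutex = t.elapsed();

    let t = Instant::now();
    let b = sum_options(&options);
    let t_option = t.elapsed();

    assert_eq!(a, b);
    println!("mutex sum: {:?}, option sum: {:?}", t_mutex, t_option);
}
```

Note this measures the uncontended lock/unlock cost only; contention would make the gap much larger.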
Is there maybe another abstraction that would fit better than a mutex?
I am also not sure if a mutex is ever the right choice. Can’t you implement a mutex purely with atomics (CAS)?
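You can indeed build a lock from atomics alone — the classic CAS spinlock looks like this (minimal sketch, names are mine):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// A minimal spinlock built purely from a CAS loop.
struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    const fn new() -> Self {
        SpinLock { locked: AtomicBool::new(false) }
    }

    fn lock(&self) {
        // Try to flip false -> true; on failure, spin until it looks free.
        while self
            .locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
    }

    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}
```

The catch is contention: this version burns CPU spinning, whereas `std::sync::Mutex` parks the waiting thread via the OS. On the uncontended fast path, modern `std` mutexes are themselves essentially a single atomic CAS pair with no syscall, so a hand-rolled spinlock rarely wins for a pattern like yours where the same thread re-locks an uncontended lock.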