I am writing a toy ECS engine and want to prevent false sharing of CPU cache lines between my components during parallel iteration. I use chunks of size 64 so that different chunks land in different cache lines.
However, this doesn't work if the buffer's start isn't aligned to a cache-line boundary, so I want every allocation and reallocation of my vecs to be aligned to max(align_of::<T>(), 64). I am currently thinking about writing a decorator over the global allocator and using vectors that are generic over that allocator.
Is there some other way, e.g. some way which would be stabilized sooner?
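For reference, the decorator idea can be sketched on stable Rust if it is applied to the whole program rather than to individual vectors. This is only a sketch under assumptions: `CacheAligned` and `CACHE_LINE` are made-up names, 64 bytes is assumed to be the cache line size, and every allocation in the process gets over-aligned, which wastes memory on small allocations.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Assumed cache line size; 64 bytes is common but not universal.
const CACHE_LINE: usize = 64;

/// Hypothetical decorator over the system allocator that raises every
/// requested alignment to at least CACHE_LINE.
struct CacheAligned;

impl CacheAligned {
    /// Same size as `layout`, alignment max(layout.align(), CACHE_LINE).
    fn widen(layout: Layout) -> Option<Layout> {
        layout.align_to(CACHE_LINE).ok()
    }
}

unsafe impl GlobalAlloc for CacheAligned {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        match Self::widen(layout) {
            Some(wide) => unsafe { System.alloc(wide) },
            None => std::ptr::null_mut(), // report failure instead of panicking
        }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // The block was allocated with the widened layout, so free it with the same one.
        if let Some(wide) = Self::widen(layout) {
            unsafe { System.dealloc(ptr, wide) }
        }
    }

    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        match Self::widen(layout) {
            Some(wide) => unsafe { System.realloc(ptr, wide, new_size) },
            None => std::ptr::null_mut(),
        }
    }
}

// Registers the decorator for every heap allocation in the program.
#[global_allocator]
static GLOBAL: CacheAligned = CacheAligned;
```

With this in place, an ordinary `Vec<T>` keeps a 64-byte-aligned buffer across reallocations, at the cost of applying the same policy to every other allocation too.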
I would expect parallelization to pay off when you have lots of instances, but then couldn't you use bigger chunks so that different threads always access instances that are far apart?
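A rough sketch of what I mean by bigger chunks, using only the standard library. The function name `scale_in_parallel` and the workload (multiplying floats) are made up for illustration. Each thread gets one large contiguous block, so only the cache lines that straddle a block boundary can be shared between two threads.

```rust
use std::thread;

/// Hypothetical helper: multiplies every element by `factor`, handing each
/// thread one large contiguous block of the slice.
fn scale_in_parallel(data: &mut [f32], factor: f32, threads: usize) {
    if data.is_empty() {
        return;
    }
    let threads = threads.max(1);
    // Ceiling division, so at most `threads` blocks are produced.
    let block = (data.len() + threads - 1) / threads;
    thread::scope(|s| {
        for part in data.chunks_mut(block) {
            // Each block is disjoint, so the borrow checker accepts the split.
            s.spawn(move || {
                for x in part {
                    *x *= factor;
                }
            });
        }
    });
}
```

With blocks this large, at most one cache line per boundary is contended, regardless of how the buffer itself is aligned.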
That said, if you just want an overaligned Vec you could just make your own.
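A minimal sketch of the "make your own" route, under several simplifying assumptions: the buffer has a fixed length, `T` is `Copy` (so no element destructors need to run), zero-sized buffers are rejected, and 64 bytes is taken as the cache line size. `AlignedBuf` is a made-up name. Dereferencing to a slice gives you most of the read and write methods without reimplementing them.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ops::{Deref, DerefMut};
use std::ptr::NonNull;

/// Assumed cache line size.
const CACHE_LINE: usize = 64;

/// Hypothetical fixed-length buffer whose storage starts on a cache line boundary.
struct AlignedBuf<T: Copy> {
    ptr: NonNull<T>,
    len: usize,
}

impl<T: Copy> AlignedBuf<T> {
    /// Layout for `len` elements, aligned to max(align_of::<T>(), CACHE_LINE).
    fn layout(len: usize) -> Layout {
        Layout::array::<T>(len)
            .and_then(|l| l.align_to(CACHE_LINE))
            .expect("layout overflow")
    }

    /// Allocates `len` elements and initializes each one to `value`.
    fn filled(value: T, len: usize) -> Self {
        let layout = Self::layout(len);
        assert!(layout.size() > 0, "zero-sized buffers are not handled in this sketch");
        let raw = unsafe { alloc(layout) } as *mut T;
        let ptr = NonNull::new(raw).unwrap_or_else(|| handle_alloc_error(layout));
        for i in 0..len {
            // Write without reading the uninitialized memory first.
            unsafe { ptr.as_ptr().add(i).write(value) };
        }
        Self { ptr, len }
    }
}

impl<T: Copy> Deref for AlignedBuf<T> {
    type Target = [T];
    fn deref(&self) -> &[T] {
        unsafe { std::slice::from_raw_parts(self.ptr.as_ptr(), self.len) }
    }
}

impl<T: Copy> DerefMut for AlignedBuf<T> {
    fn deref_mut(&mut self) -> &mut [T] {
        unsafe { std::slice::from_raw_parts_mut(self.ptr.as_ptr(), self.len) }
    }
}

impl<T: Copy> Drop for AlignedBuf<T> {
    fn drop(&mut self) {
        // Must free with the same layout that was used to allocate.
        unsafe { dealloc(self.ptr.as_ptr() as *mut u8, Self::layout(self.len)) }
    }
}
```

Growth (`push`, `reserve`), non-`Copy` elements, and `Send`/`Sync` are deliberately left out; those are the parts that make a full replacement for `Vec` tedious.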
It's pretty annoying to make an overaligned vector, so I'm curious whether there is a shortcut: Vec has a lot of trait implementations and methods. There are other applications beyond the one @AngelicosPhosphoros mentioned, e.g. better performance for vectorized code, assembly that uses aligned loads, avoiding page faults in code that reads past the end of an array (I know this is illegal in Rust, but it is fine in assembly that I might call from a Rust program), potential FFI requirements, etc.
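One shortcut that keeps the regular `Vec` API, sketched here with a made-up `Chunk` type: `Vec<T>` allocates its buffer with `align_of::<T>()`, so storing elements inside an over-aligned wrapper gives an over-aligned buffer. The catch is that the vector is then indexed by chunk rather than by element, and the wrapper's size is padded up to a multiple of 64 bytes.

```rust
/// Hypothetical wrapper: N elements of T, aligned (and padded) to 64 bytes.
#[repr(align(64))]
#[derive(Clone, Copy, Debug)]
struct Chunk<T, const N: usize>([T; N]);

/// Builds `n` zeroed chunks of 16 f32 values, i.e. exactly 64 bytes each.
fn make_chunks(n: usize) -> Vec<Chunk<f32, 16>> {
    vec![Chunk([0.0; 16]); n]
}

/// Sums all values by walking the chunks in order.
fn sum_all(chunks: &[Chunk<f32, 16>]) -> f32 {
    chunks.iter().flat_map(|c| c.0.iter()).sum()
}
```

Because the alignment is part of the element type, it survives every reallocation the vector performs, with no custom allocator involved.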
If you're ok with a best-effort solution instead of a guaranteed one, you could try using Vec::with_capacity to force a large allocation; I suspect that most allocators will page-align sufficiently large requests.
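If you go the best-effort route, it is worth measuring what your allocator actually returns rather than assuming it. glibc's malloc, for instance, places a small header in front of mmap-backed blocks, so the pointer you get back is offset from the page boundary. The helper names below are made up; the sketch only reports the largest power of two that divides the buffer address.

```rust
/// Largest power of two dividing the address (0 for a null pointer).
fn pointer_alignment<T>(ptr: *const T) -> usize {
    let addr = ptr as usize;
    if addr == 0 {
        0
    } else {
        1usize << addr.trailing_zeros()
    }
}

/// Allocates a byte vector with the given capacity and reports how well
/// its buffer happens to be aligned on this allocator.
fn large_vec_alignment(bytes: usize) -> usize {
    let v: Vec<u8> = Vec::with_capacity(bytes);
    pointer_alignment(v.as_ptr())
}
```

Calling `large_vec_alignment(1 << 20)` a few times on the target platform shows whether large requests really come back cache-line or page aligned there; the result is an observation about one allocator, not a guarantee.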
Alternatively, if you're okay with less established libraries, you can use the aligned-vec crate. I discovered it after writing a similar library myself and the code looks solid, definitely usable for a toy project.