Aligning a u8 array to 16 bytes

I'd like to create arrays of u8 a) on stack and b) statically using include_bytes! such that the arrays are aligned to 16 bytes (in order to have nice alignment for processing as u8x16 with explicit usage of aligned loads via intrinsics).

Is there a way to make the compiler guarantee such alignment?

If not, am I correct to assume that a Vec<u8> whose length is equal to or greater than 16 has its buffer aligned to at least 16 when jemalloc is in use? (I.e. can I get the required alignment by using the heap instead?)

From what I understand, Vec<T> uses mem::align_of::<T>() when (re)allocating memory, so the only alignment guaranteed for a Vec<u8> buffer is mem::align_of::<u8>(), which is 1. So if you want a guaranteed alignment that is a multiple of 16, you'll need to allocate a Vec<T> where mem::align_of::<T>() is 16.
I think you could use Vec::with_capacity(), then as_mut_ptr(), mem::forget() and from_raw_parts() to get a Vec<u8> whose buffer was allocated with 16-byte alignment, and only use it as &mut [u8] to avoid reallocations of the buffer, but that's really unsafe.
You could also create your own container with aligned buffers, using RawVec<T> with a 16-byte-aligned T and a wrapper to allow safe borrowing as &[u8] and &mut [u8].
The penalty for unaligned loads doesn't seem very big, though, so it seems much safer to just use an unaligned Vec<u8>.
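
To make the "Vec<T> where align_of::<T>() is 16" idea concrete, here is a minimal sketch. It assumes a toolchain where the #[repr(align(16))] attribute is available; the names Chunk and AlignedBytes are made up for illustration and are not from any library.

```rust
use std::slice;

/// Hypothetical 16-byte chunk; `align(16)` makes `align_of::<Chunk>()` equal 16.
#[repr(C, align(16))]
#[derive(Clone, Copy)]
struct Chunk([u8; 16]);

/// Hypothetical wrapper: a heap buffer whose first byte is 16-byte aligned.
struct AlignedBytes {
    chunks: Vec<Chunk>,
    len: usize, // length in bytes requested by the caller
}

impl AlignedBytes {
    /// Allocates a zero-filled buffer that exposes exactly `len` bytes.
    fn zeroed(len: usize) -> Self {
        let n = (len + 15) / 16; // round up to whole chunks
        AlignedBytes { chunks: vec![Chunk([0; 16]); n], len }
    }

    fn as_bytes(&self) -> &[u8] {
        // SAFETY: `Chunk` is 16 initialized bytes with no padding, so the Vec
        // stores `16 * chunks.len()` contiguous bytes, and `len` never exceeds that.
        unsafe { slice::from_raw_parts(self.chunks.as_ptr() as *const u8, self.len) }
    }

    fn as_bytes_mut(&mut self) -> &mut [u8] {
        // SAFETY: same layout argument as `as_bytes`; `&mut self` gives exclusive access.
        unsafe { slice::from_raw_parts_mut(self.chunks.as_mut_ptr() as *mut u8, self.len) }
    }
}
```

Because the Vec is always allocated and freed as Vec<Chunk>, the allocator sees the same layout both times, which avoids the mem::forget()/from_raw_parts() dance. Callers only ever see &[u8] and &mut [u8], so the buffer cannot be reallocated behind their backs.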

I've seen people use a hidden _alignment: [T; 0] field to guarantee T-alignment of structs. Something like this:

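use std::mem;

// The zero-length [u16; 0] field takes up no space, but it raises the
// alignment of the whole struct to that of u16.
struct _16BitAlignedU8Array {
    x: [u8; 3],
    _alignment: [u16; 0],
}

fn main() {
    println!("{}", mem::size_of::<_16BitAlignedU8Array>());
    println!("{}", mem::align_of::<_16BitAlignedU8Array>());
}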

$ ./foo
4
2

Would that work in your case?

Yes, but when the underlying allocator is jemalloc, which, AFAIK, allocates memory in power-of-two chunks, can the actual alignment of the buffer be less than 16 if the length of the buffer is equal to or greater than 16?

Thank you, but I'm looking for alignment to 16 bytes. It seems that u128 doesn't exist and I don't want the types to be nightly-only, so u8x16 doesn't work. (When using SIMD optimizations, nightly Rust needs to be used, but I'd rather not make callers use nightly-only types to allow compilation on stable without SIMD.)

That's right, see this source link, you can see the conditionals there. Also, jemalloc is not being used everywhere and Rust may even move entirely off it.

Not that it helps in the short term, but there is an RFC to add alignment support to Rust. Recently someone enquired about working on it, so perhaps it will appear sometime in the future.
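
For reference, here is roughly what this looks like on a toolchain where that attribute is available as #[repr(align(16))]. This is only a sketch under that assumption: Align16, TABLE and the helper functions are hypothetical names, and a byte literal stands in for include_bytes! so the example does not depend on a file.

```rust
/// Hypothetical wrapper that raises the alignment of its contents to 16.
#[repr(C, align(16))]
struct Align16<T>(T);

// Static data. A byte literal stands in for `include_bytes!` here; that macro
// yields a `&[u8; N]`, which could be dereferenced into the wrapper the same way.
static TABLE: Align16<[u8; 20]> = Align16(*b"0123456789abcdefghij");

/// Borrows the aligned static as a plain byte slice.
fn static_table() -> &'static [u8] {
    &TABLE.0
}

/// Creates a 64-byte buffer on the stack and reports whether it starts
/// on a 16-byte boundary.
fn stack_buffer_is_aligned() -> bool {
    let buf = Align16([0u8; 64]);
    buf.0.as_ptr() as usize % 16 == 0
}
```

The wrapper type stays private to the implementation; callers can be handed an ordinary &[u8].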

Thank you. That should be good enough for Rust standalone programs on Linux x86_64 and Rust-in-Firefox on the platforms considered tier-1 for Firefox.

I can wait for Rust to develop explicit alignment control for other environments until later.

Thanks. Unfortunately, that seems to require a custom struct instead of allowing the programmer to request that an array of u8 start at a particular alignment on the stack.

This might be a dirty hack, and it results in a slice rather than an array: allocate 15 extra bytes and fetch the raw pointer to the first element. Then compute how far the pointer has to be advanced to be correctly aligned. Use that amount as the starting offset for sub-slicing and you'll get a correctly aligned slice. (I don't know if this can be coerced into an array somehow.)

I think this is ugly as hell and uses unnecessary memory, but it should compile to a near-NOOP and uses only safe code (AFAIK).
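
A minimal sketch of that over-allocate-and-sub-slice approach, using only safe code. The names ALIGN, aligned_subslice and demo are made up for illustration.

```rust
const ALIGN: usize = 16;

/// Returns `len` bytes of `backing` that start on a 16-byte boundary.
/// `backing` must hold at least `len + ALIGN - 1` bytes; otherwise the
/// slicing below panics.
fn aligned_subslice(backing: &mut [u8], len: usize) -> &mut [u8] {
    let addr = backing.as_ptr() as usize;
    // Number of bytes to skip (0..=15) so the start address is a multiple of ALIGN.
    let offset = (ALIGN - addr % ALIGN) % ALIGN;
    &mut backing[offset..offset + len]
}

/// Usage: carve a 64-byte aligned window out of a stack array that is
/// 15 bytes larger, and return the window's address modulo ALIGN.
fn demo() -> usize {
    let mut backing = [0u8; 64 + ALIGN - 1];
    let buf = aligned_subslice(&mut backing, 64);
    buf.as_ptr() as usize % ALIGN
}
```

The cost is up to 15 wasted bytes per buffer plus one address computation at runtime.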

I'd like to see an alignment-attribute as well. Maybe you want to dive into "vulkano" - it interacts with a GPU and as such has to ensure correct memory alignment for allocation. Maybe tomaka found a solution for this.
