Alignment of Vec data


#1

I’ve recently been playing with structs whose layout isn’t known until runtime, GLSL uniform blocks to be exact. When using them, the OpenGL driver decides which layout is performant/possible for the hardware, and the API has means for querying field names and the associated field offsets and other related info.

So if I want to have a Vec of these dynamic structs, I can’t just have Vec<SomeDynamicType>. A Vec<u8> as a storage buffer would seem like a nice base to build on, because offset calculations would be easy to do, but I’m not sure if the vector’s contents are “safely” aligned. I dug into the Vec allocation code a bit, and it seemed like it calls (at least when using jemalloc) an allocation function that guarantees only that the returned pointer is aligned to the requested alignment (which in the case of Vec is derived from the type parameter, so an alignment of 1 in the case of u8).

Should I maybe then use Vec<u32> or Vec<u64> or perhaps Vec<usize> to make sure the buffer gets allocated with a safe alignment? (I’m assuming here that the field offsets returned by OpenGL are properly aligned.) Or maybe skipping Vec altogether in this case would be better?
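To make the question concrete, here is a minimal sketch of the Vec<u64> variant. It assumes only that a Vec<T> buffer is aligned to at least align_of::<T>(); the helper names is_aligned_for and alloc_u64_backed are made up for illustration.

```rust
use std::mem::align_of;

/// Returns true if `ptr` satisfies the alignment requirement of `T`.
/// (Hypothetical helper, not part of any library.)
fn is_aligned_for<T>(ptr: *const u8) -> bool {
    (ptr as usize) % align_of::<T>() == 0
}

/// Allocates a zeroed Vec<u64> with room for at least `bytes` bytes,
/// rounding the element count up to the next whole u64.
fn alloc_u64_backed(bytes: usize) -> Vec<u64> {
    vec![0u64; (bytes + 7) / 8]
}
```

With this, a 20-byte struct ends up in three u64 elements, and the start of the buffer is aligned for u64 (and therefore for every smaller power-of-two alignment as well).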

Edit: I was sloppy and only now noticed that type parameters within <X> aren’t shown properly unless the text is within backticks!


#2

The usual trick for forcing a higher alignment doesn’t work in the case of Vec as the higher alignment also forces the size to be at least the alignment: https://is.gd/xqV8Eh (It’s also not dynamic, I just realized that was a requirement.)

An RFC was merged for adding an alignment attribute for structs but there hasn’t been any movement as yet. I don’t know why. (Also not dynamic, but still good to note.)

When custom allocators are implemented and support for them is added to Vec, you will be able to force the alignment that way, but that’s currently not possible.

So, we fall back to the oldest trick in the book for getting an aligned pointer: overallocate, and find a starting point in the allocation which meets our alignment requirement.

I started implementing something like this in this abandoned experiment to create a cache-aligned double-buffer (which I may or may not return to someday), but it’s a little more complicated because I try to use one allocation for two buffers; the cache-alignment is to eliminate false-sharing so the two can be accessed concurrently with no thrashing.

I do the allocation manually but you can (with relative ease) create a wrapper for Vec which will give you a byte-slice aligned to whatever boundary you desire. I won’t implement it for you, but as a guide, you can use the implementation of the system allocator for Windows (for Rust), which does this manual alignment as well. Here’s a breakdown of the steps:

This implementation stores the original pointer in the unused portion of the allocation. You don’t need to do that as the Vec will keep track of that pointer. Since you’re only looking for byte vectors, you don’t have to worry about Vec trying to drop values in those unused bytes.
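The overallocate-and-offset approach could look roughly like the following. This is only a sketch under my own assumptions: the type name AlignedBytes and its methods are invented here, the buffer is never resized after creation, and align must be a power of two.

```rust
/// A byte buffer whose usable region starts at an address that is a
/// multiple of `align`. (Hypothetical wrapper for illustration.)
struct AlignedBytes {
    storage: Vec<u8>, // over-allocated backing store, never resized
    offset: usize,    // index of the first aligned byte in `storage`
    len: usize,       // number of usable bytes after `offset`
}

impl AlignedBytes {
    fn new(len: usize, align: usize) -> AlignedBytes {
        assert!(align.is_power_of_two());
        // `align - 1` spare bytes guarantee that an aligned address
        // with `len` bytes after it exists inside the allocation.
        let storage = vec![0u8; len + align - 1];
        let addr = storage.as_ptr() as usize;
        // Distance from the start of the allocation to the next aligned address.
        let offset = (align - addr % align) % align;
        AlignedBytes { storage, offset, len }
    }

    fn as_slice(&self) -> &[u8] {
        &self.storage[self.offset..self.offset + self.len]
    }

    fn as_mut_slice(&mut self) -> &mut [u8] {
        &mut self.storage[self.offset..self.offset + self.len]
    }
}
```

Moving an AlignedBytes value does not move the heap allocation, so the stored offset stays valid. Growing the Vec would invalidate it, which is why the sketch exposes no way to resize.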

If you’re okay with nightly, you can use the alloc::heap API which will do this alignment for you, then you can just construct your vec with Vec::from_raw_parts(). Since deallocation goes through this API, you can just let Vec handle the rest. However, reallocating needs to be done manually as Vec won’t provide the proper alignment when calling alloc::heap::reallocate(). If you never reallocate, you don’t have to worry about this.
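For comparison, a sketch of the allocator-API route. Note the assumptions: it uses the std::alloc module from current stable Rust rather than the nightly alloc::heap API named above, and it keeps the pointer in its own small type (AlignedBuf, a made-up name) with its own Drop instead of handing it to Vec::from_raw_parts, because std::alloc::dealloc has to be called with the same layout that was used for the allocation.

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};
use std::slice;

/// Fixed-size, zero-initialised byte buffer with a caller-chosen alignment.
/// (Hypothetical type for illustration; it cannot be resized.)
struct AlignedBuf {
    ptr: *mut u8,
    layout: Layout,
}

impl AlignedBuf {
    fn new(size: usize, align: usize) -> AlignedBuf {
        // Zero-sized allocations are not allowed by the global allocator API.
        assert!(size > 0, "zero-sized buffers are not supported in this sketch");
        let layout = Layout::from_size_align(size, align).expect("invalid size/alignment");
        // SAFETY: `layout` has a non-zero size.
        let ptr = unsafe { alloc_zeroed(layout) };
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        AlignedBuf { ptr, layout }
    }

    fn as_slice(&self) -> &[u8] {
        // SAFETY: `ptr` points to `layout.size()` initialised bytes owned by `self`.
        unsafe { slice::from_raw_parts(self.ptr, self.layout.size()) }
    }

    fn as_mut_slice(&mut self) -> &mut [u8] {
        // SAFETY: as above, and `&mut self` guarantees exclusive access.
        unsafe { slice::from_raw_parts_mut(self.ptr, self.layout.size()) }
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        // SAFETY: `ptr` was allocated with exactly this layout.
        unsafe { dealloc(self.ptr, self.layout) }
    }
}
```

The trade-off against the Vec wrapper is that nothing is wasted on padding, but growing the buffer has to be written by hand.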


#3

After testing what align_of returns for different kinds of structs, I finally realized that the struct alignment isn’t necessarily about the struct as a whole. Instead it looks like the struct’s alignment is the alignment of the biggest primitive data type within the struct. In practice (on x86_64) the possible values seem to be 1, 2, 4 or 8.

Now if I’m understanding things correctly, struct alignment basically comes down to the rule that any (primitive) field must have an address that is an exact multiple of its size? (Though I guess that 64-bit variables on 32-bit architectures may be handled in some special way.)
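A small sketch of the kind of test I mean. The struct names are invented for the example, and #[repr(C)] is used so the layout rules are the well-defined C ones; the comparisons are written against the field types instead of hard-coded numbers, since the exact values depend on the target.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[repr(C)]
struct Bytes { a: u8, b: u8, c: u8 }

#[allow(dead_code)]
#[repr(C)]
struct Mixed { a: u8, b: u32, c: u16 }

#[allow(dead_code)]
#[repr(C)]
struct WithDouble { a: u8, b: f64 }

/// Returns (size, alignment) of `T` in bytes.
fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}
```

Here layout_of::<Bytes>() has alignment 1, layout_of::<Mixed>() has the alignment of u32, and layout_of::<WithDouble>() has the alignment of f64. In every case the size is a multiple of the alignment, which is where the padding at the end of a struct comes from.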

If that’s the case, maybe I should then just go with Vec<u64>, because double precision floats are the biggest supported type in GLSL (when looking at the single-element basic types, excluding vectors and matrices). Address math will be annoying, but maybe some unsafe magic can be used to access it as [u8].
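The “unsafe magic” could be as small as this sketch. The function names are made up, and the byte offsets passed in are assumed to be the ones reported by the driver. Viewing u64 storage as bytes is fine alignment-wise because u8 only needs an alignment of 1.

```rust
use std::mem::size_of;
use std::slice;

/// Views u64 storage as plain bytes.
fn as_bytes(buf: &[u64]) -> &[u8] {
    // SAFETY: the region is `len * 8` initialised bytes, and u8 has alignment 1.
    unsafe { slice::from_raw_parts(buf.as_ptr() as *const u8, buf.len() * size_of::<u64>()) }
}

/// Mutable variant of `as_bytes`.
fn as_bytes_mut(buf: &mut [u64]) -> &mut [u8] {
    // SAFETY: as above; every bit pattern is a valid u64, so arbitrary writes are fine.
    unsafe { slice::from_raw_parts_mut(buf.as_mut_ptr() as *mut u8, buf.len() * size_of::<u64>()) }
}

/// Writes an f32 field at byte `offset` (hypothetically obtained from the GL query).
fn write_f32(buf: &mut [u64], offset: usize, value: f32) {
    as_bytes_mut(buf)[offset..offset + 4].copy_from_slice(&value.to_ne_bytes());
}

/// Reads an f32 field back from byte `offset`.
fn read_f32(buf: &[u64], offset: usize) -> f32 {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&as_bytes(buf)[offset..offset + 4]);
    f32::from_ne_bytes(raw)
}
```

Copying through byte slices means the accessors themselves never depend on the field offset being aligned; the alignment of the buffer only matters for whoever consumes the raw memory.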

I don’t mind that the buffer may be a little larger than the memory actually used by the struct, because there will be less than a single u64 wasted in any case. It’s kind of like the extra capacity Vec may in some cases have anyway. Arrays of these dynamic structures are not a problem either due to the way GLSL handles arrays - they are exposed as individual variables that may even be backed by entirely different buffers. (This actually only applies to top-level variables, arrays within structs are proper arrays, but those are handled within a single struct instance, so any padding at the end of it still doesn’t cause problems.)


#4

Given that jemalloc uses power-of-two buckets, if jemalloc is in use, is it safe to expect that the alignment of Vec's buffer is at least N if the vector length is equal to or greater than N?

