std::alloc::System::alloc returns misaligned pointers?

Posting on here before IRLO, since I may be misunderstanding something.

It seems that when the system allocator is not jemalloc, std::alloc::System::alloc does not necessarily return 16-aligned pointers on Unix x86_64, even when the caller requests that alignment. The implementation in std::sys::unix::alloc seems to assume that malloc returns pointers aligned to at least std::sys_common::alloc::MIN_ALIGN (which is 16 on x86_64), with some rationale based on jemalloc. In general, unless I'm mistaken, malloc on x86_64 can return 8-aligned pointers. In this playground you will notice in the assembly that malloc, rather than memalign, is incorrectly being used to allocate a 16-aligned pointer.
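To make the concern concrete, here is a minimal sketch of the decision as described above. It is a model, not the libstd source: the names `Path` and `choose_path` are hypothetical, `min_align` stands in for MIN_ALIGN, and the real implementation may check further conditions.

```rust
use std::alloc::Layout;

/// Which libc entry point the allocator would pick (hypothetical model).
#[derive(Debug, PartialEq, Eq)]
enum Path {
    /// Plain malloc, trusting it to return `min_align`-aligned memory.
    Malloc,
    /// An explicitly aligned allocation such as memalign/posix_memalign.
    AlignedAlloc,
}

/// Models the assumption under discussion: any request whose alignment
/// does not exceed `min_align` is forwarded to plain malloc.
fn choose_path(layout: Layout, min_align: usize) -> Path {
    if layout.align() <= min_align {
        Path::Malloc
    } else {
        Path::AlignedAlloc
    }
}
```

With `min_align` set to 16, a 16-aligned request goes to plain malloc, which is only sound if that malloc really guarantees 16. With `min_align` set to 8, the same request takes the aligned path.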

I'm not sure about the other architectures, but they probably suffer from the same bug as well. I'm also not sure if there's some reason malloc alignment would definitely be at least 16 on Unix-like systems, but I couldn't find anything.
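One rough way to probe a particular platform is to allocate a batch of blocks through System and count how many come back misaligned. This is only a sketch: `count_misaligned` is a hypothetical helper, and a count of zero shows what this one malloc happened to return, not what it guarantees.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Allocates up to `count` blocks with `layout` from the system allocator
/// and returns how many of them violate the requested alignment.
fn count_misaligned(layout: Layout, count: usize) -> usize {
    // GlobalAlloc::alloc requires a non-zero size.
    assert!(layout.size() > 0);

    let mut ptrs = Vec::with_capacity(count);
    let mut misaligned = 0;
    for _ in 0..count {
        // SAFETY: `layout` has non-zero size, checked above.
        let p = unsafe { System.alloc(layout) };
        if p.is_null() {
            break; // out of memory: stop probing
        }
        if (p as usize) % layout.align() != 0 {
            misaligned += 1;
        }
        // Hold on to every block so each allocation gets a distinct address.
        ptrs.push(p);
    }
    for p in ptrs {
        // SAFETY: `p` came from System.alloc with this same layout.
        unsafe { System.dealloc(p, layout) };
    }
    misaligned
}
```

Calling it with something like `Layout::from_size_align(24, 16).unwrap()` exercises a 16-aligned request on whatever libc the binary is linked against.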

https://pubs.opengroup.org/onlinepubs/9699919799/functions/malloc.html

The pointer returned if the allocation succeeds shall be suitably aligned so that it may be assigned to a pointer to any type of object

And that can be determined by max_align_t, which is where we get the guarantee of 16-byte alignment on x86_64.

I don't see any guarantee in there that long double is 16 bytes. The C standard allows long double to be equivalent to double, but I'm not sure if I'm just getting into language lawyering territory.

The C standard doesn't guarantee a specific max_align_t, but this is set by the target ABI.

But we are talking about the output of malloc, which may be compiled separately from libstd, and MIN_ALIGN is only defined based on the arch, not on whether it's Linux or some other ABI:

#[cfg(all(any(target_arch = "x86_64",
              target_arch = "aarch64",
              target_arch = "mips64",
              target_arch = "s390x",
              target_arch = "sparc64")))]
pub const MIN_ALIGN: usize = 16;

See, for example, uclibc, whose malloc guarantees only the following alignment [1]:

/* The alignment we guarantee for malloc return values.  We prefer this
   to be at least sizeof (size_t) bytes because (a) we have to allocate
   that many bytes for the header anyway and (b) guaranteeing word
   alignment can be a significant win on targets like m68k and Coldfire,
   where __alignof__(double) == 2.  */
#define MALLOC_ALIGNMENT \
  (__alignof__ (double) > sizeof (size_t) ? __alignof__ (double) : sizeof (size_t))

[1] https://git.busybox.net/uClibc/tree/libc/stdlib/malloc/malloc.h#n14
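For comparison, here is a rough Rust rendering of that macro. It assumes Rust's f64 and usize match C's double and size_t on the target, which holds on the common targets I know of, and the function name is hypothetical.

```rust
use std::mem::{align_of, size_of};

/// Mirrors MALLOC_ALIGNMENT from the uclibc header quoted above:
/// the larger of alignof(double) and sizeof(size_t).
/// Assumes f64 and usize correspond to C's double and size_t.
const fn uclibc_style_malloc_alignment() -> usize {
    if align_of::<f64>() > size_of::<usize>() {
        align_of::<f64>()
    } else {
        size_of::<usize>()
    }
}
```

On x86_64 both operands are 8, so this evaluates to 8, which is below the MIN_ALIGN of 16 that libstd uses for that architecture.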

IMO uclibc is wrong here. The x86-64 psABI specifies that long double has 16-byte alignment, although I also see in uclibc's docs that "long double support is quite limited."

Even more directly:

Awesome, thanks for that link, that's probably deeper than I'd be able to dig easily. In that case, it seems we're actually stepping into the opposite of spec lawyering territory, i.e. "specification versus what implementations do". IMO it is wrong, and a ticking time bomb that will blow up whenever Rust adds support for more targets, for Rust's libstd to hardcode MIN_ALIGN as 16. Do you disagree?

I agree that new targets need to consider MIN_ALIGN, yes. Thankfully those definitions are only positive cfgs on target_arch, without any catch-all, so a new architecture would just be missing altogether. I don't know if different OS ABIs might also disagree here, so that could be a footgun.