On Apple Silicon, bindgen union is constructed with random data

I'm working with PROJ, a coordinate transformation library written in C. Specifically, I'm working with proj_coord, a function that constructs a coordinate represented as a union (source code).

Here's an example of usage in a C program:

#include <proj.h>
#include <stdio.h>

int main() {
  PJ_COORD coord = proj_coord(1., 2., 3., 4.);
  printf("%f %f %f %f", coord.v[0], coord.v[1], coord.v[2], coord.v[3]);
}

$ clang -lproj projcoord.c
$ ./a.out
1.000000 2.000000 3.000000 4.000000

Note that the compilation and execution above were done on my MacBook Air M1 (aarch64-apple-darwin).

In the GeoRust ecosystem, there is a proj-sys crate which is a bindgen-powered wrapper around the PROJ C library. And as expected, there is a function proj_sys::proj_coord which maps to the C function above.

Here is (in theory) a Rust program equivalent to the previous C program:

fn main() {
    unsafe {
        let coord = proj_sys::proj_coord(1., 2., 3., 4.);
        println!("{:#?}", coord.v);
    }
}

This is where things get interesting! On x86_64-apple-darwin, the code above correctly prints out [1.0, 2.0, 3.0, 4.0]. On my MacBook Air M1, I get different, seemingly random floating-point values on every invocation.

You can try this out yourself by checking out this repository and running cargo run.

For reference, this is the Rust code that bindgen generates:

extern "C" {
    pub fn proj_coord(x: f64, y: f64, z: f64, t: f64) -> PJ_COORD;
}

#[repr(C)]
#[derive(Copy, Clone)]
pub union PJ_COORD {
    pub v: [f64; 4usize],
    pub xyzt: PJ_XYZT,
    pub uvwt: PJ_UVWT,
    pub lpzt: PJ_LPZT,
    pub geod: PJ_GEOD,
    pub opk: PJ_OPK,
    pub enu: PJ_ENU,
    pub xyz: PJ_XYZ,
    pub uvw: PJ_UVW,
    pub lpz: PJ_LPZ,
    pub xy: PJ_XY,
    pub uv: PJ_UV,
    pub lp: PJ_LP,
    _bindgen_union_align: [u64; 4usize],
}

What could be going wrong here? Is something wrong with the proj-sys bindgen setup? Or maybe a problem with bindgen itself?

Here's the associated bug in the proj repository. Reaching out here to cast a wider net. Thanks!

It could be a bug in Rust's C ABI for that kind of return value. Can you compare the assembly of your callers from C and from Rust?

I just confirmed that removing _bindgen_union_align: [u64; 4usize], from the bindgen-generated Rust union fixes the issue. Seems like I should file a bug on the bindgen issue tracker?

Maybe that align entry changes the aggregate classification for ABI.
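To make that hypothesis concrete, here is a minimal sketch with cut-down, hypothetical stand-ins for the PROJ types (Xyzt, CoordPlain and CoordWithAlignField are names invented for illustration, not the bindgen output). As far as I understand the AArch64 procedure call standard, an aggregate whose members are all doubles (up to four of them) counts as a homogeneous floating-point aggregate and is returned in floating-point registers, while adding an integer-typed member takes it out of that class. The sketch only shows that the two unions are indistinguishable by size and alignment, so the extra field can change nothing except how the type is classified:

```rust
use std::mem::{align_of, size_of};

// Hypothetical, cut-down stand-in for PJ_XYZT: four doubles.
#[allow(dead_code)]
#[repr(C)]
#[derive(Copy, Clone)]
pub struct Xyzt {
    pub x: f64,
    pub y: f64,
    pub z: f64,
    pub t: f64,
}

// Mirrors the shape of the C union: every member consists only of doubles.
#[allow(dead_code)]
#[repr(C)]
#[derive(Copy, Clone)]
pub union CoordPlain {
    pub v: [f64; 4],
    pub xyzt: Xyzt,
}

// Same union plus an integer-typed alignment helper, like the field bindgen emitted.
#[allow(dead_code)]
#[repr(C)]
#[derive(Copy, Clone)]
pub union CoordWithAlignField {
    pub v: [f64; 4],
    pub xyzt: Xyzt,
    _align: [u64; 4],
}

fn main() {
    // Size and alignment are identical; only the member types differ.
    println!(
        "plain:      size={} align={}",
        size_of::<CoordPlain>(),
        align_of::<CoordPlain>()
    );
    println!(
        "with field: size={} align={}",
        size_of::<CoordWithAlignField>(),
        align_of::<CoordWithAlignField>()
    );
}
```

If the hypothesis holds, the layout is fine and the mismatch is purely in where caller and callee expect the return value to live, which would explain why the same binding works on x86_64.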

The root cause has been determined and will be posted in a bit in rust-lang/rust-bindgen issue #1973 (`_bindgen_union_align` field in generated Rust union causes misalignment on Apple Silicon, and arbitrary data is populated).

A description of the root cause can be found in rust-lang/rust-bindgen issue #1973 (`_bindgen_union_align` field in generated Rust union causes arbitrary data to be populated on Apple Silicon).

That actually raises the question: where is the bug, and what is responsible for it? Surely bindgen has to ensure proper alignment somehow, but doing that with a field whose type changes the calling convention is also wrong. How does bindgen determine (and ensure) that the calling convention is preserved by the transformations it applies? Is that possible at all in the general case?

I wonder why they don't just use repr(align)... maybe because the exact alignment is target-specific?

I think the only sure way is to stick to exact equivalence, without additions.

I think we should move the discussion to the GitHub issue so it's all in one place.

Can you use repr(align) and repr(C) at the same time?

Yes, even in the same breath, e.g. #[repr(C, align(8))]
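For illustration, a minimal sketch of a union that requests its alignment through the attribute instead of an extra integer-typed field. Coord is a hypothetical, cut-down type, not the actual bindgen output, and the sketch makes no claim about what bindgen should generate:

```rust
use std::mem::align_of;

// Hypothetical, cut-down union: the alignment comes from the attribute,
// so every member stays a floating-point type.
#[repr(C, align(8))]
#[derive(Copy, Clone)]
pub union Coord {
    pub v: [f64; 4],
}

fn main() {
    let c = Coord { v: [1.0, 2.0, 3.0, 4.0] };
    // Reading a union field is unsafe; `v` is the field that was just written.
    let v = unsafe { c.v };
    println!("{:?} align={}", v, align_of::<Coord>());
}
```

Note that align(N) can only raise a type's alignment, never lower it, so a generator would still need to know the target-specific value it wants.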
