Improving the *-sys crate story

Hey all,

I've been working with a bunch of *-sys crates recently, cross-compiling for strange targets and linking with no_std, and have been discovering a bunch of sharp edges for C library maintainers and consumers.

I thought it might be useful to post here for visibility outside the embedded working group.
If this interests you, or you have run into it yourself, you may like to check out the related issue here: https://github.com/rust-embedded/wg/issues/481

Cheers,

Ryan

Thanks for writing up a detailed ticket @ryankurte. I've done quite a lot of work with FFI and by far the most annoying step in making a *-sys crate is getting the thing to build reliably.

I follow a process quite similar to @kornelski's Making a *-sys crate article and it works fairly well.

Pre-built bindgen outputs often do not match target architecture sizes

Possible Mitigations

  • tooling should build bindgen at compile time, with appropriate target arguments
  • guide should specify that bindgen should be run at compile time

I can understand the appeal in running bindgen at compile time, but I'd prefer to avoid it if possible. Adding a direct dependency on bindgen means anyone wanting to use your crate will need to have libclang on their system, and given how *-sys crates tend to sit very deep in the dependency graph, that will affect a lot of crates.

As an alternate proposal, what if you could tell bindgen to emit bindings for a selection of targets and use conditional compilation to enable/disable items? It'd mean bindings files get significantly longer, but consumers would no longer need libclang, with its associated build hassles, or the additional compile time.
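To make the proposal concrete, here is a hand-written sketch of what such a multi-target bindings file might look like. This is not output that bindgen is known to produce, and the library and the names (`FOO_VERSION`, `foo_size_t`, `foo_size_bytes`) are made up for illustration.

```rust
// Hypothetical bindings for an imaginary C library; every name is illustrative.
#[allow(non_camel_case_types)]
pub mod bindings {
    // Items that are identical on every target need no gating.
    pub const FOO_VERSION: u32 = 1;

    // Items that differ are emitted once per configuration, and the
    // compiler keeps only the one matching the current target.
    #[cfg(target_pointer_width = "64")]
    pub type foo_size_t = u64;

    #[cfg(target_pointer_width = "32")]
    pub type foo_size_t = u32;

    #[cfg(target_pointer_width = "16")]
    pub type foo_size_t = u16;
}

/// Size in bytes of the alias selected for the current target.
pub fn foo_size_bytes() -> usize {
    std::mem::size_of::<bindings::foo_size_t>()
}
```

The cost is visible even in this tiny case: every target-dependent item is repeated once per configuration, which is where the longer bindings files come from.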

Package discovery, compilation (and linking) typically doesn't work cross platform

This is a big one, but it's a bit too interrelated to easily split.

Package discovery is another can of worms entirely. The problem is that the various platforms each have their own way of installing something or reusing libraries.

For example, on Windows you'll often bundle the libraries you need with your program because there's no central, consistent place to store DLLs. *nix platforms use a package manager which will typically store system libraries in some sort of /lib directory, setting things up so version conflicts are avoided.

About the only way you're going to get a consistent experience is if you compile from source during the build process, because then you've got access to any dynamic or static libraries you need.


Also, keep in mind there's only so much you can do from the Rust side. Unfortunately, C/C++ build systems are a massive mess and were designed before things like cross-platform support and package managers became ubiquitous, so you've got an uphill battle ahead of you... CMake and git submodules make things considerably easier, but even then it's nowhere near as seamless as cargo or npm.

We could fix this by shipping bindgen or libclang with Rust. cc @josh

This topic may be better suited for internals.rust-lang.org?

Well there is (it's called "Common Files") but almost nobody uses it so your point stands. :wink:

libclang is insufficient for bindgen (it has woefully incomplete information). bindgen needs to get extra information directly from LLVM. Because LLVM doesn't have a stable ABI, Rust would have to ship bindgen.

I'm surprised that bindgen-generated C bindings are not portable across platforms. I know the bindgen-generated tests aren't portable, but the bindings are #[repr(C)], so generally they should be fine.

If the header has the same C struct for all platforms, then the #[repr(C)] equivalent struct will automatically match it on all platforms.

If the C header managed to break portability (maybe with #ifdef), I would still not run bindgen at compile time. You can pre-generate multiple versions of the headers for various platforms and pick the right one at compile time.
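One possible way to do the "pick the right one" step is a small helper called from build.rs. This is only a sketch: the file names and the function `bindings_for_target` are hypothetical, and it assumes the header differs per operating system rather than per architecture.

```rust
/// Map a target triple to a pre-generated bindings file.
/// The file names are hypothetical; a real crate would check them in
/// next to its sources and regenerate them when the C header changes.
pub fn bindings_for_target(target: &str) -> Option<&'static str> {
    if target.contains("windows") {
        Some("bindings_windows.rs")
    } else if target.contains("linux") {
        Some("bindings_linux.rs")
    } else if target.contains("apple") {
        Some("bindings_apple.rs")
    } else {
        // Unknown platform: the caller decides whether to fail the build
        // or fall back to something else.
        None
    }
}

// In build.rs this would be driven by the TARGET variable that Cargo
// sets for build scripts, for example:
//
//     let target = std::env::var("TARGET").unwrap();
//     let file = bindings_for_target(&target).expect("unsupported target");
//
// The chosen file can then be copied into OUT_DIR and pulled into the
// crate with include!.
```

Matching on the triple keeps libclang out of the consumer's build entirely; the trade-off is that someone has to regenerate each file by hand.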

It's possible that the issue is with 32-bit vs. 64-bit bindgen, but currently it would appear that struct members may be misaligned (due to pointer sizes?) when binding for different targets [1] (and the libc vs. cty problem also stands) :-/

Sadly that issue doesn't explain what actually went wrong. Bindgened pointers are portable between 32-bit and 64-bit:

struct foo {
   void *bar;
   char baz;
};

translates to:

use std::os::raw::{c_char, c_void};

#[repr(C)]
struct foo {
   bar: *mut c_void,
   baz: c_char,
}

which works correctly on both 32 and 64-bit targets, because both Rust and C will expand pointers and add padding the same way.
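If you want to convince yourself, a quick layout check along these lines can be compiled for each target. It is a minimal sketch: `foo_layout` is a helper invented here, and the "two pointers wide" expectation assumes a typical 32-bit or 64-bit target where pointers are aligned to their own size.

```rust
use std::mem::{align_of, size_of};
use std::os::raw::{c_char, c_void};

// Same struct as above, written once with no per-target variants.
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct foo {
    pub bar: *mut c_void,
    pub baz: c_char,
}

/// Returns (size, alignment) of `foo` for whatever target this is compiled for.
pub fn foo_layout() -> (usize, usize) {
    (size_of::<foo>(), align_of::<foo>())
}

// On a typical target the struct is aligned like a pointer, and `baz` is
// followed by padding that rounds the size up to two pointer widths.
// The source is identical everywhere; only the computed numbers change.
```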

In my experience when it breaks, it's usually due to C headers using preprocessor for custom platform-specific hacks or conditional features:

struct foo {
   #if BAR_ENABLED == 1
   void *bar;
   #endif
   #ifdef __WIN32__
   char baz;
   #endif
};

Bindgen won't know how to translate preprocessor directives. For stable slowly-changing projects I add equivalent #[cfg(feature = "bar")]/#[cfg(windows)] to the bindings manually.
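For the header above, the hand-edited bindings might look roughly like this. It is a sketch under assumptions: the Cargo feature name `bar` is my choice, it has to be declared in the crate's Cargo.toml, and the build has to define BAR_ENABLED for the C side whenever that feature is on, otherwise the two layouts silently disagree.

```rust
use std::os::raw::{c_char, c_void};

#[allow(non_camel_case_types)]
#[repr(C)]
pub struct foo {
    // Mirrors `#if BAR_ENABLED == 1`; "bar" is an assumed Cargo feature name.
    #[cfg(feature = "bar")]
    pub bar: *mut c_void,

    // Mirrors `#ifdef __WIN32__`.
    #[cfg(windows)]
    pub baz: c_char,
}

// With neither cfg active the struct has no fields at all, so on the Rust
// side it becomes a zero-sized type.
```

Keeping the cfg conditions in lockstep with the preprocessor conditions is manual work, which is why I only do this for headers that change slowly.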
