Best practices to design a C API that can be ergonomically used in Rust

Hi,

For those of you who have worked on safe Rust bindings to C libraries before: what aspects of a C API make it particularly suitable for use from Rust?
In other words, what should one do, and what should one avoid, when designing a C API with the express purpose of being used from Rust?

Here are the things I came up with, but I suspect there are other things that I am forgetting.

General

  • Obvious: It must be possible to build a safe abstraction.
  • Keep things simple
    • bindgen should have no trouble generating the equivalent Rust FFI bindings.
  • No C preprocessor macros
  • Support standard Rust traits (where it makes sense):
    • Debug, Clone, PartialEq, Eq, Default, Hash

All types on the FFI API surface should fall into one of the two categories:

1. Resource Manager Types

This kind of type manages (allocates and frees) a resource like memory or other system resources.

  • These types should have straightforward object lifetimes, and the C API should expose the following functions:

      1. A function to create -> wrapped as a new method
      2. Functions to use the functionality
      3. A function to destroy -> wrapped as a drop method
  • They should be representable as opaque types that are only accessed through raw pointers (which the safe abstraction wraps)
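A minimal sketch of that create / use / destroy shape, assuming a hypothetical counter library. The names counter_t, counter_create, counter_increment and counter_destroy are made up for illustration, and the "C side" is faked in Rust here so the example is self-contained; in a real binding those three functions would come from an extern "C" block.

```rust
use std::ptr::NonNull;

// Opaque handle. Hypothetical C declaration: `typedef struct counter counter_t;`
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct counter_t {
    _private: [u8; 0], // Rust never looks inside
}

// Stand-in for the state that the C library would keep behind the pointer.
struct CounterImpl {
    value: u64,
}

// Stand-ins for the three kinds of C functions (hypothetical names).
unsafe extern "C" fn counter_create() -> *mut counter_t {
    Box::into_raw(Box::new(CounterImpl { value: 0 })) as *mut counter_t
}

unsafe extern "C" fn counter_increment(c: *mut counter_t) -> u64 {
    let inner = unsafe { &mut *(c as *mut CounterImpl) };
    inner.value += 1;
    inner.value
}

unsafe extern "C" fn counter_destroy(c: *mut counter_t) {
    unsafe { drop(Box::from_raw(c as *mut CounterImpl)) }
}

/// Safe wrapper that owns exactly one handle.
pub struct Counter {
    raw: NonNull<counter_t>,
}

impl Counter {
    /// create -> `new`; a NULL return is mapped to `None`.
    pub fn new() -> Option<Self> {
        let raw = unsafe { counter_create() };
        NonNull::new(raw).map(|raw| Counter { raw })
    }

    /// use -> ordinary method.
    pub fn increment(&mut self) -> u64 {
        unsafe { counter_increment(self.raw.as_ptr()) }
    }
}

impl Drop for Counter {
    /// destroy -> `drop`, called exactly once by the compiler.
    fn drop(&mut self) {
        unsafe { counter_destroy(self.raw.as_ptr()) }
    }
}
```

Because the handle is only ever reachable through Counter, users of the wrapper cannot forget to destroy it or destroy it twice.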

2. Simple Value Type

  • Only structs composed of value types
  • Bitwise copyable
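For illustration, a value type of this kind could look as follows on the Rust side. Rect and rect_translate are hypothetical names, and the function is a Rust stand-in for what would normally be a C function taking and returning the struct by value.

```rust
/// Hypothetical value type. `#[repr(C)]` gives it the layout of the C struct
/// `struct rect { int32_t x, y; uint32_t w, h; }`.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default, Hash)]
pub struct Rect {
    pub x: i32,
    pub y: i32,
    pub w: u32,
    pub h: u32,
}

/// Stand-in for a C function that passes the struct by value.
extern "C" fn rect_translate(r: Rect, dx: i32, dy: i32) -> Rect {
    Rect { x: r.x + dx, y: r.y + dy, ..r }
}
```

Since every field is itself a plain value, all the standard traits from the list above can simply be derived, with no hand-written unsafe code.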

In your experience, what other parts should be considered, that are otherwise annoying (not ergonomic) or difficult to represent in Rust?

Don't forget to define thread safety! That's a really annoying one to figure out on the Rust side.

Similarly, be clear about whether re-entrant access is allowed, and only use &mut if not, which is much less obvious than it might seem.
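To make this concrete, here is a rough sketch of how a documented threading contract ends up in the wrapper. It assumes a hypothetical library whose docs say "a handle may be used from any thread, but only by one thread at a time and never re-entrantly". lib_open, lib_bump and lib_close are invented names, faked in Rust so the example runs on its own.

```rust
use std::ptr::NonNull;
use std::thread;

// Stand-ins for the hypothetical C library.
extern "C" fn lib_open() -> *mut u32 {
    Box::into_raw(Box::new(0))
}

unsafe extern "C" fn lib_bump(h: *mut u32) -> u32 {
    unsafe {
        *h += 1;
        *h
    }
}

unsafe extern "C" fn lib_close(h: *mut u32) {
    unsafe { drop(Box::from_raw(h)) }
}

pub struct Handle {
    raw: NonNull<u32>,
}

// SAFETY: relies entirely on the (assumed) documented contract that a handle
// may move between threads. There is deliberately no `Sync` impl, because the
// contract forbids concurrent use.
unsafe impl Send for Handle {}

impl Handle {
    pub fn open() -> Self {
        Handle { raw: NonNull::new(lib_open()).expect("lib_open returned NULL") }
    }

    // `&mut self` because the assumed contract forbids concurrent and
    // re-entrant calls; a re-entrancy-safe function could take `&self`.
    pub fn bump(&mut self) -> u32 {
        unsafe { lib_bump(self.raw.as_ptr()) }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        unsafe { lib_close(self.raw.as_ptr()) }
    }
}

/// Moving the handle to another thread compiles only because of `Send`.
pub fn bump_on_other_thread(mut h: Handle) -> u32 {
    thread::spawn(move || h.bump()).join().expect("worker panicked")
}
```

Without a written contract on the C side, neither the unsafe impl nor the choice between &self and &mut self can be justified.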

It's possible, but pretty annoying, to use null-terminated strings from Rust, so a pointer + length pair is preferable wherever you can provide one.
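A small comparison of the two styles, as a sketch. name_len_nul and name_len_sized are hypothetical functions, implemented in Rust here as stand-ins for the C side.

```rust
use std::ffi::{c_char, CStr, CString};

// Stand-in for `size_t name_len_nul(const char *s);`
unsafe extern "C" fn name_len_nul(s: *const c_char) -> usize {
    let bytes = unsafe { CStr::from_ptr(s) }.to_bytes();
    bytes.len()
}

// Stand-in for `size_t name_len_sized(const uint8_t *ptr, size_t len);`
unsafe extern "C" fn name_len_sized(ptr: *const u8, len: usize) -> usize {
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) };
    bytes.len()
}

/// NUL-terminated: needs an allocation and a copy, and fails if the string
/// contains an interior NUL byte.
pub fn call_nul(s: &str) -> Option<usize> {
    let owned = CString::new(s).ok()?;
    Some(unsafe { name_len_nul(owned.as_ptr()) })
}

/// Pointer + length: passes the `&str` bytes through unchanged.
pub fn call_sized(s: &str) -> usize {
    unsafe { name_len_sized(s.as_ptr(), s.len()) }
}
```

The NUL-terminated version forces every caller to handle a failure case that has nothing to do with the actual operation.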

You may well want to wrap a C-allocated resource pointer in a Rust ref-counted type (e.g. Arc), which can lead to double indirection. I'm not sure of a nice way to handle this, so you might want to consider whether to provide intrusive ref-counting or a custom RC type.

That's all off the top of my head!

If you create your own FFISlice<'a, T> that consists of a pointer, a length and PhantomData, you can (at the cost of a slightly worse ABI on Windows) make many of the extern functions safe to call by encapsulating the unsafety in FFISlice, while the C side uses an equivalent construct.

While my experience is mostly in creating C APIs in Rust, this is often the first type that I create.
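Roughly what such a type might look like. This is only a sketch: the field names, the constructor and sum_u32 are my own assumptions, and sum_u32 is written in Rust as a stand-in for the foreign function.

```rust
use std::marker::PhantomData;

/// Pointer + length with the borrow's lifetime attached. On the C side this
/// would correspond to something like `struct { const T *ptr; size_t len; }`.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct FFISlice<'a, T> {
    ptr: *const T,
    len: usize,
    _borrow: PhantomData<&'a [T]>, // zero-sized, does not affect the layout
}

impl<'a, T> FFISlice<'a, T> {
    /// The only constructor in this sketch, so `ptr` and `len` always
    /// describe a live slice for `'a`.
    pub fn new(s: &'a [T]) -> Self {
        FFISlice { ptr: s.as_ptr(), len: s.len(), _borrow: PhantomData }
    }

    pub fn as_slice(&self) -> &'a [T] {
        // SAFETY: the fields are private and only set by `new`.
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}

// Stand-in for a foreign function such as `uint64_t sum_u32(ffi_slice_u32 s);`
extern "C" fn sum_u32(s: FFISlice<'_, u32>) -> u64 {
    s.as_slice().iter().map(|&x| u64::from(x)).sum()
}

/// No `unsafe` needed at the call site.
pub fn sum(values: &[u32]) -> u64 {
    sum_u32(FFISlice::new(values))
}
```

If the C side can also hand an FFISlice back to Rust, the safety argument in as_slice no longer holds on its own and would need to rely on the C library's guarantees.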

This second one is probably because I like async too much:

Because callbacks are a bit annoying for me, I like to replace them with async. The library offers a function to register a single callback for each type of callback. The functions then take a pointer to an AsyncChannel (most (all?) of which are executor-agnostic) and call the specific completion callback.

This splits the classic data pointer and callback pointer into two different calls. You can then set the callback from the binding library, while the call to the FFI function can be made safe by encapsulating the unsafety in the creation of the channel handle.

Memory management must be static. I've made a mistake in one C API where I've had should_free_data(bool) which was impossible to express in safe Rust.

Similarly thread safety should be static. set_thread_safety(handle, bool) doesn't allow placing Send on types.


It's nice when C APIs have a stable ABI, and functions that double-check ABI compatibility (e.g. take sizeof structs). This allows the Rust wrapper to have pre-generated FFI bindings instead of running bindgen at compile time.
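One way the sizeof check could look, as a sketch. Options and lib_init are hypothetical, and lib_init is faked in Rust; in a real library it would compare against the size the C code was compiled with.

```rust
use std::mem::size_of;

/// Hypothetical options struct, mirrored by hand (or pre-generated) in Rust.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy)]
pub struct Options {
    pub depth: u32,
    pub flags: u32,
}

// Stand-in for `int lib_init(const options *opts, size_t caller_size);`
// Returns 0 on success and -1 if the caller's idea of the struct differs.
extern "C" fn lib_init(opts: *const Options, caller_size: usize) -> i32 {
    if opts.is_null() || caller_size != size_of::<Options>() {
        return -1;
    }
    0
}

/// The wrapper passes the size its bindings were generated with, so a
/// mismatch becomes a runtime error instead of memory corruption.
pub fn init(opts: &Options) -> Result<(), &'static str> {
    match lib_init(opts, size_of::<Options>()) {
        0 => Ok(()),
        _ => Err("struct size mismatch between bindings and library"),
    }
}
```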


Use pointer + length for strings whenever you can instead of NUL-terminated. This allows zero-cost conversion from &str.

Memory management must be static. I've made a mistake in one C API where I've had should_free_data(bool) which was impossible to express in safe Rust.

Can you elaborate on this? I'm not sure what the semantics of should_free_data are supposed to be.

This is C's version of Cow:

set_data(handle, char *mysterious_ownership);
set_ownership(handle, later);

and the deinit has some if (this->should_free) free(this->data).

Splitting it across two functions doesn't let you set the right lifetimes.
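For contrast, here is a sketch of how it works out when ownership is fixed by which entry point you call. This is a pure Rust model, not the real API: RawHandle stands in for the C-side state, and borrowed / owned stand in for two separate hypothetical C setters.

```rust
use std::marker::PhantomData;
use std::ptr;
use std::slice;

// Stand-in for the C-side state. `owned` is decided once, at construction.
struct RawHandle {
    data: *const u8,
    len: usize,
    owned: bool,
}

pub struct Handle<'a> {
    raw: RawHandle,
    _borrow: PhantomData<&'a [u8]>,
}

impl<'a> Handle<'a> {
    /// Caller keeps ownership; the lifetime ties the handle to the buffer.
    pub fn borrowed(data: &'a [u8]) -> Self {
        let raw = RawHandle { data: data.as_ptr(), len: data.len(), owned: false };
        Handle { raw, _borrow: PhantomData }
    }

    pub fn data(&self) -> &[u8] {
        unsafe { slice::from_raw_parts(self.raw.data, self.raw.len) }
    }
}

impl Handle<'static> {
    /// The handle takes ownership, so no borrow is involved.
    pub fn owned(data: Vec<u8>) -> Self {
        let boxed = data.into_boxed_slice();
        let len = boxed.len();
        let ptr = Box::into_raw(boxed) as *mut u8;
        Handle { raw: RawHandle { data: ptr, len, owned: true }, _borrow: PhantomData }
    }
}

impl Drop for Handle<'_> {
    fn drop(&mut self) {
        // Mirrors `if (this->should_free) free(this->data)`.
        if self.raw.owned {
            let fat = ptr::slice_from_raw_parts_mut(self.raw.data as *mut u8, self.raw.len);
            unsafe { drop(Box::from_raw(fat)) }
        }
    }
}
```

With a separate set_ownership call, the same handle would have to switch between Handle<'a> and Handle<'static> at runtime, which the type system cannot express.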

C APIs that accept callbacks without state can be annoying. Rust makes heavy use of closures, and with some C APIs I find that I am frequently smuggling in my state.
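When the C API does take a user-data pointer, wrapping a closure is straightforward. A sketch of the usual trampoline pattern, with for_each as a hypothetical C function faked in Rust:

```rust
use std::ffi::c_void;

// Stand-in for
// `void for_each(const int *xs, size_t n, void (*cb)(int, void *), void *user_data);`
unsafe extern "C" fn for_each(
    xs: *const i32,
    n: usize,
    cb: extern "C" fn(i32, *mut c_void),
    user_data: *mut c_void,
) {
    for i in 0..n {
        cb(unsafe { *xs.add(i) }, user_data);
    }
}

/// One monomorphized trampoline per closure type: it casts `user_data`
/// back to the closure and calls it.
extern "C" fn trampoline<F: FnMut(i32)>(x: i32, user_data: *mut c_void) {
    let f = unsafe { &mut *(user_data as *mut F) };
    (*f)(x);
}

/// Safe wrapper: the closure lives on this stack frame for the whole call,
/// which is sound only because `for_each` does not keep the pointer.
pub fn for_each_safe<F: FnMut(i32)>(xs: &[i32], mut f: F) {
    unsafe {
        for_each(
            xs.as_ptr(),
            xs.len(),
            trampoline::<F>,
            &mut f as *mut F as *mut c_void,
        );
    }
}
```

Without the user_data parameter the closure's state has nowhere to go, which is where the smuggling through globals or thread-locals comes from.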
