How to write an idiomatic C interface for my crate?


#1

I am writing a C API for my file-syncing crate ubiquity so that I can use it from Swift. I decided to use Rusty Cheddar to generate the C headers, but I am struggling with how best to represent ownership semantics and the like in C.

### Returning non-repr(C) structs

#[no_mangle]
pub unsafe extern fn ubiquity_search_directories_from_root() -> ???

In an ideal world I would like to return the SearchDirectories struct on the stack, thus avoiding allocation, but I understand that this wouldn’t work because it is not repr(C).
As a workaround I have been allocating it on the heap and returning a raw pointer with Box::into_raw(Box::new(search_directories)).
However, Rusty Cheddar doesn’t like the return type *const ubiquity::detect::SearchDirectories because it comes from another crate.

Now I seem to have two options.
The first is to create a direct wrapper, but then I have to wrap and unwrap manually whenever I use it, or gamble with std::mem::transmute:

#[repr(C)]
pub struct CSearchDirectories(SearchDirectories);

The other option is to put raw pointers inside the struct, which is what I was doing, but I now think that is a bad way of doing it:

#[repr(C)]
pub struct CSearchDirectories(*const SearchDirectories);
#[repr(C)]
pub struct CSearchDirectoriesMut(*mut SearchDirectories);

### Enums with extra data

This is similar to the first question, but here I am more determined to pass the value on the stack. This enum should be representable in C using an inner enum for the discriminant plus a size_t value that is simply left uninitialized if the discriminant is not PropagateFromMaster.

pub enum Operation {
    /// the provided replica was correct
    PropagateFromMaster(usize),
    /// the item was changed on multiple replicas and we don't know which
    ItemChangedOnMultipleReplicas,
    /// the item differs, but there was no previous state in the archives so we don't know which replica is 'correct'
    ItemDiffersBetweenReplicasAndNoArchive,
}

### Turning generics back into dynamic dispatch

A month ago I rewrote the library to use `GenericArray` instead of `Vec`, avoiding heap allocation for what is, 99% of the time, only a two-element vector.

The issue is, I can't have generics in a C interface.

My plan so far is:
1. 'Monomorphize' the C interface, i.e., write separate functions (`ubiquity_propagate_2`, `ubiquity_propagate_3`, `ubiquity_propagate_4`) for each number of replicas. I don't think there would ever be more than 4 replicas.

2. In Swift, write a wrapper struct/class that contains a void pointer to "some" Rust struct, along with a number which represents the number of replicas, then switch on that and convert the void pointer to the monomorphized version and then invoke that.

I think it would work, but is there a cleaner solution?
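For illustration, step 1 of the plan above might look roughly like the sketch below on the Rust side. Everything here is an assumption made for the example: `Replicas`, its `roots` field, the `propagate` body and the function signatures are hypothetical, and const generics stand in for `GenericArray` so the sketch has no dependencies. In the real crate each exported function would also carry `#[no_mangle]`.

```rust
// Hypothetical generic core. Const generics stand in for `GenericArray`
// here; the real ubiquity types are not shown in this thread.
pub struct Replicas<const N: usize> {
    pub roots: [u32; N],
}

// Placeholder logic: only reports whether `master` is a valid replica index.
fn propagate<const N: usize>(_replicas: &Replicas<N>, master: usize) -> bool {
    master < N
}

// Shared helper: check the raw pointer once, then call the generic code.
unsafe fn propagate_raw<const N: usize>(replicas: *const Replicas<N>, master: usize) -> bool {
    match unsafe { replicas.as_ref() } {
        Some(r) => propagate(r, master),
        None => false, // null pointer: report failure instead of crashing
    }
}

// One non-generic entry point per replica count. `#[no_mangle]` is omitted
// only so that the sketch compiles unchanged on any edition.
pub unsafe extern "C" fn ubiquity_propagate_2(replicas: *const Replicas<2>, master: usize) -> bool {
    unsafe { propagate_raw(replicas, master) }
}

pub unsafe extern "C" fn ubiquity_propagate_3(replicas: *const Replicas<3>, master: usize) -> bool {
    unsafe { propagate_raw(replicas, master) }
}

pub unsafe extern "C" fn ubiquity_propagate_4(replicas: *const Replicas<4>, master: usize) -> bool {
    unsafe { propagate_raw(replicas, master) }
}
```

The Swift wrapper from step 2 would then store the replica count next to the pointer and switch on it to pick the matching function.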

### Conclusion

Basically, I think I could hack it together alone, but I am searching for the cleanest, most idiomatic solution!

#2

My personal approach to this problem is to try and find the idiomatic C solution and implement that. With that in mind…

The key issue with permitting your struct to be on the C stack is that a C compiler must know the size of that type, which means you must expose its internals in your C header file. This might be OK, but any change you make to the struct then risks being an ABI-incompatible change.

The typical way around this is to declare an opaque struct (i.e., an abstract type). For all intents and purposes, this will require a heap allocation.

A common C idiom is to put the return value of a function into the parameter list of the function. So for example, your search function might instead look like (assuming SearchDirectories is an opaque struct):

#[no_mangle]
pub unsafe extern fn ubiquity_search_directories_from_root(dirs: *mut SearchDirectories) { ... }

The caller must then pass an already allocated dirs value. Your API would need to provide a way to build an “empty” such value:

#[no_mangle]
pub unsafe extern fn ubiquity_search_directories_new() -> *mut SearchDirectories { ... }

Now the C caller can control allocation and amortize it if necessary at the cost of a slightly more cumbersome API. This of course assumes that:

  1. Making SearchDirectories opaque is valuable (probably it is).
  2. Amortizing allocation makes sense for the specific problem being solved. It feels like it does in your case, but I can’t be sure without more details.
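A minimal sketch of that opaque-handle idiom follows. The `paths` field, the placeholder result and the `ubiquity_search_directories_free` function are assumptions made for the example and do not come from the ubiquity crate. In the real crate each exported function would also carry `#[no_mangle]`.

```rust
// Hypothetical stand-in for ubiquity::detect::SearchDirectories; the real
// fields are not shown in this thread. C only ever sees a pointer to it.
pub struct SearchDirectories {
    paths: Vec<String>,
}

// `#[no_mangle]` is omitted on the functions below only so that the sketch
// compiles unchanged on any edition.

/// Allocates an "empty" value on the heap and hands ownership to the caller.
pub extern "C" fn ubiquity_search_directories_new() -> *mut SearchDirectories {
    Box::into_raw(Box::new(SearchDirectories { paths: Vec::new() }))
}

/// Fills an already allocated value, reusing its existing storage.
///
/// # Safety
/// `dirs` must be null or a live pointer from `ubiquity_search_directories_new`.
pub unsafe extern "C" fn ubiquity_search_directories_from_root(dirs: *mut SearchDirectories) {
    if let Some(dirs) = unsafe { dirs.as_mut() } {
        dirs.paths.clear();
        dirs.paths.push(String::from("/")); // placeholder result
    }
}

/// Hypothetical destructor: gives the allocation back to Rust to be dropped.
///
/// # Safety
/// `dirs` must be null or a pointer from `ubiquity_search_directories_new`
/// that has not been freed already.
pub unsafe extern "C" fn ubiquity_search_directories_free(dirs: *mut SearchDirectories) {
    if !dirs.is_null() {
        drop(unsafe { Box::from_raw(dirs) });
    }
}
```

The caller allocates once with `_new`, may call `_from_root` on the same pointer as often as it likes, and must end with exactly one call to `_free`.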

On the wrapper question: if you need to do such things to play nice with Rusty Cheddar, then a simple wrapper seems preferable. If you’re writing a C API, you’ll need to do a dance at the beginning of most functions where you convert raw pointers to more convenient Rust types anyway, so the extra wrap/unwrap probably doesn’t matter too much.
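As a rough illustration of that dance, here is a sketch using the newtype wrapper from the question. The `paths` field and the `ubiquity_search_directories_len` function are hypothetical and exist only to give the example something to do. In the real crate the function would also carry `#[no_mangle]`.

```rust
// Hypothetical stand-in for the type that lives in the other crate.
pub struct SearchDirectories {
    paths: Vec<String>,
}

/// Newtype defined in the FFI crate, as in the question. C only ever holds
/// a pointer to it, so the inner layout is never exposed.
#[repr(C)]
pub struct CSearchDirectories(SearchDirectories);

/// Returns the number of directories, or 0 if `dirs` is null.
/// (`#[no_mangle]` is omitted only so the sketch compiles on any edition.)
///
/// # Safety
/// `dirs` must be null or point to a live `CSearchDirectories`.
pub unsafe extern "C" fn ubiquity_search_directories_len(dirs: *const CSearchDirectories) -> usize {
    // The "dance": turn the raw pointer into a checked Rust reference once,
    // then use ordinary safe types for the rest of the function.
    let wrapper: &CSearchDirectories = match unsafe { dirs.as_ref() } {
        Some(w) => w,
        None => return 0,
    };
    // Unwrapping the newtype is a plain field access.
    let inner: &SearchDirectories = &wrapper.0;
    inner.paths.len()
}
```

The unwrap is a single field access next to a null check that is needed in any case, which is why the extra layer costs little.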

On the enum question: I actually don’t know the answer. It seems like the C union RFC (tracking issue: https://github.com/rust-lang/rust/issues/32836) will be your answer, but it certainly isn’t wise to make your Operation enum repr(C) as is, because of the tuple variant.

For now, you may need to define this type as a tagged union in C. It feels like you should be able to make this work with the right magic in your build.rs file.
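Because only one variant carries data, one possible stopgap is a plain struct holding a tag and a payload, which is essentially the layout described in the question and needs no union at all. The sketch below assumes hypothetical names (`COperationTag`, `COperation`, `master`) and zeroes the payload for the data-less variants instead of leaving it uninitialized.

```rust
/// The enum from the question, repeated so the sketch is self-contained.
pub enum Operation {
    PropagateFromMaster(usize),
    ItemChangedOnMultipleReplicas,
    ItemDiffersBetweenReplicasAndNoArchive,
}

/// Hypothetical field-less discriminant, laid out like a C enum.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum COperationTag {
    PropagateFromMaster = 0,
    ItemChangedOnMultipleReplicas = 1,
    ItemDiffersBetweenReplicasAndNoArchive = 2,
}

/// Hypothetical C-facing mirror of `Operation`: a tag plus a payload.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct COperation {
    pub tag: COperationTag,
    /// Only meaningful when `tag` is `PropagateFromMaster`; zero otherwise.
    pub master: usize,
}

impl From<Operation> for COperation {
    fn from(op: Operation) -> Self {
        match op {
            Operation::PropagateFromMaster(master) => COperation {
                tag: COperationTag::PropagateFromMaster,
                master,
            },
            Operation::ItemChangedOnMultipleReplicas => COperation {
                tag: COperationTag::ItemChangedOnMultipleReplicas,
                master: 0,
            },
            Operation::ItemDiffersBetweenReplicasAndNoArchive => COperation {
                tag: COperationTag::ItemDiffersBetweenReplicasAndNoArchive,
                master: 0,
            },
        }
    }
}
```

Both types are `repr(C)` and contain only C-compatible fields, so a value like this can be returned by value across the boundary; the C side would declare a matching enum and a struct with a `size_t` member.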

As for the monomorphization plan: seems reasonable?