Is there a way to view the C types of what is exposed when you build a cdylib?

I've been trying to compile a crate to a cdylib so I can use it via JNI, but I realized I have no way to know for sure what the C types of the exposed interfaces are, which I'd need in order to wire it up. Is there a way to view that information?

What I normally do is generate a C header file with cbindgen. It will parse your crate looking for #[no_mangle] pub extern "C" functions and figure out the corresponding type and function declarations.
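To make that concrete, here is a minimal sketch of the kind of function cbindgen picks up. The name add_numbers is made up for illustration, and the header line in the comment is only roughly what I'd expect cbindgen to emit, not copied from a real run. Note that on the 2024 edition the attribute is spelled #[unsafe(no_mangle)].

```rust
// lib.rs of a crate built with crate-type = ["cdylib"].
// `add_numbers` is a hypothetical example, not from any real crate.

/// Exported with an unmangled name and the C calling convention,
/// so C (or anything that speaks the C ABI) can call it.
#[no_mangle]
pub extern "C" fn add_numbers(a: i32, b: i32) -> i32 {
    // wrapping_add avoids a panic on overflow unwinding across the FFI boundary
    a.wrapping_add(b)
}

// For a function like this, the generated header should contain
// a declaration along the lines of:
//
//     int32_t add_numbers(int32_t a, int32_t b);
```

The fixed-width Rust integer types map onto the stdint.h types, which is why i32 shows up as int32_t on the C side.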

Do you have any tips or suggestions on getting Rust crates not specifically written to be a cdylib to work?
From what I gather I'd need to put #[no_mangle] on every primitive that's exposed to the external application?

The nomicon has a section on FFI that might be a useful jumping off point.

Types need to be FFI safe, but C doesn't have any concept of runtime types, so there's nothing to unmangle there. If you have structs whose fields will be directly accessible via FFI you need to mark the struct as #[repr(C)].
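As a rough illustration of what #[repr(C)] buys you, here is a small sketch. Point and point_sum are hypothetical names, and the sizes in the comments assume a typical platform where u32 has 4-byte alignment.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical struct shared with C. With #[repr(C)] the fields keep
/// their declared order and follow C's padding rules, so a C struct with
/// the same fields (uint8_t, uint32_t, uint16_t) has the same layout.
#[repr(C)]
pub struct Point {
    pub tag: u8,  // offset 0, followed by 3 bytes of padding
    pub x: u32,   // offset 4
    pub y: u16,   // offset 8, followed by 2 bytes of tail padding
}

/// Reads the struct through a raw pointer, as a C caller would pass it.
/// Returns 0 for a null pointer.
pub unsafe extern "C" fn point_sum(p: *const Point) -> u32 {
    if p.is_null() {
        return 0;
    }
    let p = unsafe { &*p };
    u32::from(p.tag)
        .wrapping_add(p.x)
        .wrapping_add(u32::from(p.y))
}

/// Layout as seen from Rust: (size, alignment) of Point in bytes.
pub fn point_layout() -> (usize, usize) {
    (size_of::<Point>(), align_of::<Point>())
}
```

Without the attribute, the compiler is free to reorder the fields, so the C side could end up reading x where Rust stored y.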

I don't have any experience with JNI to give Java-specific recommendations, but I'm trying to implement a plugin system based on the C ABI right now and can offer my perspective on the problem. There is no 100% safe way to load a symbol from a dynamic library if you don't know the function signature beforehand. If you only learn at runtime which argument types you need to call your function with, you can in theory use runtime reflection over those arguments to figure out what your signature should be (I don't know if this is an option in JNI). But this is a dangerous route, and in practice it will very easily lead to segfaults and/or generation of garbage data if your runtime type reflection system has a mismatch with the actual loaded symbol.

The risk of disaster can, however, be minimized with the use of attribute macros (see here for an example that encodes the function signature into a static variable that can be loaded and inspected by the side that loaded the function). You can do other things with procedural macros as well, such as adding assertions of mem::size_of and/or mem::align_of to the arguments of the function, to guarantee they conform to some pre-defined interface. But those workarounds don't fundamentally solve the issue: at the end of the day you have to trust the loaded library to conform to what your program expects on the other side.
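A hand-written sketch of what such a macro might expand to could look like this. Everything here (PluginArgs, plugin_run, PLUGIN_RUN_SIGNATURE, the signature string format) is made up for illustration and is not the output of any real macro. In an actual plugin both the function and the static would also be exported with #[no_mangle] so the host can look them up by name.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical argument struct agreed on by host and plugin.
#[repr(C)]
pub struct PluginArgs {
    pub len: u32,
    pub flags: u32,
}

// Compile-time checks: the build fails if the layout drifts away from
// what the (assumed) interface definition promises.
const _: () = assert!(size_of::<PluginArgs>() == 8);
const _: () = assert!(align_of::<PluginArgs>() == 4);

/// NUL-terminated description of the signature, stored in a static so the
/// loading side can read it and compare before calling `plugin_run`.
pub static PLUGIN_RUN_SIGNATURE: [u8; 27] = *b"(*const PluginArgs) -> i32\0";

/// Hypothetical plugin entry point: returns -1 for a null pointer,
/// otherwise 0.
pub unsafe extern "C" fn plugin_run(args: *const PluginArgs) -> i32 {
    if args.is_null() {
        return -1;
    }
    let _args = unsafe { &*args };
    0
}
```

The host would load the signature symbol first, compare it against the string it expects, and refuse to call the function on a mismatch. That still relies on the plugin being honest about what it put in the static.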

Consider an interface via serialization and/or remote procedure calls if you can. It is very much possible to load the symbols at runtime, but be mindful that it requires a lot of runtime checks such as the ones I mentioned to be minimally viable.

Sure, JNIEnv's RegisterNatives function receives a signature for each callable function. It's partially documented in Oracle's docs, but things like the fact that a signature may include a [mostly meaningless] ! prefix on Android you would need to discover by trial and error.
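For reference, a JNI signature is just a string describing argument and return types. Below is a minimal sketch without any JNI crate, assuming a hypothetical Java class Example in the default package with a method static native int add(int a, int b). The JNIEnv and jclass parameters are treated as opaque pointers here, which is an illustration shortcut rather than how you would write production bindings.

```rust
use std::os::raw::c_void;

/// JNI type signature for `int add(int, int)`: two ints in, one int out.
/// This is the kind of string RegisterNatives is given per method.
pub const ADD_SIGNATURE: &str = "(II)I";

/// Native implementation, exported under the name the JVM looks up by
/// default for `Example.add` when RegisterNatives is not used.
/// Java's `int` (jint) is a 32-bit signed integer, hence i32.
#[no_mangle]
#[allow(non_snake_case)]
pub extern "system" fn Java_Example_add(
    _env: *mut c_void,   // stands in for JNIEnv*
    _class: *mut c_void, // stands in for jclass
    a: i32,
    b: i32,
) -> i32 {
    // Java int arithmetic wraps on overflow, so mirror that here.
    a.wrapping_add(b)
}
```

If the Rust parameter types disagree with what the signature string says, nothing checks that for you, which is the mismatch being discussed above.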

If there is a type mismatch then sure, nothing works, but how do you expect it to work in that case?

I think the point they are trying to make is that you might look at a signature and infer it accepts a particular type, but due to seemingly trivial changes (e.g. adding a new field to a struct), the way you and the library use that type differs - which would lead to a bad time.

This isn't normally an immediate issue; instead, it's something that emerges over time as code is changed.

For example, the C++ standard library is currently stuck in a bad place because there are several sub-par APIs which can't be fixed because it would be an ABI break (the "mismatch" we're referring to) and mess up programs written 20 years ago that nobody has the source code for any more.

If you are writing both ends (the program that will load the library and the library itself) then the risk is in fact minimized. But if this is not the case, and the library is written by someone else (like the user of a plugin system), the API can be described by some other means, like via an external file using a declarative language (think of a JSON file saying this function has this and this signature), or even a plain C header file. Problems arise if those are written by hand and/or they need to be kept up to date with a binary that might change.
