Mapping FFI byte string literals to Rust enums with associated &str values

Hi,

I am working on creating an FFI binding for the INDIGO Astronomy lib in C. Bindgen happily turns all of the 1243 constants used for property and item names into constant byte string literals. Here are the first few lines:

pub const CONNECTION_PROPERTY_NAME: &[u8; 11] = b"CONNECTION\0";
pub const CONNECTION_CONNECTED_ITEM_NAME: &[u8; 10] = b"CONNECTED\0";
pub const CONNECTION_DISCONNECTED_ITEM_NAME: &[u8; 13] = b"DISCONNECTED\0";
pub const INFO_PROPERTY_NAME: &[u8; 5] = b"INFO\0";
...

Currently I am using a simple function that converts these byte literals into a String:

fn const_to_string(name: &[u8]) -> String {
    // if we are calling with a faulty argument it is a bug that warrants the ensuing panic...
    let name = CStr::from_bytes_with_nul(name).unwrap();
    name.to_string_lossy().into_owned()
}

... but ideally I would like to create PropertyName and PropertyItemName enums for my wrapper that:

  • Provide an &str reference backed by the byte string literal.
  • Map between the values of the FFI struct fields and the corresponding enum variants.
  • Map the enum variant to the backing byte string literal.

Something along these lines:

pub enum PropertyName {
    Connection,
    Info,
    ...
}

pub enum PropertyItemName {
    ConnectionConnected,
    ConnectionDisconnected,
    ...
}

impl PropertyName {
    pub fn name(&self) -> &str { ... }
    pub fn bytes(&self) -> &[u8] { ... }
}

impl Into<&str> for PropertyName { ... }
impl<const N: usize> Into<&[c_char; N]> for PropertyName { ... }
impl<const N: usize> From<&[c_char; N]> for PropertyName { ... }

// Display, Debug, Eq, PartialEq, Copy, and Clone trait implementations
// Same traits implementations for PropertyItemName
...

Given the large number of constants, it would be desirable if this could be achieved automatically, based on the naming convention that ends each constant with either _PROPERTY_NAME or _ITEM_NAME...

I realise this is a tall order and I suspect that I would have to resort to writing macros and/or somehow extend/tweak bindgen.
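As a rough illustration of the macro route, here is a minimal sketch using a declarative macro. The macro name `name_enum!` and the method `from_c_str` are made up for this example, the two constants are stand-ins for the bindgen output, and the variant list still has to be written (or generated) somewhere:

```rust
use std::ffi::CStr;

// Stand-ins for two of the constants that bindgen generates.
pub const CONNECTION_PROPERTY_NAME: &[u8; 11] = b"CONNECTION\0";
pub const INFO_PROPERTY_NAME: &[u8; 5] = b"INFO\0";

/// Hypothetical macro: declares an enum whose variants are each backed by a
/// NUL-terminated byte string literal.
macro_rules! name_enum {
    ($enum_name:ident { $($variant:ident => $bytes:expr),+ $(,)? }) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
        pub enum $enum_name { $($variant),+ }

        impl $enum_name {
            /// The backing literal, including the trailing NUL.
            pub fn bytes(self) -> &'static [u8] {
                match self { $(Self::$variant => &$bytes[..]),+ }
            }

            /// The name as a string slice borrowed from the literal (no allocation).
            pub fn name(self) -> &'static str {
                CStr::from_bytes_with_nul(self.bytes())
                    .expect("constant is NUL-terminated")
                    .to_str()
                    .expect("constant is valid UTF-8")
            }

            /// Maps a C string (e.g. read from an FFI struct) back to a variant.
            pub fn from_c_str(name: &CStr) -> Option<Self> {
                let bytes = name.to_bytes_with_nul();
                $(if bytes == &$bytes[..] { return Some(Self::$variant); })+
                None
            }
        }

        impl std::fmt::Display for $enum_name {
            fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
                f.write_str(self.name())
            }
        }
    };
}

name_enum!(PropertyName {
    Connection => CONNECTION_PROPERTY_NAME,
    Info => INFO_PROPERTY_NAME,
});
```

With something like this, `PropertyName::Connection.name()` borrows directly from the literal, and the linear scan in `from_c_str` mirrors what a chain of `strcmp` calls would do on the C side.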

Some of the enum related crates (e.g. enum-assoc, or enum_from_functions) seem to be usable for easing the pain but there still seems to be quite a bit of boilerplate to write. This Stack Overflow thread has some useful pointers, but I could still benefit from some advice in my specific case.

This is my first Rust project. I am still learning the language and ecosystem, so please bear with me if I miss something evident.

Appreciate any feedback/help on how to best approach or solve this issue!

can you show some examples of what the desired results should look like, please? these descriptions can be confusing for people having no prior knowledge, and maybe ambiguous too.

for some cases, you may be able to achieve the goal using bindgen's ParseCallbacks, but again, some examples would be helpful to give further practical suggestions.

Thanks for the reply!

I do not know exactly what the expected result looks like, but I am imagining two enums along the lines of

pub enum PropertyName {
    Connection,
    Info,
    ...
}

pub enum PropertyItemName {
    ConnectionConnected,
    ConnectionDisconnected,
    ...
}

impl PropertyName {
    pub fn name(&self) -> &str { ... }
    pub fn bytes(&self) -> &[u8] { ... }
}

impl Into<&str> for PropertyName { ... }
impl<const N: usize> Into<&[c_char; N]> for PropertyName { ... }
impl<const N: usize> From<&[c_char; N]> for PropertyName { ... }

// Display, Debug, Eq, PartialEq, Copy, and Clone trait implementations
// Same implementations for PropertyItemName
...

I will update the question with the expected (hoped for) result.


Perhaps I should have posted two questions instead to increase the chances of finding a solution:

  • Automatically generate enums from const byte string literals generated by bindgen
  • Create an &str reference backed by a const byte string literal generated by bindgen

Should I rename this question and create a second question covering the last point?
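For the second point, a minimal sketch of borrowing a `&'static str` from the literal instead of allocating a `String`. The function name `const_to_str` is made up for this example and the constant is a stand-in for the bindgen output:

```rust
use std::ffi::CStr;

// Stand-in for one of the constants that bindgen generates.
pub const CONNECTION_PROPERTY_NAME: &[u8; 11] = b"CONNECTION\0";

/// Borrows the text of a NUL-terminated byte literal without allocating.
/// Returns `None` if the bytes are not exactly one NUL-terminated C string
/// or if the text is not valid UTF-8.
pub fn const_to_str(name: &'static [u8]) -> Option<&'static str> {
    CStr::from_bytes_with_nul(name).ok()?.to_str().ok()
}
```

Unlike `to_string_lossy().into_owned()`, this returns a slice that points into the literal itself, at the cost of having to handle the `None` case.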

so I did a quick search in the indigo repository. it seems these string literals are #define-es. this can be collected with bindgen's ParseCallbacks. it will probably look like this:

// build.rs
use std::{cell::RefCell, io::Write, rc::Rc};

#[derive(Debug, Default)]
struct Names {
	property_names: Vec<String>,
	item_names: Vec<String>,
}

#[derive(Debug)]
struct MyCallbacks(Rc<RefCell<Names>>);

impl bindgen::callbacks::ParseCallbacks for MyCallbacks {
	fn will_parse_macro(&self, name: &str) -> bindgen::callbacks::MacroParsingBehavior {
		if name.ends_with("_PROPERTY_NAME") {
			self.0.borrow_mut().property_names.push(name.to_owned());
		} else if name.ends_with("_ITEM_NAME") {
			self.0.borrow_mut().item_names.push(name.to_owned());
		}
		bindgen::callbacks::MacroParsingBehavior::Default
	}
	fn str_macro(&self, _name: &str, _value: &[u8]) {
		// capture the string literal in this callback, if needed
	}
}

// usage
// create and install the hooks
let names = Rc::new(RefCell::new(Names::default()));
let bindings = bindgen::builder()
	.header("wrapper.h")
	.parse_callbacks(Box::new(MyCallbacks(names.clone())))
	.generate()
	.unwrap();
// write the bindings, as usual
bindings.write(Box::new(&mut output_file)).unwrap();
// generate the definition of the `PropertyName` enum from the collected property names
output_file.write_all(b"pub enum PropertyName {\n").unwrap();
for name in names.borrow().property_names.iter() {
	// e.g. transform `CONNECTION_PROPERTY_NAME` into `Connection`
	let variant = property_name_to_enum_variant(name);
	writeln!(output_file, "    {},", variant).unwrap();
}
output_file.write_all(b"}\n").unwrap();
// do the same for item names
//...

the interior mutability RefCell is there due to bindgen's API limitation, and is necessary, as explained in bindgen#2124.
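the `property_name_to_enum_variant` helper used above is not part of bindgen. a minimal sketch of what it could look like, assuming the constants follow the `SCREAMING_SNAKE_CASE` plus suffix convention (the `constant_to_variant` function is also made up for this example):

```rust
/// Hypothetical helper: strips `suffix` and converts the rest to UpperCamelCase,
/// e.g. `CONNECTION_CONNECTED_ITEM_NAME` with `_ITEM_NAME` becomes `ConnectionConnected`.
/// Returns `None` if the suffix is missing or the result is not a usable identifier.
fn constant_to_variant(constant: &str, suffix: &str) -> Option<String> {
    let stem = constant.strip_suffix(suffix)?;
    let mut variant = String::with_capacity(stem.len());
    for word in stem.split('_').filter(|w| !w.is_empty()) {
        let mut chars = word.chars();
        if let Some(first) = chars.next() {
            variant.push(first.to_ascii_uppercase());
            variant.extend(chars.map(|c| c.to_ascii_lowercase()));
        }
    }
    // A Rust identifier cannot be empty or start with a digit.
    match variant.chars().next() {
        Some(c) if c.is_ascii_alphabetic() => Some(variant),
        _ => None,
    }
}

/// Convenience wrapper matching the call in the build script above.
fn property_name_to_enum_variant(name: &str) -> String {
    constant_to_variant(name, "_PROPERTY_NAME").expect("not a usable property constant")
}
```

two different constants could in principle collapse into the same variant name (for example when digits are involved), so the build script should probably check the generated names for duplicates.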


it's ok you don't know the answer, that's why you ask. what I meant to say is, you should explain with at least a little bit of background and use cases, preferably with examples. you cannot just assume people know what the indigo library does, and how it works.

if you don't explain it, how could they know what a "property" is, what an "item" is, and how they are used?

when I searched the indigo repository, it seems these string literals (C-style #define preprocessor macros) are essentially some kind of predefined "keys", and they are only used for equality comparison (semantically like hash table lookups, but implemented as linear searches in arrays).

I see typical usages in the code look like:

if (!strcmp(property->name, CONNECTION_PROPERTY_NAME)) {
    if (indigo_get_switch(property, CONNECTION_CONNECTED_ITEM_NAME)) {
        // do something
    }
}

if I understand this correctly, it is roughly equivalent to something like the following pseudocode:

if device.connection.connected {
    // the device is connected
}
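on the rust side of the binding, the `strcmp` check could be mirrored without copying the name out of the struct. a minimal sketch, where `RawProperty` and its 128-byte buffer are assumptions standing in for whatever bindgen generates, and `name_matches` is a made-up helper:

```rust
use std::ffi::{c_char, CStr};

// Stand-in for the bindgen constant.
pub const CONNECTION_PROPERTY_NAME: &[u8; 11] = b"CONNECTION\0";

/// Hypothetical stand-in for the FFI property struct; the buffer size is an assumption.
#[repr(C)]
pub struct RawProperty {
    pub name: [c_char; 128],
}

/// Safe equivalent of `!strcmp(property->name, constant)`.
/// Returns `false` if either side is not a properly NUL-terminated string.
pub fn name_matches(field: &[c_char], constant: &[u8]) -> bool {
    // SAFETY: `c_char` is `i8` or `u8`, which have the same size and alignment as `u8`.
    let bytes = unsafe { std::slice::from_raw_parts(field.as_ptr().cast::<u8>(), field.len()) };
    match (
        CStr::from_bytes_until_nul(bytes),
        CStr::from_bytes_with_nul(constant),
    ) {
        (Ok(a), Ok(b)) => a == b,
        _ => false,
    }
}
```

`CStr::from_bytes_until_nul` stops at the first NUL in the fixed-size buffer, so the unused tail of the array does not take part in the comparison.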

if the library were designed in a language with an advanced type system (such as rust), it might have a completely different look and feel, for example, all these different "properties" might be distinct types. however, since you are creating a binding to a C library, it would be interesting to see how the wrapper APIs will be designed (it would be challenging to make it idiomatic and efficient at the same time).

here are some examples of possible use cases, based on my limited understanding of the library.

using the original C API:

	indigo_change_switch_property_1(
		my_client,
		my_device,
		"CONNECTION", //<--- CONNECTION_PROPERTY_NAME
		"CONNECTED", //<--- CONNECTION_CONNECTED_ITEM_NAME
		true
	);

one possible rust wrapper API:

my_client.change_property(
    &my_device, 
    PropertyName::Connection,
    PropertyItemName::ConnectionConnected,
    true
);

this is very straightforward to implement and can be generated automatically by the build script, but it's really NOT much different from calling the ffi API directly:

indigo_change_switch_property_1(
    &my_client,
    &my_device, 
    CONNECTION_PROPERTY_NAME.as_ptr(),
    CONNECTION_CONNECTED_ITEM_NAME.as_ptr(),
    true
);

this style of "thin wrapper" will have the same problems as the underlying C APIs. for instance, you will not catch this kind of error at compile time:

my_client.change_property(
    &my_device, 
    // note the property name and item name don't match each other
    PropertyName::Info,
    PropertyItemName::ConnectionConnected,
    true
);

on the other hand, it is possible to get a completely different API like this:

// possibly implemented with extension traits
use ConnectionPropertyExt;
if !my_client.connection(&my_device).connected() {
    my_client.connection(&my_device).set_connected(true);
}

but then it cannot be generated automatically (at least not as easily), so a lot of manual work must be done. essentially, you are redesigning a different API and creating an adapter to the original C implementation.

so, it's all trade-offs, just like any API design.
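a possible middle ground between the two styles is one item enum per property, which a build script might still be able to generate if it can tell which items belong to which property. a minimal sketch, where the `PropertyItem` trait, `ConnectionItem`, and `MockClient` are all made up for illustration, and the mock only records requests instead of calling into the C library:

```rust
/// Implemented by one item enum per property.
pub trait PropertyItem: Copy {
    /// Name of the property that owns the item.
    const PROPERTY: &'static str;
    /// Name of the item itself.
    fn item_name(self) -> &'static str;
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConnectionItem {
    Connected,
    Disconnected,
}

impl PropertyItem for ConnectionItem {
    const PROPERTY: &'static str = "CONNECTION";
    fn item_name(self) -> &'static str {
        match self {
            Self::Connected => "CONNECTED",
            Self::Disconnected => "DISCONNECTED",
        }
    }
}

/// Hypothetical client that records (device, property, item, value) requests.
#[derive(Default)]
pub struct MockClient {
    pub sent: Vec<(String, &'static str, &'static str, bool)>,
}

impl MockClient {
    /// The property name is derived from the item type, so the two cannot disagree.
    pub fn change_switch<I: PropertyItem>(&mut self, device: &str, item: I, value: bool) {
        self.sent
            .push((device.to_owned(), I::PROPERTY, item.item_name(), value));
    }
}
```

with this shape, the mismatched call from the earlier example simply cannot be written, because there is no separate property argument to get wrong.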


so I did a quick search in the indigo repository. it seems these string literals are #define-es. this can be collected with bindgen's ParseCallbacks. it will probably look like this:

I was looking into the details of the ParseCallbacks trait but it would have taken me a long time to come up with anything close to the sample code you provided. This helps a lot - thanks!


if you don't explain it, how could they know what a "property" is, what an "item" is, and how they are used?

True, although I thought it did not matter much as the original question was less about a good API design for this specific case and more about the more "technical" issue of mapping the const byte string literals generated by bindgen to Rust enums "en masse".

For what it is worth, INDIGO is a lib and standard that builds on and extends the INDI protocol and model used to manage astronomical devices with properties attached to a bus where each property has one or more named items that represent property values, i.e. the items are logically key/value pairs.

  • INDIGO is fully asynchronous and based on callbacks for retrieving and manipulating all device properties.
  • All devices are dynamic and generic, but a number of interfaces each represent a set of standard properties that can be either mandatory or optional for a specific interface.
  • All items of a property are of the same type: Text (String), Number (float), Switch (boolean), Light (Idle, Busy, Ok, and Alert), or Blob (bytes)
  • The INFO property is required for all drivers and contains a DEVICE_DRIVER item with a string formatted bitmap for all the interfaces implemented by a driver and a CONNECTION item that indicates if the device is connected to the bus.
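The rule that all items of a property share one type could be modelled roughly as follows. This is only a sketch: the type names are made up for illustration, and `f64` for numbers and `Vec<u8>` for blobs are assumptions:

```rust
/// State of a light item, as listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Light {
    Idle,
    Busy,
    Ok,
    Alert,
}

/// The named items of one property. Choosing a single variant per property
/// enforces that all of its items have the same type.
#[derive(Debug, Clone, PartialEq)]
pub enum PropertyItems {
    Text(Vec<(String, String)>),
    Number(Vec<(String, f64)>),
    Switch(Vec<(String, bool)>),
    Light(Vec<(String, Light)>),
    Blob(Vec<(String, Vec<u8>)>),
}

impl PropertyItems {
    /// Looks up a switch item by name; `None` for other property types or unknown items.
    pub fn switch(&self, item: &str) -> Option<bool> {
        match self {
            Self::Switch(items) => items
                .iter()
                .find(|(name, _)| name == item)
                .map(|(_, value)| *value),
            _ => None,
        }
    }
}
```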

The INDIGO lib is used to develop clients that manipulate the devices, drivers that provide one or more devices, or agents that combine client and driver functionality. It also contains a built-in server with a local bus that provides a runtime environment for clients and drivers. Multiple runtimes that each contain a local bus can be connected to form a logical bus that spans network connections.

There are 128 drivers, 12 agents, 20 interfaces with over 1200 predefined properties and items in INDIGO.


I am currently exploring what a Rust version of the INDIGO API would look like from a client development perspective, and the approach so far consists of

  • A Model trait for handling the INDIGO define, update, delete, and send asynchronous callbacks.
  • A FlatPropertyModel that implements Model and provides a property centric and tabular model for interacting with INDIGO devices.
  • A DevicePropertyModel that implements Model and provides a three level hierarchical model that groups device-property-item.
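To make the shape of this more concrete, here is a heavily simplified sketch of the trait and the flat model. The method names, signatures, and the `Property` struct are placeholders chosen for this example and not the actual wrapper code:

```rust
use std::collections::BTreeMap;

/// Simplified property snapshot; a real one would carry typed items.
#[derive(Debug, Clone, PartialEq)]
pub struct Property {
    pub device: String,
    pub name: String,
    pub items: Vec<(String, String)>,
}

/// One method per callback named above (signatures are hypothetical).
pub trait Model {
    fn on_define(&mut self, property: Property);
    fn on_update(&mut self, property: Property);
    fn on_delete(&mut self, device: &str, name: &str);
    fn on_send(&mut self, device: &str, message: &str);
}

/// Flat, property-centric model keyed by (device, property name).
#[derive(Debug, Default)]
pub struct FlatPropertyModel {
    pub properties: BTreeMap<(String, String), Property>,
    pub messages: Vec<String>,
}

impl Model for FlatPropertyModel {
    fn on_define(&mut self, property: Property) {
        let key = (property.device.clone(), property.name.clone());
        self.properties.insert(key, property);
    }
    fn on_update(&mut self, property: Property) {
        // In this sketch an update simply replaces the stored snapshot.
        self.on_define(property);
    }
    fn on_delete(&mut self, device: &str, name: &str) {
        self.properties.remove(&(device.to_owned(), name.to_owned()));
    }
    fn on_send(&mut self, device: &str, message: &str) {
        self.messages.push(format!("{device}: {message}"));
    }
}
```

A hierarchical model would implement the same trait but nest the map by device first and property second.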

Note that when it comes to reading data in a client, the device concept is merely a grouping function. It becomes more important when manipulating device state and when writing drivers, both of which are a second priority at the moment.

To "rustify" the INDIGO library, I am using closures for some of the client-related callbacks and for notifying changes in the property state managed by the (safe) Rust version of property and device. Note that this is partially redundant with the Model trait but I see this as two levels of the API, where the trait forms the more fundamental level.

Possibly I will hide the Model trait to shield the downstream developer from the complexity; we will see. The drawbacks are the departure from the original INDIGO API and not allowing customisation.


Regarding the examples (thanks!), the INFO property with the CONNECTED item is one of a few examples where it makes sense to do some kind of richer API that does not require the use of enums or constants for property and item names as the property exists for all devices.

Mapping the interfaces and properties to structs and methods would rob the API of its dynamic and generic nature. Even using enum constants in the way I was thinking when posting this question is questionable, as the fixed values would also limit the dynamic nature of INDIGO. Possibly I could do something like this though:

pub enum PropertyItemName {
    ConnectionConnected,
    ConnectionDisconnected,
    ...,
    /// Property name that is unknown at development time and used as a default for unknown properties.
    DynamicProperty(String),
}
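A minimal, compilable sketch of how such a fallback variant could behave, with only two predefined variants spelled out. Note that the `String` payload means the enum can no longer be `Copy`:

```rust
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum PropertyItemName {
    ConnectionConnected,
    ConnectionDisconnected,
    /// Name that is unknown at development time.
    DynamicProperty(String),
}

impl PropertyItemName {
    /// The wire name, either from a predefined constant or from the dynamic payload.
    pub fn name(&self) -> &str {
        match self {
            Self::ConnectionConnected => "CONNECTED",
            Self::ConnectionDisconnected => "DISCONNECTED",
            Self::DynamicProperty(name) => name.as_str(),
        }
    }
}

impl From<&str> for PropertyItemName {
    /// Known names map to their variant; everything else is kept verbatim.
    fn from(name: &str) -> Self {
        match name {
            "CONNECTED" => Self::ConnectionConnected,
            "DISCONNECTED" => Self::ConnectionDisconnected,
            other => Self::DynamicProperty(other.to_owned()),
        }
    }
}
```

Because the conversion never fails, `From<&str>` fits here, whereas an enum without the fallback variant would need `TryFrom` or an `Option`.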

@nerditation thanks a lot for the feedback and tips, and sorry for writing a novella...
