Rust/OCaml bindings and Traits

Hello Rust experts!

After hand-rolling some Rust-OCaml bindings, I started thinking about how I could automate the process (if not completely, then at least partially, by generating not-quite-working boilerplate that a human could fix).

I'll be honest, I'm not entirely sure this is possible at all :smiley: But right now I'm thinking about traits and how I can work with them in generated bindings.

Some background on how Rust objects are exposed to OCaml. In most cases they are exposed as opaque objects with associated functions that operate on them. The OCaml FFI is quite similar to Python's: there are C API functions that manipulate the state of managed objects (interact with the GC, etc.). On the Rust side, the FFI is a set of extern "C" functions that OCaml calls.
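To make the "opaque object plus associated functions" shape concrete, here is a minimal sketch that uses only std. It is not the OCaml C API: real bindings exchange OCaml values rather than raw pointers, and the Counter type and the counter_* function names are made up for illustration.

```rust
/// Hypothetical Rust type that the foreign side only ever sees as an opaque handle.
pub struct Counter {
    count: i64,
}

// Note: a real binding would also mark these functions as unmangled so that
// the foreign side can find them by name; that attribute is omitted here.

/// Create a counter and hand ownership to the caller as a raw pointer.
pub extern "C" fn counter_new(start: i64) -> *mut Counter {
    Box::into_raw(Box::new(Counter { count: start }))
}

/// Increment the counter behind the handle and return the new value.
///
/// # Safety
/// `handle` must come from `counter_new` and must not have been freed.
pub unsafe extern "C" fn counter_increment(handle: *mut Counter) -> i64 {
    let counter = unsafe { &mut *handle };
    counter.count += 1;
    counter.count
}

/// Take ownership back and drop the counter.
///
/// # Safety
/// `handle` must come from `counter_new` and must not be used afterwards.
pub unsafe extern "C" fn counter_free(handle: *mut Counter) {
    drop(unsafe { Box::from_raw(handle) });
}
```

The foreign side never looks inside Counter; it only passes the handle back into these functions.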

So an opaque value is created on the Rust side like this (code from ocaml-interop):

/// Allocate a `DynBox` for a value of type `A`.
pub fn alloc_box<A: 'static>(cr: &mut OCamlRuntime, data: A) -> OCaml<DynBox<A>> {
    let oval;
    // A fatter Box, points to data then to vtable
    type B = Pin<Box<dyn Any>>;
    unsafe {
        oval = ocaml_sys::caml_alloc_custom(&BOX_OPS_DYN_DROP, mem::size_of::<B>(), 0, 1);
        let box_ptr = ocaml_sys::field(oval, 1) as *mut B;
        std::ptr::write(box_ptr, Box::pin(data));
    }
    unsafe { OCaml::new(cr, oval) }
}

DynBox<T> is defined as follows:

/// [`OCaml`]`<OCamlFloatArray<T>>` is a reference to an OCaml `floatarray`
/// which is an array containing `float`s in an unboxed form.
pub struct OCamlFloatArray {}

/// `OCaml<DynBox<T>>` is for passing a value of type `T` to OCaml
///
/// To box a Rust value, use [`OCaml::box_value`][crate::OCaml::box_value].
///
/// **Experimental**
pub struct DynBox<A> {
    _marker: PhantomData<A>,
}

An important aspect of the OCaml GC is that it is a moving GC: a custom block allocated via ocaml_sys::caml_alloc_custom can be moved to another memory location during a GC phase. As far as I understand, the Pin<Box<dyn Any>> is an attempt to encode that assumption in Rust.

This works great: you export your opaque type to OCaml, along with some C functions that can recover it from OCaml and call Rust methods on it.

Now back to the traits problem... When your Rust type implements a trait, it gains some methods. Right now I just write wrappers for those methods for each type, but it would be nice to get rid of that boilerplate by defining the trait binding code in one place and applying it to all objects that implement the trait. Another part of the problem is that a Rust API sometimes expects "anything satisfying a trait", and I'm a bit lost on how to encode that in Rust/OCaml interop.

That dyn Any inside the pinned box is nice: it catches cases where the wrong object is passed from the OCaml side and panics with a somewhat actionable error. But it does not work when I have a DynBox<A> and want to pass it somewhere a DynBox<dyn MyTrait> is expected. Based on my internet search results, Rust does not have enough information at runtime to figure out whether the value behind dyn Any implements MyTrait; it only has the type id, which guarantees a safe downcast from Any back to the concrete type.
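A small sketch of that limitation, using Display as a stand-in for MyTrait (the function names are made up for illustration). Downcasting to a concrete type works; getting a trait object out of dyn Any only works here because the candidate concrete types are listed by hand.

```rust
use std::any::Any;
use std::fmt::Display;

/// Recovering a concrete type works: `Any` carries the `TypeId` of the value.
pub fn as_i32(value: &dyn Any) -> Option<i32> {
    value.downcast_ref::<i32>().copied()
}

/// `dyn Any` cannot be asked "do you implement Display?", so this sketch
/// enumerates the concrete types it knows about and coerces each one.
pub fn as_display(value: &dyn Any) -> Option<&dyn Display> {
    if let Some(v) = value.downcast_ref::<i32>() {
        Some(v as &dyn Display)
    } else if let Some(v) = value.downcast_ref::<String>() {
        Some(v as &dyn Display)
    } else {
        // f64 implements Display too, but it is not listed above, so it is missed.
        None
    }
}
```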

Ideally I want to write a set of C binding functions for the trait methods, and then have them apply to all types that implement the trait.
Is it possible to have some registry of type id => dyn trait convertor functions? It could probably be populated safely by hand with some macro, and in my trait-binding functions I would just pass my dyn Any and the desired dyn Trait to a lookup function, which returns the corresponding converter function, indexed by the type id of the concrete type behind dyn Any and the id of dyn Trait. If dyn Trait does not really have a type id, the stringified fully-qualified name of the trait would probably be good enough.

Any pointers are greatly appreciated.


A Pin is a claim that the referent (dyn Any) of the pointer (Box) won't be moved. It is only useful if there is some function in your program that accepts Pin<Box<dyn Any>>, or Pin<&mut dyn Any> or Pin<&dyn Any>, and that is unlikely. Most Rust types don't need to be pinned, and the big case where pinning is important is Future, but you can't downcast from dyn Any to dyn Future, so you can't make use of that.

Most likely, you get no benefit from involving Pin, and it has nothing to do with making things safe for the GC to move (that's the default assumption).
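As a rough illustration (the helper names are hypothetical, and this only mirrors the alloc_box code above in spirit): what gets written into the OCaml block is the Box itself, which is just a pointer. Moving that pointer around, as a moving GC would when it relocates the block, leaves the heap value where it was, with no Pin involved.

```rust
use std::any::Any;

/// Address of the heap allocation behind a box (data pointer only, vtable discarded).
pub fn heap_addr(boxed: &Box<dyn Any>) -> usize {
    (&**boxed as *const dyn Any) as *const u8 as usize
}

/// Returns true if the boxed value stays at the same address after the Box is moved.
pub fn address_is_stable_across_moves() -> bool {
    let boxed: Box<dyn Any> = Box::new(42_i32);
    let before = heap_addr(&boxed);
    // Moving the Box into a Vec copies only the (fat) pointer;
    // the i32 it owns stays at the same heap address.
    let holder = vec![boxed];
    let after = heap_addr(&holder[0]);
    before == after
}
```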

It's true that the “if” information doesn't exist, but the more important part is that the vtable doesn't exist. Vtables are only created when needed, so the only way a vtable for some concrete type Foo implementing MyTrait exists is if, somewhere in the program, there is a coercion from Foo to dyn MyTrait — that specific combination of type and trait. So, you would indeed need a “registry of type id => dyn trait convertor functions”, because that collection of functions is precisely what instructs the compiler to create the necessary vtables at all. But in order to satisfy types on the Rust side, you'll need a separate registry for each dyn Trait type you want to be able to produce, or to double-box in another dyn Any to fit all cases:

use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::fmt::Display;

type CoercionInAny = fn(Box<dyn Any>) -> Box<dyn Any>;

#[derive(Default)]
struct Registry(HashMap<(TypeId, TypeId), CoercionInAny>);
impl Registry {
    pub fn new() -> Self {
        Self::default()
    }
    
    pub fn register<In: ?Sized + 'static, Out: ?Sized + 'static>(&mut self, f: CoercionInAny) {
        self.0.insert((TypeId::of::<In>(), TypeId::of::<Out>()), f);
    }
    
    pub fn coerce<Out: 'static>(&self, input: Box<dyn Any>) -> Out {
        let f = self.0.get(&((*input).type_id(), TypeId::of::<Out>())).expect("unavailable");
        *f(input).downcast().expect("coercion fn returned wrong type")
    }
}

// something must generate this code for each *trait* desired
fn into_dyn_display<T: std::fmt::Display + 'static>(boxed_t: Box<dyn Any>) -> Box<dyn Any> {
    let dynified: Box<dyn std::fmt::Display> = boxed_t.downcast::<T>().unwrap();
    Box::new(dynified)
}

fn main() {
    let mut registry = Registry::new();
    // something must generate this code for each *input type* desired
    registry.register::<i32, Box<dyn Display>>(into_dyn_display::<i32>);
    registry.register::<String, Box<dyn Display>>(into_dyn_display::<String>);
    
    let values: Vec<Box<dyn Any>> = vec![Box::new(1), Box::new(String::from("two"))];
    for value in values {
        let coerced = registry.coerce::<Box<dyn Display>>(value);
        println!("{coerced}");
    }
}
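For comparison, the other option mentioned above (a separate registry for each dyn Trait type) could look roughly like the following. This is only a sketch: DisplayRegistry and its methods are illustrative names, and a real binding generator would need one such registry per trait.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::fmt::Display;

/// Converter for one concrete type: no second `dyn Any` layer is needed
/// because the output type is fixed by the registry.
type ToDisplay = fn(Box<dyn Any>) -> Box<dyn Display>;

/// Registry dedicated to a single trait (`Display` in this sketch).
#[derive(Default)]
pub struct DisplayRegistry(HashMap<TypeId, ToDisplay>);

impl DisplayRegistry {
    /// Registering `T` instantiates `convert::<T>`, which is what makes the
    /// compiler emit the `T as dyn Display` vtable.
    pub fn register<T: Display + 'static>(&mut self) {
        fn convert<T: Display + 'static>(input: Box<dyn Any>) -> Box<dyn Display> {
            let concrete: Box<T> = input
                .downcast::<T>()
                .expect("registry looked up the wrong converter");
            concrete
        }
        self.0.insert(TypeId::of::<T>(), convert::<T>);
    }

    /// Returns the input unchanged in `Err` if its type was never registered.
    pub fn coerce(&self, input: Box<dyn Any>) -> Result<Box<dyn Display>, Box<dyn Any>> {
        // Deref first so we get the TypeId of the boxed value, not of the Box.
        let id = (*input).type_id();
        match self.0.get(&id) {
            Some(convert) => Ok(convert(input)),
            None => Err(input),
        }
    }
}
```

The trade-off is one registry type per trait in exchange for a statically typed result and one fewer allocation per coercion.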

Hey @kpreid, thanks for your help! Sorry for the late reply; it took me some time to digest this and integrate it into bigger tooling to verify that it's actually usable end-to-end.

I've pushed this work to a repo on GitHub; the registry implementation can be found here. ocaml-rs-smartptr is a project that aims to provide a smart pointer for ocaml-rs that lets you define bindings for traits in one place and then automagically coerce other types so that those trait bindings are usable with them (subject to manual registration of the type/trait relationship).

I tried leveraging the cargo doc JSON output to auto-generate the registration but decided not to go that way for now: it seems complicated and fragile, although the cargo doc JSON output looks like a promising source of information for this kind of thing.

Another notable crate worth mentioning is reffers-rs, which solves similar problems for a Lua-scripted game engine. reffers-rs itself is not Lua-specific and can probably be used for any FFI. I decided to go with an implementation based on the answer above, because reffers-rs depends on a number of unstable Rust features and seems like a fragile solution in the long run for that reason. The solution suggested above uses stable Rust and does not depend on unsafe code, which sounds better to me.
