How do I implement an external trait for an external struct?

Apologies for the newbie question. I am still working my way through the Rust book and, paradoxically, the borrow system hasn't been an issue so far, but serialising things to Erlang NIFs definitely is.

I am using rustler and I have code like this:

fn close<'a>(env: Env<'a>, args: &[Term<'a>]) -> Result<Term<'a>, Error> {
    match args[0].decode::<rusqlite::Connection>() {

Which predictably blows up with:

error[E0277]: the trait bound `rusqlite::Connection: rustler::Decoder<'_>` is not satisfied
  --> src/lib.rs:68:19
   |
68 |     match args[0].decode::<rusqlite::Connection>() {
   |                   ^^^^^^ the trait `rustler::Decoder<'_>` is not implemented for `rusqlite::Connection`

A quick search tells me Rust does not allow implementing a remote trait for a remote type (that is, when neither the trait nor the type is defined in the local crate), and one offered solution[0] is to wrap the remote struct in my own and do #[derive rustler::Decoder] on that.

Is that the only option? Are there other ways I could achieve my desired result -- namely respect rustler's conventions on how to serialise a Rust type to Erlang/Elixir?

Apologies, I realise this is a rather novice question.

[0] https://stackoverflow.com/questions/25413201/how-do-i-implement-a-trait-i-dont-own-for-a-type-i-dont-own

You can add your own type. That said, I'm a bit puzzled: Are you trying to deserialize something into a database connection object?

I am just trying to return the rusqlite::Connection structure to Elixir code since I'd like to later be able to use it in all sorts of sqlite3 DB operations from there again. (Basically, I am writing a rustler wrapper for rusqlite.)

Anything silly I am doing along the way is accidental; I am focused on this one task, which I haven't managed to accomplish so far.

In the OP example I am trying to deserialise a DB connection object from the parameters that should already have been passed to the close function. I should probably have copy-pasted my open function code instead, since I kind of started backwards with this topic:

fn open<'a>(env: Env<'a>, args: &[Term<'a>]) -> Result<Term<'a>, Error> {
    match rusqlite::Connection::open_in_memory() {
        Ok(conn) => {
            Ok((atoms::ok(), conn).encode(env))
        },
        Err(err) => {
            Ok((atoms::error(), format!("{}", err)).encode(env))
        },
    }
}

Which errors with:

error[E0599]: no method named `encode` found for type `(rustler::Atom, rusqlite::Connection)` in the current scope
  --> src/lib.rs:51:36
   |
51 |             Ok((atoms::ok(), conn).encode(env))
   |                                    ^^^^^^ method not found in `(rustler::Atom, rusqlite::Connection)`
   |
   = note: the method `encode` exists but the following trait bounds were not satisfied:
           `(rustler::Atom, rusqlite::Connection) : rustler::Encoder`

You will have to create a wrapper type for your connection type and implement the trait for that type. I don't know how that interacts with encoding a DB connection — you may have to resort to just writing the connection string and establishing a new connection on decode.
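The wrapper approach can be illustrated with standard-library types only. This is a minimal sketch, not rustler code: `Vec<String>` stands in for the foreign struct (`rusqlite::Connection`), `std::fmt::Display` stands in for the foreign trait (`rustler::Decoder` / `rustler::Encoder`), and the name `Wrapper` is made up for the example.

```rust
use std::fmt;

/// Local wrapper ("newtype") around a type defined in another crate.
/// `Vec<String>` is only a stand-in for `rusqlite::Connection` here.
pub struct Wrapper(pub Vec<String>);

/// `fmt::Display` is a stand-in for a trait we do not own.
/// Implementing it directly for `Vec<String>` is rejected by the
/// orphan rule (error E0117), but `Wrapper` is a local type, so this
/// impl is allowed.
impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

/// Optional convenience: lets callers use the wrapper like the inner type.
impl std::ops::Deref for Wrapper {
    type Target = Vec<String>;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}
```

With this in place, `Wrapper(vec!["hello".to_string()]).to_string()` compiles, while the same call on a bare `Vec<String>` does not. Whether the wrapped value can be meaningfully encoded is a separate question, which is the concern raised about the connection object.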

The rusqlite::Connection struct has 3 fields that are also structs inside the same library. Do you think I might fall into a rabbit hole chasing this? I don't mind hard work, just wondering if I might end up in a doomed pursuit.

Using #[derive Decoder] didn't work and since I still don't understand traits, the derive attribute and procedural macros I'll probably have to just keep working my way through the book until I have a better grasp of Rust.

I am also not at all sure how I can manually serialise / deserialise structs when doing impl Decoder for MyOwnStructContainingExternalOne, and since I am not looking for somebody to do my homework, I guess it's back to the drawing board. :slight_smile:

Thanks for being with me so far.

I do not think trying to encode the connection itself sounds like a good idea. I don't know much about NIFs, but it sounds like you should aim for something like giving Erlang a raw pointer to a Rust box, and having Erlang call Rust methods with this raw pointer as an argument, and then have Rust access the contents.

This makes sense; I thought, and still think, that I am missing something fairly obvious with regard to abstracting away the access to the native Rust resource, as opposed to trying to [de]serialise it and pass it around -- which is likely hugely unsafe and dangerous. But I still have no clue how to do that, so I'll just pause the project and keep working on my Rust education.

As for Erlang NIFs, they are nothing more than a standardised C interface and rustler's value-add is putting some Rust FFI on top so no crash can ever occur -- and using Rust instead of C, of course. :smiley: Nothing exotic really.

I couldn't find anything in rustler that allowed exposing a rust function to erlang, so I'm not quite sure how they should interact. If you can set up a shim of Rust functions that Erlang can call, I can show you how to do the unsafe layer with a box.

I don't want to make this very long but here's the formulaic HOWTO example:

#[macro_use] extern crate rustler;

use rustler::{Encoder, Env, Error, Term};

mod atoms {
    rustler_atoms! {
        atom ok;
    }
}

rustler::rustler_export_nifs! {
    "Elixir.Xqlite.RusqliteNif", // Name of the Elixir module
    [
        ("add", 2, add),
    ],
    None
}

fn add<'a>(env: Env<'a>, args: &[Term<'a>]) -> Result<Term<'a>, Error> {
    let num1: i64 = args[0].decode()?;
    let num2: i64 = args[1].decode()?;

    Ok((atoms::ok(), num1 + num2).encode(env))
}

That's it. Now the Xqlite.RusqliteNif Elixir module has an add function that actually serialises arguments to Rust's runtime, the Rust add function is called and the response is encoded back for consumption by the Erlang/Elixir runtime.

I can't be of help in deciphering rustler, sadly. Here's one relatively quick article: https://hansihe.com/2017/02/05/rustler-safe-erlang-elixir-nifs-in-rust.html

But please, don't invest much time in this. There are several moving parts involved and getting to know them can prove to take more time than you'd like to give to a random forum post.

(I believe the shim you're looking for in this case is the Rust add function.)

Basically the idea is that you generate your data and put it in a Box, and then you call into_raw, which produces a raw pointer.

You can now pass this pointer to Erlang using some method that doesn't allow Erlang to inspect the pointer itself. Then when Erlang wants to use the pointer, it's passed to Rust, which uses a cast to create a pointer to the data type, and then uses unsafe to dereference it.

When you are done, you can destroy it by converting it back into a box with from_raw and dropping the box.

Having looked around the Term type a bit, I don't see how I would hand the opaque pointer to Erlang, but the idea of the approach is as described above.
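The into_raw / from_raw round trip described above can be sketched with the standard library alone. Everything named here is hypothetical: `FakeConnection` stands in for `rusqlite::Connection`, the `usize` handle stands in for whatever opaque value would be handed to Erlang, and the drop counter exists only to make the destruction observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts dropped connections; only here so the effect of `close` is visible.
static DROPPED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for `rusqlite::Connection`.
pub struct FakeConnection {
    pub path: String,
}

impl Drop for FakeConnection {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Moves the connection to the heap and returns its address as an opaque
/// handle. After `into_raw`, Rust no longer frees the value automatically.
pub fn open(path: &str) -> usize {
    let boxed = Box::new(FakeConnection { path: path.to_string() });
    Box::into_raw(boxed) as usize
}

/// Borrows the connection behind the handle without taking ownership.
///
/// # Safety
/// `handle` must come from `open` and must not have been passed to `close`.
pub unsafe fn path_of(handle: usize) -> String {
    let conn = unsafe { &*(handle as *const FakeConnection) };
    conn.path.clone()
}

/// Rebuilds the box from the handle and drops it, which drops the connection.
///
/// # Safety
/// `handle` must come from `open` and must be closed at most once.
pub unsafe fn close(handle: usize) {
    drop(unsafe { Box::from_raw(handle as *mut FakeConnection) });
}

/// Number of connections dropped so far.
pub fn dropped_count() -> usize {
    DROPPED.load(Ordering::SeqCst)
}
```

The connection lives for as long as nobody calls `close`, regardless of how many times the handle crosses the language boundary. The cost is that nothing stops a caller from using a handle after closing it, which is why both accessors are marked `unsafe`.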

That's part of the issue though; I'd like the DB connection struct itself to live until I call close (at which point rusqlite destroys its associated native resources).

Is what you are suggesting only going to create / destroy containers carrying the connection struct between the language barriers, or is it going to create / destroy it itself as well? Wasn't sure from your answer.

You create the DB connection itself when you create a box, and when you turn it back into a box on close, dropping the box will also drop the DB connection.

Yeah, no good.

Could this be the answer, what do you think? https://docs.rs/rustler/0.20.0/rustler/resource/struct.ResourceArc.html

Yep, that looks good. You use the ResourceArc instead of the box I proposed. It looks like you'll need to use the resource_struct_init! macro to be allowed to pass the data struct to the ResourceArc, and the documentation seems rather sparse ... :sweat_smile:

That's my biggest issue so far. :frowning: The documentation is sparse and I have to rely on parsing the Rust code myself which I am still pretty bad at.

One last thing, and I hope you don't mind. Could you post [pseudo-]code showing how you would use that struct and its macro for the open and close functions above? I'd like to produce a working example that opens an sqlite3 connection, passes it back to Elixir, and then lets Elixir call close with that connection as a parameter.

EDIT: Even if you don't -- and you really don't have to -- I am very grateful for the discussion.

I have not tested this, but just based on the previous snippet with add:

#[macro_use]
extern crate rustler;

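use rustler::resource::ResourceArc;
use rustler::{Encoder, Env, Error, Term};

mod atoms {
    rustler_atoms! {
        atom ok;
    }
}

rustler::rustler_export_nifs! {
    "Elixir.Xqlite.RusqliteNif", // Name of the Elixir module
    [
        ("new_db", 1, new_db),
        ("use_db", 1, use_db),
    ],
    Some(on_load) // Runs once when the NIF library is loaded
}

struct RustqliteData {
    conn: rusqlite::Connection
}

// Resource types can only be registered while the NIF library is loading,
// so the registration belongs in the load callback rather than in new_db.
fn on_load(env: Env, _info: Term) -> bool {
    resource_struct_init!(RustqliteData, env);
    true
}

fn into_rustler_err(err: rusqlite::Error) -> Error {
    ...
}

fn new_db<'a>(env: Env<'a>, args: &[Term<'a>]) -> Result<Term<'a>, Error> {
    let path: String = args[0].decode()?;

    let conn = rusqlite::Connection::open(&path).map_err(into_rustler_err)?;
    let data = RustqliteData {
        conn
    };

    // Hand ownership to the BEAM; Erlang only ever sees an opaque reference.
    let arc = ResourceArc::new(data);

    Ok((atoms::ok(), arc).encode(env))
}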

fn use_db<'a>(env: Env<'a>, args: &[Term<'a>]) -> Result<Term<'a>, Error> {
    let data: ResourceArc<RustqliteData> = args[0].decode()?;

    // use data here

    Ok((atoms::ok(), 0).encode(env))
}

You do not need to explicitly provide drop code in this case — when the garbage collector collects the last instance, the connection will be automatically dropped. That said, you can replace the conn field with an Option<rusqlite::Connection> and replace it with None in close, if you wish to provide an explicit close method.

Additionally if you wish to modify anything inside the Data struct, you will need to wrap it in a Mutex, as Erlang may call it from multiple threads simultaneously.
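The Option-plus-Mutex idea for an explicit close can be sketched without rustler. This is an illustration under assumptions: `std::sync::Arc` stands in for `ResourceArc`, `FakeConnection` stands in for `rusqlite::Connection`, and the names `ConnResource`, `open`, `query_path` and `close` are invented for the example.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `rusqlite::Connection`.
pub struct FakeConnection {
    pub path: String,
}

/// The shared resource. `None` means the connection was closed explicitly.
/// The `Mutex` allows mutation through a shared reference, which is all a
/// reference-counted handle gives you.
pub struct ConnResource {
    conn: Mutex<Option<FakeConnection>>,
}

/// Creates the resource; `Arc` plays the role of `ResourceArc` here.
pub fn open(path: &str) -> Arc<ConnResource> {
    Arc::new(ConnResource {
        conn: Mutex::new(Some(FakeConnection { path: path.to_string() })),
    })
}

/// Uses the connection if it is still open.
pub fn query_path(res: &ConnResource) -> Result<String, &'static str> {
    let guard = res.conn.lock().unwrap();
    match guard.as_ref() {
        Some(conn) => Ok(conn.path.clone()),
        None => Err("closed"),
    }
}

/// Takes the connection out of the slot and drops it right away.
/// Returns `true` if there was an open connection to close.
pub fn close(res: &ConnResource) -> bool {
    let taken = res.conn.lock().unwrap().take();
    taken.is_some()
}
```

Other handles to the same resource stay valid after `close`; they simply observe `None` and can report an error instead of touching a dead connection.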

Erlang will indeed call it from multiple threads but I'm pretty sure it won't modify it so I should be fine.

I can't say I understood everything that you said but you've given me a ton of stuff to think about and try in code.

Thanks a lot! I'll be trying things and will report back with findings and what worked in the end.

I've just seen another guy use serde_rustler to deal with serialisation, but his scenario wasn't the same as mine (namely working with native resources). Still, it's one more avenue I'll likely explore.

Your problem is that serializing the Connection object tries to turn a live database connection — with lots of internal state, pointers, locks, open file handles — into a flat bunch of dumb bytes that do nothing, and then Rust would probably close the actual connection for you (because the serialized copy doesn't keep ownership of the original). So you'd be left with a xeroxed copy of what used to be an internal state of a database connection, and no db connection.

What you need to do is to keep the Connection object on the Rust side and never send it to the Erlang side. The only thing Erlang needs to know is which connection object it's talking about, and the "which" question can be answered with a raw pointer value, which you can get from a Box or Arc or similar. It'll be just a 64-bit number, which is trivial to serialize and won't destroy the connection. You could also have a global array of Rust connections and send an integer to Erlang that is an index in that array (but that's just an example, in practice sending pointers is easier to manage).
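The "global array plus integer index" variant mentioned above might look roughly like this. It is a sketch with made-up names: `FakeConnection` stands in for `rusqlite::Connection`, and the table is a plain `Vec` behind a `Mutex` whose slots are never reused.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for `rusqlite::Connection`.
pub struct FakeConnection {
    pub path: String,
}

/// All live connections stay on the Rust side; the other side only ever
/// sees an index into this table. Closed slots are left as `None`.
static CONNECTIONS: Mutex<Vec<Option<FakeConnection>>> = Mutex::new(Vec::new());

/// Stores a new connection and returns its index as the handle.
pub fn open(path: &str) -> usize {
    let mut table = CONNECTIONS.lock().unwrap();
    table.push(Some(FakeConnection { path: path.to_string() }));
    table.len() - 1
}

/// Looks the connection up by handle; `None` for unknown or closed handles.
pub fn path_of(handle: usize) -> Option<String> {
    let table = CONNECTIONS.lock().unwrap();
    let found = table
        .get(handle)
        .and_then(|slot| slot.as_ref())
        .map(|conn| conn.path.clone());
    found
}

/// Drops the connection stored under the handle.
/// Returns `true` if an open connection was actually closed.
pub fn close(handle: usize) -> bool {
    let mut table = CONNECTIONS.lock().unwrap();
    match table.get_mut(handle) {
        Some(slot) => slot.take().is_some(),
        None => false,
    }
}
```

Compared with raw pointers, a stale or bogus handle here yields `None` or `false` instead of undefined behaviour, at the price of a global lock and a table that only grows.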

Yes, this is what you want for the BEAM VM to hold onto arbitrary data and call a destructor on it when the BEAM VM is done with it. On the BEAM VM a resource is an opaque value, only useful for passing back through NIFs so they can perform work on it.

What @alice put in their recent post with the snippet looks correct at first glance.

The BEAM VM can and will call things from many threads, as its task-style actor system distributes workers with impunity. But no, it knows nothing about resources beyond "It's a pointer to data and I should call this callback function on it, passing it in along with my own environment, when I am done with it (GC'd)".
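The two properties discussed here, calls arriving from many threads and a destructor that runs once when the last reference goes away, can be mimicked in plain Rust. This is only an analogy: `Arc` plays the role of the BEAM's reference to the resource, OS threads play the role of schedulers, and `Resource` and `run_from_threads` are invented names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical resource: mutable state behind a `Mutex`, plus a shared
/// counter that records how many times the destructor has run.
pub struct Resource {
    pub queries_run: Mutex<u32>,
    drops: Arc<AtomicUsize>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.drops.fetch_add(1, Ordering::SeqCst);
    }
}

/// Touches one shared resource from `n` threads, then releases the last
/// reference. Returns (number of updates seen, number of destructor runs).
pub fn run_from_threads(n: u32) -> (u32, usize) {
    let drops = Arc::new(AtomicUsize::new(0));
    let resource = Arc::new(Resource {
        queries_run: Mutex::new(0),
        drops: Arc::clone(&drops),
    });

    let workers: Vec<_> = (0..n)
        .map(|_| {
            let handle = Arc::clone(&resource);
            thread::spawn(move || {
                // The lock serialises the concurrent updates.
                *handle.queries_run.lock().unwrap() += 1;
            })
        })
        .collect();

    for worker in workers {
        worker.join().unwrap();
    }

    let total = *resource.queries_run.lock().unwrap();
    // Last reference goes away here, so the destructor runs exactly once.
    drop(resource);
    (total, drops.load(Ordering::SeqCst))
}
```

Calling `run_from_threads(4)` yields `(4, 1)`: every thread's update is counted, and the destructor fires a single time no matter how many clones of the handle existed along the way.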