Advice on how to design an introspection system in Rust

Hi!

Could you please advise me on how to approach the following problem?

I want to develop a scripting system that can interact with Rust crates. Script users should be able to call into Rust functions, and Rust users should be able to export their Rust functions and types to the script engine.

I would like to make the exporting infrastructure as seamless and convenient for Rust crate developers as possible. The end result of the export is a single function per user crate that exposes the entire crate's introspection data.

The scripting language that I'm developing has relatively limited semantics compared to Rust's, so Rust developers will be able to export only a limited subset of Rust functions and types, but this is intended. The scope of my question is only about collecting the introspection metadata of the Rust code.

To organize the introspection, I think I can implement a proc macro that the Rust crate developer will apply to their Rust code constructs (structs, functions, trait impls, etc.). The output of this macro will be some kind of function that returns the introspection metadata for the applicable part of the code. I have an idea of how to implement such a macro, but I have no idea how to enforce or automate gathering the calls to these functions into a single collection of metadata for the entire user crate.
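To make the idea concrete, here is a minimal sketch of what the per-item output could look like. It uses a declarative `macro_rules!` macro as a stand-in for the proc macro (an assumption, so that the sketch compiles in a single file), and `FnMeta`, `export_fn!` and `add_meta` are hypothetical names chosen only for illustration. It shows the part I already know how to do: each exported item gets its own metadata function, but nothing gathers those functions together yet.

```rust
/// Hypothetical metadata record for one exported function.
/// The field names are assumptions made for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct FnMeta {
    pub name: &'static str,
    /// (parameter name, parameter type) pairs, as written in the source.
    pub params: &'static [(&'static str, &'static str)],
    pub ret: &'static str,
}

/// Declarative stand-in for the attribute proc macro: it emits the function
/// unchanged, plus a companion function returning its metadata.
macro_rules! export_fn {
    (fn $name:ident ( $($arg:ident : $argty:ty),* ) -> $ret:ty $body:block => $meta:ident) => {
        pub fn $name($($arg: $argty),*) -> $ret $body

        pub fn $meta() -> FnMeta {
            FnMeta {
                name: stringify!($name),
                params: &[$((stringify!($arg), stringify!($argty))),*],
                ret: stringify!($ret),
            }
        }
    };
}

export_fn! {
    fn add(a: i64, b: i64) -> i64 { a + b } => add_meta
}

fn main() {
    // The metadata is available per item, but the crate still has no single
    // entry point that lists every `*_meta` function.
    println!("{:?}", add_meta());
    println!("add(2, 3) = {}", add(2, 3));
}
```

The open question is how to find every generated `*_meta` function without asking the crate developer to enumerate them by hand.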

The problem is that I don't know how to organize a common shared context for all macro invocations within the user's crate. Rust's proc macro system is designed to be free of compile-time side effects. In this sense, having a single shared compile-time collection (either in shared memory or in a file) for macro invocations is probably a bad idea. An alternative approach would be to have the macro generate code that is later used at run time. But this approach is in turn limited by the fact that a Rust library crate cannot run code before main() (unless I resort to ctor hacks, which would in turn violate Rust's conventional execution model). Finally, I also don't want to force end-crate developers to implement build.rs pre-processing programs. IMO, that would confuse the users.

So I'm stuck on this problem. To sum up, I know how to generate the introspection metadata, but I don't know how to collect it into a single point per crate.

P.S. I don't mind changing the entire approach if there are any good alternatives for solving the initial problem.

The approach I would actually prefer (in any other language/"plugin" system as well) is to explicitly list the functions that the crate developer is willing to expose. This would prevent programmers from accidentally leaking any private code or data, and is less magical (== easier to debug).
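As a rough illustration of the explicit-list approach, here is a minimal sketch. It assumes that every exported function shares one script-facing signature, `fn(&[i64]) -> Result<i64, String>`; that signature and the names `Export`, `export_functions!` and `introspect` are all hypothetical and not taken from any existing crate.

```rust
/// Hypothetical descriptor for one function visible to scripts.
pub struct Export {
    pub name: &'static str,
    /// Assumed uniform calling convention for this sketch.
    pub call: fn(&[i64]) -> Result<i64, String>,
}

fn add(args: &[i64]) -> Result<i64, String> {
    match args {
        [a, b] => Ok(a + b),
        _ => Err(format!("add expects 2 arguments, got {}", args.len())),
    }
}

fn negate(args: &[i64]) -> Result<i64, String> {
    match args {
        [a] => Ok(-a),
        _ => Err(format!("negate expects 1 argument, got {}", args.len())),
    }
}

// Not listed in `export_functions!` below, so scripts cannot reach it.
#[allow(dead_code)]
fn internal_helper(_args: &[i64]) -> Result<i64, String> {
    Ok(0)
}

/// Generates the single per-crate entry point from an explicit list.
macro_rules! export_functions {
    ($($f:ident),* $(,)?) => {
        pub fn introspect() -> Vec<Export> {
            vec![$(Export { name: stringify!($f), call: $f }),*]
        }
    };
}

export_functions!(add, negate);

fn main() {
    let table = introspect();
    for export in &table {
        println!("exported: {}", export.name);
    }
    // Look a function up by name and call it, as a script engine might.
    let add_entry = table.iter().find(|e| e.name == "add").unwrap();
    println!("{:?}", (add_entry.call)(&[2, 3]));
}
```

Everything a script can reach is visible in one place, the `export_functions!` call, which is what makes the approach easy to audit and debug.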

If you don't want that anyway, then you can let the proc-macros accumulate state and metadata e.g. in a JSON file at compile-time, which you would then bundle with the compiled plugin and deserialize at load-time.


is to explicitly list the functions that the crate developer is willing to expose

I was thinking of going this way too. But the main drawback here is code maintenance: the crate developer can easily forget to list something.

You can use @dtolnay’s linkme to collect items from across the entire program into a static slice.

Aside: I have a draft blog post doing something similar; let me see if I can get it ready…


Edit: I've made the draft available; it describes the entire development of an extension language like the one you're writing. Unfortunately, I don't have the code posted anywhere public yet. The most interesting part of the code for you is probably the macro definitions below; I think everything else in the article should be clear enough without the code available.

/// Derive the `Dispatch` trait for a type, allowing commands to be added
/// with the `register!` macro
#[macro_export]
macro_rules! derive_dispatch {
    ($viz:vis $ty:ident $(-> $status:ty)? $(; errfmt(&$self:tt, $($errfmt:tt)+))?) => { $crate::paste::paste! {
        #[$crate::linkme::distributed_slice]
        $viz static [<$ty:upper _COMMANDS>]: [$crate::Cmd<$ty>] = [..];
        impl $crate::Dispatch for $ty {
            $crate::derive_dispatch!(@statusty $($status)?);
            fn dispatch(&mut self, cmdline: &str)->$crate::Status<Self::Status> {
                $crate::dispatch(&*[<$ty:upper _COMMANDS>], self, cmdline)
            }

            $(
                fn err_ctx(&$self)->String {
                    format!($($errfmt)+)
                }
            )?
        }
    }};
    (@statusty) => { type Status = ::std::ops::ControlFlow<()>; };
    (@statusty $t:ty) => { type Status = $t; };
}

/// Register new commands with a command parser created with `derive_dispatch!`
#[macro_export]
macro_rules! register {
    (<$mod:path> :: $target:ident => $(
        $(#[doc=$help:literal])* $key:ident ( $tname:ident $(, $arg:ident : $argty:ty)* ) $body:block
    )*) => {$($crate::paste::paste! {
        #[$crate::linkme::distributed_slice($mod::[<$target:upper _COMMANDS>])]
        static [<$target:upper _COMMANDS_ $key:upper>]: $crate::Cmd<$mod::$target> = $crate::Cmd {
            key: stringify!($key),
            args: stringify!($($arg:$argty),*),
            help: concat!($($help,'\n'),*),
            cmd: |$tname:&mut $mod::$target, mut args:&str| {
                $(
                    let $arg = match <$argty as $crate::ParseArg>
                                     ::parse_arg(&mut args) {
                        Ok(x) => x,
                        Err(reason) => {
                            return $crate::Status::err(&*$tname,
                                format_args!("Error parsing argument {}: {:?}",
                                             stringify!($arg), reason));
                        }
                    };
                    args = args.trim();
                )*
                if args.len() > 0 {
                    return $crate::Status::err(&*$tname, format_args!("Unexpected arguments {:?}", args));
                }
                #[allow(unused_braces)]
                $body
            }
        };
    })*};

    // Unspecified path means self
    ($target:ident => $($rest:tt)*) => {
        $crate::register! { <self>::$target => $($rest)* }
    };
}

The approach you proposed has certain drawbacks, but it is the best automated way I've discovered so far, and I'm going to go the same way. Thank you for the idea, and for the detailed explanation too!
