Advice on how to design an introspection system in Rust

Eliah-Lakhin · February 24, 2022, 5:57am

Hi!

Could you please advise me on how to approach the following problem?

I want to develop a scripting system that is capable of interacting with Rust crates. Script users should be able to call into Rust functions, and Rust users should be able to export their Rust functions and types into the script engine.

I would like to make the exporting infrastructure as seamless and convenient for Rust crate developers as possible. The end result of the export is a single function per user crate that exposes the crate's entire introspection data.

The scripting language that I'm developing has limited semantics compared to Rust's, so Rust developers will be able to export only a limited subset of Rust functions and types, but this is intended. The scope of my question is only about collecting the introspection metadata of the Rust code.

To organize the introspection, I think I can implement a proc macro that the Rust crate developer will apply to their Rust code constructs (structs, functions, trait impls, etc.). The output of this macro will be some kind of function that returns introspection metadata for the annotated part of the code. I have an idea of how to implement such a macro, but I have no idea how to enforce or automate collecting the calls to these functions into a single collection of metadata for the entire user crate.
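As a rough sketch of what such a macro might expand to, here is the output written out by hand. The `FnMeta` and `ParamMeta` types, the `#[export]` attribute, and the `__introspect_add` naming scheme are all hypothetical and only illustrate the shape of the generated code:

```rust
// Hypothetical metadata types; the names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct ParamMeta {
    pub name: &'static str,
    pub ty: &'static str,
}

#[derive(Debug, Clone, PartialEq)]
pub struct FnMeta {
    pub name: &'static str,
    pub params: &'static [ParamMeta],
    pub ret: &'static str,
}

// The user's function as written in their crate. An assumed `#[export]`
// attribute would sit on top of it.
pub fn add(a: i64, b: i64) -> i64 {
    a + b
}

// What the assumed attribute macro could emit next to `add`: a function
// that describes the signature using plain static data.
pub fn __introspect_add() -> FnMeta {
    FnMeta {
        name: "add",
        params: &[
            ParamMeta { name: "a", ty: "i64" },
            ParamMeta { name: "b", ty: "i64" },
        ],
        ret: "i64",
    }
}

fn main() {
    let meta = __introspect_add();
    println!(
        "{} takes {} parameters and returns {}",
        meta.name,
        meta.params.len(),
        meta.ret
    );
    println!("add(2, 3) = {}", add(2, 3));
}
```

Generating one such function per item is the easy part. Nothing in this sketch calls `__introspect_add` automatically, which is exactly the collection problem.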

The problem is that I don't know how to organize a common shared context for all macro invocations within the user crate's code. Rust's proc macro system is designed to be free of compile-time side effects. In this sense, having a single shared compile-time collection (either in shared memory or in a file) for macro invocations is probably a bad idea. An alternative approach would be to have the macro generate code that is later used at run time. But this approach is in turn limited by the fact that a Rust library crate cannot run code before main() (unless I resort to ctor hacks, which would in turn violate Rust's conventional execution model). Finally, I also don't want to force end-crate developers to implement build.rs preprocessing programs. IMO, that would confuse the users.

So, I'm stuck with this problem. To sum up, I know how to generate introspection metadata, but I don't know how to collect this metadata into a single point per crate.

P.S. I don't mind changing the entire approach if there are any good alternatives for the initial problem.

H2CO3 · February 24, 2022, 6:23am

The approach I would actually prefer (in any other language/"plugin" system as well) is to explicitly list the functions that the crate developer is willing to expose. This would prevent programmers from accidentally leaking any private code or data, and is less magical (== easier to debug).
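A minimal sketch of what explicit listing could look like, assuming a hypothetical `Export` record and an `exports()` entry point (neither comes from an existing crate). Only functions named in the list are reachable from the script side:

```rust
// Hypothetical export record: a script-visible name plus a callable.
pub struct Export {
    pub name: &'static str,
    pub call: fn(&[i64]) -> Result<i64, String>,
}

fn add(args: &[i64]) -> Result<i64, String> {
    match args {
        [a, b] => Ok(*a + *b),
        _ => Err(format!("add expects 2 arguments, got {}", args.len())),
    }
}

fn negate(args: &[i64]) -> Result<i64, String> {
    match args {
        [a] => Ok(-*a),
        _ => Err(format!("negate expects 1 argument, got {}", args.len())),
    }
}

/// The single per-crate entry point: everything exposed to scripts is
/// listed here by hand, so nothing leaks by accident.
pub fn exports() -> &'static [Export] {
    static EXPORTS: [Export; 2] = [
        Export { name: "add", call: add },
        Export { name: "negate", call: negate },
    ];
    &EXPORTS
}

/// Look up an exported function by name and call it.
pub fn call(name: &str, args: &[i64]) -> Result<i64, String> {
    let export = exports()
        .iter()
        .find(|e| e.name == name)
        .ok_or_else(|| format!("unknown function {}", name))?;
    (export.call)(args)
}

fn main() {
    println!("{:?}", call("add", &[2, 3]));
    println!("{:?}", call("negate", &[7]));
    println!("{:?}", call("missing", &[]));
}
```

The cost is that the list has to be kept in sync with the code by hand.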

If you don't want that anyway, then you can let the proc-macros accumulate state and metadata e.g. in a JSON file at compile-time, which you would then bundle with the compiled plugin and deserialize at load-time.

Eliah-Lakhin · February 24, 2022, 6:42am

is to explicitly list the functions that the crate developer is willing to expose

I was thinking of going this way too. But the main drawback here is code maintenance: the crate developer can easily forget to list something.

2e71828 · February 24, 2022, 7:16am

You can use @dtolnay’s linkme to collect items from across the entire program into a static slice.

(Aside: I have a draft blog post doing something similar; let me see if I can get it ready…)

Edit: I've made the draft available; it describes the entire development of an extension language like the one you're writing. Unfortunately, I don't have the code posted anywhere public yet. The most interesting part of the code for you is probably the macro definitions below; I think everything else in the article should be clear enough without the code available.

/// Derive the `Dispatch` trait for a type, allowing commands to be added
/// with the `register!` macro
#[macro_export]
macro_rules! derive_dispatch {
    ($viz:vis $ty:ident $(-> $status:ty)? $(; errfmt(&$self:tt, $($errfmt:tt)+))?) => { $crate::paste::paste! {
        #[$crate::linkme::distributed_slice]
        $viz static [<$ty:upper _COMMANDS>]: [$crate::Cmd<$ty>] = [..];
        impl $crate::Dispatch for $ty {
            $crate::derive_dispatch!(@statusty $($status)?);
            fn dispatch(&mut self, cmdline: &str)->$crate::Status<Self::Status> {
                $crate::dispatch(&*[<$ty:upper _COMMANDS>], self, cmdline)
            }

            $(
                fn err_ctx(&$self)->String {
                    format!($($errfmt)+)
                }
            )?
        }
    }};
    (@statusty) => { type Status = ::std::ops::ControlFlow<()>; };
    (@statusty $t:ty) => { type Status = $t; };
}

/// Register new commands with a command parser created with `derive_dispatch!`
#[macro_export]
macro_rules! register {
    (<$mod:path> :: $target:ident => $(
        $(#[doc=$help:literal])* $key:ident ( $tname:ident $(, $arg:ident : $argty:ty)* ) $body:block
    )*) => {$($crate::paste::paste! {
        #[$crate::linkme::distributed_slice($mod::[<$target:upper _COMMANDS>])]
        static [<$target:upper _COMMANDS_ $key:upper>]: $crate::Cmd<$mod::$target> = $crate::Cmd {
            key: stringify!($key),
            args: stringify!($($arg:$argty),*),
            help: concat!($($help,'\n'),*),
            cmd: |$tname:&mut $mod::$target, mut args:&str| {
                $(
                    let $arg = match <$argty as $crate::ParseArg>
                                     ::parse_arg(&mut args) {
                        Ok(x) => x,
                        Err(reason) => {
                            return $crate::Status::err(&*$tname,
                                format_args!("Error parsing argument {}: {:?}",
                                             stringify!($arg), reason));
                        }
                    };
                    args = args.trim();
                )*
                if args.len() > 0 {
                    return $crate::Status::err(&*$tname, format_args!("Unexpected arguments {:?}", args));
                }
                #[allow(unused_braces)]
                $body
            }
        };
    })*};

    // Unspecified path means self
    ($target:ident => $($rest:tt)*) => {
        $crate::register! { <self>::$target => $($rest)* }
    };
}

Eliah-Lakhin · February 26, 2022, 8:25pm

The approach you proposed has certain drawbacks, but it is the best automated way I have discovered so far, and I'm going to go the same way. Thank you for the idea, and for the detailed explanation too!
