How to dynamically build function calls in Rust

How do I take a vector of function argument AST variants, extract the values, and use them to instantiate a function call?

I am writing an interpreter that evaluates certain expressions. Some of the expressions are function calls. I am having a hard time figuring out how to translate the function calls AST to the actual call. The AST gives me the function name and a vector of arguments. I can lookup the function pointer to call from the name using a map, but passing the arguments to the fp is problem. Rust does not have a splat operator (argument expansion). I could pass them as a tuple and use destructuring of the arguments, but I can't figure out how to convert the vector of AST argument enum variants to a tuple of the concrete types.

I can't simply map or loop over the AST arguments to extract the values and produce a tuple.

I can use nested tuples to build a heterogenous list incrementally:

fn prepend<I,T>(i: I, t: T) -> (I,T) { (i, t) }

fn foo() {
    let x = ();
    let x = prepend(1, x);
    let x = prepend(2.0, x);
    let x = prepend(true, x);
}

But that only works because x gets shadowed and the new binding has a different type. This won't work:

fn foo() {
    let mut x = ();
    x = prepend(1, x);
    x = prepend(2.0, x);
    x = prepend(true, x);
}

Any ideas?

There are implementations of HList in Rust for example in hlist or frunk, you might want to take a look how they did it.

Thanks. I have already looked at them and they suffer from the same issue as internally they are implemented as nested tuples. E.g. a call to HList.prepend returns a value with a new type.

The thing that you describe trying to do is completely impossible in rust, because rust compiles down to machine code. Looking at the fully compiled program, one can't even tell how many arguments each function takes.

pub fn two_arg(a: f64, b: f64) -> f64 {
    a + b
}

pub fn one_arg(tup: (f64, f64)) -> f64 {
    tup.0 + tup.1
}

The optimized x86_64 assembly for these functions is identical:

example::one_arg:
        addsd   xmm0, xmm1
        ret

example::two_arg:
        addsd   xmm0, xmm1
        ret

So let's take a couple steps back: What is the real problem you're trying to solve?

2 Likes

I am writing a language that that matches events. E.g. assuming some event schema, match the event against something like e.foo == 1 and e.bar < 10.0 or e.fox = "nope". Such expression are not hard to implement, as the operators are binary operators and the types they support are limited (e.g. < is only supported on ints and floats). I already have something like that working.

But I'd also like to support (built-in) functions for more complex operations. E.g. start_with(e.name, "foo") and e.bar == 1.

This is where I am stuck. I have an AST representation of the function: a function name and a vector of AST argument variants. I can maintain a map of built-in functions to function signature metadata, such as argument number and types. I can maintain maps for different signatures to lookup the function pointers. But I am stuck taking that vector of AST argument variants and converting it to a tuple of values and event field accessors that I can use to build a closure that takes the events, calls the accessors to obtain the required field values, and calls the function.

Okay, so if I understand correctly, the functions you're trying to call are functions defined in your language? Not functions defined in Rust?

Edit: Nope, just noticed the word "built-in." Maybe I don't understand.

No. They are defined in Rust. The language does not allow for function definition, just calling the built in functions.

Can you show what one of these functions looks like currently?

(There are ways to handle this in a very dynamic fashion, but it could be cumbersome and I still want to see if there's a better way first!)

Note: It still won't be possible to just call plain rust functions that take n arguments. No matter what approach is taken, they'll need to be rewritten somehow. My hope is that it can be done with a macro.

The be a bit more specific, given a struct Function(String,Vec<Arg>) and

enum Arg {
    Attribute(String)
    Bool(bool),
    Float(f64),
    Int(i64),
    Str(String)
}

and a set of build-in functions, such as something simple like fn len(s: &str) -> i64 { s.len() as i64 }, then how to translate a Function("len".to_owned(),vec![Arg::Str("foo".to_owned())]) to something like |e: &Event| len(arg). This is a simple example because the argument is a literal, instead of an event field, which would require an accessor.

One possible solution is to have a match expression on the function name. E.g.

let Function(name, args) = function;
match name {
  "len" => {
    if args.len() != 1 {
      return Error(Error::WrongNumberOfFunctionArguments());
    }
    let val = match args[0] {
      Arg::Str(val) => val,
      _ => return Error(Error::WrongArgumentType()),
    }
    move |e: &Event| len(&val)
  },
  ...
}

Or something along those lines. But that would be incredible tedious and hard to maintain as you add more build-in functions. So I am looking for a more generic way to do so, by using function pointers and maps, but going from the AST argument vec to the call with arguments has me stumped.

Okay, I see, the trouble is factoring out the logic that validates the arguments.

Here's the first idea that comes to mind:


The goal will be to let our functions look something like this:

fn len(args: Vec<Arg>) -> Result<Arg, Error> {
    let hlist_pat![s]: Hlist![String] = FromArgs::from_args(args)?;

    Ok(Arg::Int(s.len() as i64))
}

// example with two arguments
fn add(args: Vec<Arg>) -> Result<Arg, Error> {
    let hlist_pat![a, b]: Hlist![i64, i64] = FromArgs::from_args(args)?;

    Ok(Arg::Int(a + b))
}

(you might have your own output type, I just reused Arg as that's what was available to me)

Here, all of the boilerplate is reduced to one line, by introducing a trait called FromArgs. This trait is implemented for any HList consisting of Strings, i64s, bools, and f64s. I used HList because that makes it easier to support arbitrary numbers of arguments. (If you wanted to do tuples you'd have to implement FromArgs separately for each tuple size, which would be annoying)

The implementation of this trait is simple:

/// Parses a single argument by checking its type and emitting
/// an error on type mismatches.
pub trait FromArg {
    fn from_arg(arg: Arg) -> Result<Self, Error>;
}

impl FromArg for String { ... }
impl FromArg for bool { ... }
impl FromArg for i64 { ... }
impl FromArg for f64 { ... }

/// Parse a list of arguments.
pub trait FromArgs {
    fn from_args(args: Vec<Arg>) -> Result<Self, Error>;
}

// validate 0 args
impl FromArgs for HNil {
    fn from_args(args: Vec<Arg>) -> Result<HNil, Error> {
        if !args.is_empty() {
            return Err(Error::WrongNumberOfFunctionArguments);
        }
        Ok(HNil)
    }
}

// validate n args by validating 1, then the other (n-1)
impl<H: FromArg, T: FromArgs> FromArgs for HCons<H, T> {
    fn from_args(mut args: Vec<Arg>) -> Result<Self, Error> {
        if args.is_empty() {
            return Err(Error::WrongNumberOfFunctionArguments);
        }

        // this has quadratic cost overall but almost certainly the max
        // function arity is small enough that it should not matter...
        let first_arg = args.remove(0);

        Ok(HCons {
            head: H::from_arg(first_arg)?,
            tail: T::from_args(args)?,
        })
    }
}
2 Likes

Thanks for the suggestion. I'll give it a try. One thing I was trying to do is avoid is passing in the Vec<Arg> into the function as an argument, and instead pass in a tuple or HList. That was both to avoid exposing the AST layer to the function layer, and perhaps misguidedly, to avoid having to repeatedly extract the arguments from the AST arguments, as this is meant to match events at a high rate. But maybe that is unavoidable.

Hm, I see. It should be fairly easy to make the design closer to that. I don't think the repeated extraction is unavoidable, it's just a quirk of how I wrote it.

match name {
    // a macro could easily generate these branches
    "str" => {
        let args = FromArgs::from_args(args)?;
        move |e: &Event| len(e, frunk::traits::ToRef::to_ref(&args));
    },
    "add" => {
        let args = FromArgs::from_args(args)?;
        move |e: &Event| add(e, frunk::traits::ToRef::to_ref(&args));
    },
}

fn len(_: &Event, hlist_pat![s]: Hlist![&String]) -> Result<Arg, Error> {
    Ok(Arg::Int(s.len() as i64))
}

fn add(_: &Event, hlist_pat![&a, &b]: Hlist![&i64, &i64]) -> Result<Arg, Error> {
    Ok(Arg::Int(a + b))
}

A couple of notes:

  • I tried to avoid cloining the String, to match your original behavior more closely. That was tricky, and led to most of the other oddities in this list.
  • .to_ref() turns &Hlist[A, B, C] into Hlist![&A, &B, &C].
  • I used frunk::traits::ToRef::to_ref(&args) because args.to_ref() would require a type annotation, and a macro wouldn't be able to tell whether it needs to use HNil or HCons<_, _> (it would need to know the function arity)
  • Notice I used the type &String, which is normally considered unidiomatic in favor of &str; I did this so that type inference could still deduce the correct types at the FromArgs::from_args lines.

Also, now that I look at what I've written, hlist_pat looks considerably noisier in argument position than I thought it would. Maybe tuples are worth the extra implementation effort:

fn len(_: &Event, (&a, &b): (&i64, &i64)) -> Result<Arg, Error> {
    Ok(Arg::Int(a + b))
}

I'm currently writing such kind of expressions myself and took another approch. One that's in my opinion more idomatic by leveraging Rust's typesystem.

First of all I built a type for each built-in function, in your case e.g.:

struct Len {
    arg: Box<Expression>,
}

For these kind of structs a trait for evaluation is created and implemented:

trait ExpressionStruct {
  fn evaluate(&mut self, e: &Event) -> Arg;
}

impl ExpressionStruct for Len {
  fn evaluate(&mut self, _: &Event) -> Arg {
    Arg::Int(len(&self.arg))
  }
}

// impl for const expressions like a single bool
impl ExpressionStruct for bool {
  fn evaluate(&mut self, _: &Event) -> Arg {
    Arg::Bool(*self)
  }
}

And finally an enum that wrapps all kinds of expressions:

enum Expression {
  BoolConst(bool),
  FloatConst(f64), 
  ...
  Len(Len),
  ...
}

impl Expression {
  fn evaluate(&mut self, e: &Event) -> Arg {
    match self {
      BoolConst(inner) => inner.evaluate(e),
      ...
      Len(inner) => inner.evaluate(e),
      ...
    }
  }
}

This way you can type-check the arguments at construction of the various expressions and I think this is much easier to reason about.

2 Likes

Meanwhile, I dug further into leveraging Rust's type system for building dynamic expressions.

My problem was that on each input for a ExpressionStruct the evaluate() had to perform a match on the Arg enum. Since these expression types won't change after the built this match should be removeable.


I came up with the following solution: Use concrete types as long as possible. To stick with the example of this topic, we assume that expressions can be of the types bool, String and i64.

First of all I use an Expression trait that is generic over its output:

trait Expression<T: ExpType> {
    type Context; // to include the example's Event input. Will be omitted for the rest.

    fn evaluate(&mut self, c: &Self::Context) -> T;
}

The ExpType is a marker trait for types that can be handled by Expression's, i.e.

trait ExpType: 'static + Clone + Debug + Into<ExpValue> {}
impl ExpType for bool {}
impl ExpType for String {}
impl ExpType for i64 {}

And ExpValue is used to abstract over all these ExpTypes:

enum ExpValue {
    Bool(bool),
    Int(i64),
    Str(String),
}

Second the expression structs are created as formerly proposed:

struct Len {
    arg: StringExpression,
}

impl Expression<i64> for Len {
    fn evaluate(&mut self) -> i64 {
        self.arg.evaluate().len() as _
    }
}

Third the typed expression enums are constructed, e.g. the StringExpression enum that is used by Len
Finally it is usually required to store expressions of different types in one collection, hence the Arg enum in the first place. In order to solve this, an additional GeneralExpression enum is built to give an uniform interface:

enum GeneralExpression {
    Bool(BoolExpression),
    Integer(IntegerExpression),
    Str(StringExpression),
}

impl Expression<ExpValue> for GeneralExpression {
    fn evaluate(&mut self) -> ExpValue {
        match self {
            GeneralExpression::Bool(inner) => inner.evaluate().into(),
            ...
        }
    }
}

All in all this is a lot of boilerplate to write but I think its worth it. The static typing removes as many unnecessary computations as possible, so this should be pretty fast for runtime evaluated dynamic expressions. In addition, one is free to use a concrete typed expression when ever required, e.g. when implementing an IfClause.
A lot of the boilerplate could be generated by macros. I'm planing to write a crate for it when I have more time to spend.

2 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.