Callback functions, context pointers

I’m writing an argument parser (don’t worry, this won’t become Yet Another Argument Parser on crates.io – it’s just for learning Rust).

The C++ library I’m trying to mimic (primarily interface-wise, not necessarily code-wise) uses a ‘Spec’ class as a base class from which switch/argument-specific classes are derived. The base class has two virtual functions which are meant to be overridden. One is called “proc_opt” and one is called “proc_arg”. The “opt” version doesn’t accept any arguments (i.e. it is only used for command line switches without arguments), and the “arg” version receives a vector of strings containing the arguments to the option.

To illustrate:

class HelpSwitch : public Args::Spec
{
public:
  HelpSwitch() : Args::Spec('h', "help") {
  }
  void proc_opt() override {
    // process -h, --help
  }
};

class FileOpt : public Args::Spec
{
public:
  FileOpt() : Args::Spec('f', "file") {
    nargs_ = 1;
  }
  void proc_arg(const std::vector<std::string>& args) override {
    // process -f FILE, --file=FILE
  }
};

So if a program is called using --help the first proc_opt will be called, and if a program is called using --file=foo.txt then proc_arg will be called and args[0] will be “foo.txt”.

This clearly isn’t the whole story though. There’s a driver object which is created from an “Args::Parser” class. The application creates a Parser object, then adds these Spec objects to it. One crucial part omitted from the prototypes above is that the parser also keeps track of an application context pointer which is passed to the handler functions. I.e.:

MyContext myctx;
Args::Parser prsr(&myctx);
// .. add Spec objects to prsr ..
prsr.parse(argc, argv);

When the parser is run it’ll pass the context pointer to the handlers, i.e. to modify the earlier examples:

  // ...
  void proc_opt(void *ptr) override {
    MyContext *ctx = (MyContext*)ptr;
    ctx->do_help = true;
  }
  // ...
  void proc_arg(void *ptr, const std::vector<std::string>& args) override {
    MyContext *ctx = (MyContext*)ptr;
    ctx->fname = args[0];
  }
  // ...

I have two questions relating to this.

First, the Parser object will obviously use the Spec object’s nargs to determine which callback to call. But how would one go about implementing this callback in Rust? I had a theory that I would use function pointers; they seem easy enough. But say I would like to pass a reference to the Spec object to the callback, then I end up with this conundrum:

  pub struct Spec {
    opt: Option<char>,
    lopt: Option<String>,
    name: Option<String>,
    nargs: usize,
  }

  type SwitchHandler = fn(spec: &Spec);
  type ArgHandler = fn(spec: &Spec, args: Vec<String>);

Now I have two handler “function pointer” types, but I need to place instances of them in the Spec object – but I need Spec in order to create the function pointer type. “That’s easy, I’ll just forward declare Spec” I thought, but after a quick google I don’t think there is forward declaration of structs in Rust?

Another way I thought about doing it was using traits; one for the proc_opt and one for proc_arg, but I’m not sure that’s feasible or a good idea.

The second question relates to the application-specific context buffer. What would be an idiomatic way to pass an application-specific buffer from the application to the parser object and then on to the “proc_opt” and “proc_arg” functions? In C++ I just use a void pointer, and the application casts it to its own type in its handlers. On this one I’m completely stumped with regards to Rust. Again I was looking at using traits somehow, but traits are about methods and the context buffers are typically data-only, so it feels like going down the wrong path.

Rust is order-independent. It doesn’t have forward declarations because it doesn’t need them.

A Spec can hold functions that take Spec, just try it. Though this is a bit of an unusual approach; normally one would write a trait containing those functions as methods, and implement it for a type created on the spot. (Especially when there are multiple methods; then doing it that way allows the callbacks to share mutable state).
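To make "just try it" concrete, here is a minimal sketch of a Spec that stores function pointers whose types mention Spec itself. The field names `on_switch` and `on_arg`, the two `describe_*` handlers, and the `dispatch` helper are made up for illustration, and the handlers return a String (the ones in the question return nothing) only so the result is easy to inspect.

```rust
// Hypothetical sketch: `on_switch`, `on_arg`, `describe_*` and `dispatch`
// are illustrative names, not part of the original post.
pub struct Spec {
    pub opt: Option<char>,
    pub lopt: Option<String>,
    pub nargs: usize,
    // These fields use type aliases that are declared further down.
    pub on_switch: Option<SwitchHandler>,
    pub on_arg: Option<ArgHandler>,
}

// The aliases mention `Spec` and `Spec` mentions the aliases. Item order
// does not matter in Rust, so no forward declaration is needed.
pub type SwitchHandler = fn(&Spec) -> String;
pub type ArgHandler = fn(&Spec, &[String]) -> String;

pub fn describe_switch(spec: &Spec) -> String {
    format!("switch -{}", spec.opt.unwrap_or('?'))
}

pub fn describe_arg(spec: &Spec, args: &[String]) -> String {
    format!("--{} = {}", spec.lopt.as_deref().unwrap_or("?"), args.join(","))
}

/// Pick the callback based on `nargs`, as the question describes.
pub fn dispatch(spec: &Spec, args: &[String]) -> Option<String> {
    if spec.nargs == 0 {
        spec.on_switch.map(|f| f(spec))
    } else {
        spec.on_arg.map(|f| f(spec, args))
    }
}
```

A Spec built with `on_switch: Some(describe_switch)` and `nargs: 0` would go through the first branch of `dispatch`; one with `on_arg: Some(describe_arg)` and `nargs: 1` would go through the second.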

I’m not 100% sure what the second question is asking, but I’m going to guess that it’s what I just mentioned about allowing the functions to share state (or “context”). (Because I know void * must be used as a horrible workaround for this when writing e.g. a sort comparator in C.)

If you have a trait implemented on a type, then that type can hold its own data, plain and simple.
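A rough sketch of that idea, assuming a trait named `Spec` with the two methods from the question (the `fname` and `seen` fields are invented for the example): the implementing type owns the state, and both callbacks can mutate it through `&mut self`.

```rust
// Hypothetical trait standing in for the C++ `Spec` base class.
pub trait Spec {
    // Default bodies play the role of the non-overridden virtual functions.
    fn proc_opt(&mut self) {}
    fn proc_arg(&mut self, _args: &[String]) {}
}

// The handler type carries its own data, so no `void *` is involved.
#[derive(Default)]
pub struct FileOpt {
    pub fname: Option<String>,
    pub seen: usize,
}

impl Spec for FileOpt {
    fn proc_opt(&mut self) {
        // Both methods share the same mutable state.
        self.seen += 1;
    }

    fn proc_arg(&mut self, args: &[String]) {
        self.fname = args.first().cloned();
        self.seen += 1;
    }
}
```

After parsing, the application reads the results straight from the `FileOpt` value instead of casting a context pointer.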


What would be an idiomatic way to pass an application-specific buffer from the application to the parser object and then on to the “proc_opt” and “proc_arg” functions? In C++ I just use a void pointer, and the application casts it to its own type in its handlers.

I agree with @ExpHP that it’s preferable to find another way that keeps compile-time safety. However, if you need to make type decisions at runtime, the standard library has the Any trait in its std::any module:

https://doc.rust-lang.org/std/any/
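For reference, a small sketch of what the Any route could look like. The free function `proc_opt` and its bool return value are assumptions made for the example; the point is that `downcast_mut` is a checked version of the C++ cast and yields `None` when the type does not match.

```rust
use std::any::Any;

// Application-defined context.
pub struct MyContext {
    pub do_help: bool,
}

// Hypothetical handler: the library would only ever see `&mut dyn Any`.
// Returns false when the context is not the type the handler expects.
pub fn proc_opt(ctx: &mut dyn Any) -> bool {
    match ctx.downcast_mut::<MyContext>() {
        Some(ctx) => {
            ctx.do_help = true;
            true
        }
        None => false,
    }
}
```

Compared with a generic context parameter, the type mismatch is only detected at runtime, which is why this is usually a fallback.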

One other possibility to consider is whether you can just use normal objects instead of function pointers. The Rust book has a good example with the state pattern:
https://doc.rust-lang.org/book/ch17-03-oo-design-patterns.html


What I’m trying to figure out is a way to solve what that horrible void * solves in C/C++. :)

Specifically, the issue is that the application will define its own “args context” (a struct, probably), but my library needs to store that context in its own context, and each time an argument callback is called the args context needs to be passed to callback functions so they can put the parsed arguments into the context. Once the parsing is done the application will use the data populated in the args context.

In the C++ code I simply take in a void *, and the application callback, since it’s the application’s own code, knows what type to cast that void * to:

// Application defined context
struct MyContext {
  bool do_help;
};

// Application creates context
MyContext ctx;
Parser prsr(argc, argv, (void*)&ctx);
prsr.parse();

// Application defined callback function
void HelpSwitch::proc_opt(void *arg) {
  auto ctx = reinterpret_cast<MyContext*>(arg);
  ctx->do_help = true;
}

I.e. the problem is essentially that my library will sit in between two parts of the application (the caller and the callback), both of which need the “arg context”, so my library needs to take it in (to the parser), store it, or a reference to it, and be able to pass it to the callbacks. (And later, once the parsing is complete, return it to the application.)

Unfortunately I only realize as much as “I’m thinking about this the wrong (C/C++) way”, but I’m too new to Rust to see a solution to it.

The preferable way I would like to solve it is with compile time safety, but I’m unsure if it’s possible; my library needs to hold the args context, and it can’t know what type it is since it’s defined in the application.

An earlier version of the parser worked very differently; each argument specification had information about its data type, how many arguments it could hold etc. This was easier in Rust, because the proc’s and all the data were internal to the library. But I realized this has a bunch of drawbacks; it didn’t allow the application the same level of flexibility. For instance if the application wanted an argument to be an enum the library would need to be changed to support this. Callbacks allow the application the freedom to determine how the data is handled – but it seems to have made it more complicated in Rust.

The any type looks promising; I’ll try to go that route and see where it leads me.

Okay, so here’s something to give you an idea of how to progress. In order to write this I had to guess some of the missing details (e.g. when is HelpSwitch created, how does it get associated with --help, and how/when do you add it to the parser).

I’ll just make a note upfront that it didn’t make any sense to me for Spec to both contain info like opt and lopt while also containing separate callbacks that take &Spec (in particular, it seems to me that the callbacks would need to be highly coupled to the other info, so there’s no point in having a design where the callbacks are interchangeable). So I basically absorbed Spec into the Arg trait.

The types defined in the library are:

pub struct Parser<C> {
    context: C,
    // maps e.g. "-h" to the help arg
    args: HashMap<&'static str, Box<dyn Arg<C>>>,
    ... // other fields needed for parsing ARGV
}

pub trait Arg<C> {
    fn handle(&self, ctx: &mut C, args: Vec<String>);

    fn opt(&self) -> Option<String> { None }
    fn lopt(&self) -> Option<String> { None }
    fn name(&self) -> Option<String> { None }
    fn nargs(&self) -> usize { 0 }
}

impl<C> Parser<C> {
    pub fn new(argv: Vec<String>, context: C) -> Self;

    /// Register a new parseable option.
    pub fn arg<A: Arg<C> + Clone>(&mut self, arg: A);

    /// Parse and recover the modified context
    pub fn parse(self) -> C;
}

Here’s how the application uses it:

// Application defined context
struct MyContext {
    do_help: bool,
}

// Application creates default context
let argv: Vec<String> = { ... };
let initial_ctx = MyContext { do_help: false };
let mut parser = Parser::new(argv, initial_ctx);

// Application registers all of its arguments and calls parse
parser.arg(HelpSwitch);
let final_ctx = parser.parse();

// Application-defined callback function
#[derive(Clone)]
struct HelpSwitch;

impl Arg<MyContext> for HelpSwitch {
    fn opt(&self) -> Option<String> { Some("-h".into()) }
    fn lopt(&self) -> Option<String> { Some("--help".into()) }
    fn handle(&self, ctx: &mut MyContext, _args: Vec<String>) {
        ctx.do_help = true;
    }
}
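Since the signatures above have no bodies, here is one way they might be filled in so the whole thing compiles. Everything inside the method bodies is my assumption: the map is keyed by String instead of &'static str, `arg` needs a `'static` bound to be boxed, and the parsing rule is deliberately naive (a known option consumes the next `nargs` words, unknown words are skipped, `--file=FILE` is not handled). `FileOpt` and `parse_args` are extra names added for the demonstration.

```rust
use std::collections::HashMap;

// Same shape as the `Arg` trait above (`name` omitted for brevity).
pub trait Arg<C> {
    fn handle(&self, ctx: &mut C, args: Vec<String>);
    fn opt(&self) -> Option<String> { None }
    fn lopt(&self) -> Option<String> { None }
    fn nargs(&self) -> usize { 0 }
}

pub struct Parser<C> {
    context: C,
    argv: Vec<String>,
    // Assumption: owned `String` keys, because `opt`/`lopt` return `String`.
    args: HashMap<String, Box<dyn Arg<C>>>,
}

impl<C> Parser<C> {
    pub fn new(argv: Vec<String>, context: C) -> Self {
        Parser { context, argv, args: HashMap::new() }
    }

    /// Register one boxed clone of `arg` under each of its spellings.
    pub fn arg<A: Arg<C> + Clone + 'static>(&mut self, arg: A) {
        for key in arg.opt().into_iter().chain(arg.lopt()) {
            self.args.insert(key, Box::new(arg.clone()));
        }
    }

    /// Assumed rule: a known option consumes the next `nargs` words.
    pub fn parse(self) -> C {
        let Parser { mut context, argv, args } = self;
        let mut words = argv.into_iter();
        while let Some(word) = words.next() {
            if let Some(arg) = args.get(&word) {
                let values: Vec<String> = words.by_ref().take(arg.nargs()).collect();
                arg.handle(&mut context, values);
            }
        }
        context
    }
}

// Application side.
pub struct MyContext {
    pub do_help: bool,
    pub fname: Option<String>,
}

#[derive(Clone)]
pub struct HelpSwitch;

impl Arg<MyContext> for HelpSwitch {
    fn opt(&self) -> Option<String> { Some("-h".into()) }
    fn lopt(&self) -> Option<String> { Some("--help".into()) }
    fn handle(&self, ctx: &mut MyContext, _args: Vec<String>) {
        ctx.do_help = true;
    }
}

#[derive(Clone)]
pub struct FileOpt; // hypothetical second option, taking one argument

impl Arg<MyContext> for FileOpt {
    fn opt(&self) -> Option<String> { Some("-f".into()) }
    fn lopt(&self) -> Option<String> { Some("--file".into()) }
    fn nargs(&self) -> usize { 1 }
    fn handle(&self, ctx: &mut MyContext, args: Vec<String>) {
        ctx.fname = args.into_iter().next();
    }
}

/// Build a parser, register both options, and return the filled-in context.
pub fn parse_args(argv: Vec<String>) -> MyContext {
    let mut parser = Parser::new(argv, MyContext { do_help: false, fname: None });
    parser.arg(HelpSwitch);
    parser.arg(FileOpt);
    parser.parse()
}
```

The library never names MyContext; the compiler checks each handler against the concrete C chosen by the application, which is the compile-time-safe replacement for the void * cast.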

Notes

  • In Box<dyn Arg<C>>:

    • dyn Trait is what we call a trait object (Google that!); it’s how we do dynamic dispatch.
    • Box<_> is basically std::unique_ptr; see the std::boxed::Box documentation.
  • A dedicated callback like SwitchHandler is not needed. A length-zero Vec in Rust does not allocate memory, so it is super cheap to make.

  • I chose to take context by value and return it from parse (rather than taking it by reference) in order to avoid introducing a lifetime. It’s nice to avoid lifetimes while you can, especially early on while you still have so many other parts of the language that you need to learn.

  • It feels to me like the context is not strictly necessary; we could alternatively write struct HelpSwitch { do_help: bool }, and then define handle to take &mut self with no context. However, it would be trickier to write the implementation of Parser, in particular because (1) arg() would need to take &mut A, thus introducing lifetimes, and (2) naively attempting to store the same &mut A under both -h and --help would result in aliasing, which is forbidden by the borrow checker (we have techniques to deal with this, but let’s save it for later!).

  • It is more idiomatic for Parser::new to take a generic type for argv, like

    impl<C> Parser<C> {
        pub fn new<I, S>(argv: I, context: C) -> Self
        where I: IntoIterator<Item=S>, S: AsRef<str>;
    }
    

    Not only would this accept iterators like std::env::args() as input, but it would also support various forms of borrowed values (e.g. &[String], Vec<&str>, &[&str]). However, for now I wanted to keep it simple.
    (note: Arg::handle does have to take Vec<String>; we can’t use a generic function because we are doing dynamic dispatch)

  • Similarly, Parser::parse should almost certainly return Result<C, SomeErrorType>; but error handling is a big topic in its own right!

  • Just a small design nit: I probably would take argv in Parser::parse rather than in Parser::new, and keep all of the parsing state out of Parser (in fact it should probably be renamed to something else, like Config or Options). Parser::parse could internally construct a private type that does the parsing.
    The advantage is that this would make it easier to decouple the code that configures the parser from the code that acquires the command line arguments.
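The last three notes (generic argv, a Result return type, and argv passed to parse) could be combined roughly as follows. This is only a sketch under my own assumptions: `Options`, `switch`, `ParseError` and `set_help` are hypothetical names, and plain `fn(&mut C)` pointers stand in for the Arg trait to keep it short.

```rust
// Hypothetical error type; a real one would likely have more variants.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    UnknownOption(String),
}

// Configuration only: no argv and no parsing state stored here.
pub struct Options<C> {
    context: C,
    switches: Vec<(String, fn(&mut C))>,
}

impl<C> Options<C> {
    pub fn new(context: C) -> Self {
        Options { context, switches: Vec::new() }
    }

    /// Register a switch by its spelling, e.g. "--help".
    pub fn switch(&mut self, name: &str, handler: fn(&mut C)) {
        self.switches.push((name.to_string(), handler));
    }

    /// Accepts anything iterable whose items can be viewed as `&str`.
    pub fn parse<I, S>(self, argv: I) -> Result<C, ParseError>
    where
        I: IntoIterator<Item = S>,
        S: AsRef<str>,
    {
        let Options { mut context, switches } = self;
        for word in argv {
            let word = word.as_ref();
            match switches.iter().find(|(name, _)| name.as_str() == word) {
                Some((_, handler)) => handler(&mut context),
                None => return Err(ParseError::UnknownOption(word.to_string())),
            }
        }
        Ok(context)
    }
}

pub struct MyContext {
    pub do_help: bool,
}

pub fn set_help(ctx: &mut MyContext) {
    ctx.do_help = true;
}
```

With this shape the application could call `opts.parse(std::env::args().skip(1))` in production and `opts.parse(vec!["--help"])` in tests, since both String and &str implement AsRef<str>.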


@ExpHP,

I probably should have written a lot more about the design, but it doesn’t really matter because the method you outlined solved my problem very nicely.

The thought to use generics hadn’t even occurred to me.

Your “Notes” section is a gold-mine of information – thank you!