Code generation for microcontrollers

In the past I have done code generation for microcontrollers using the following approach:

  1. Write template files for an algorithm. These templates are basically C code but contain keywords that will be replaced in step 2 (e.g., data-dependent parameters)
  2. For a specific dataset, generate C code for a tailored algorithm. Using Python, I generate values for the keywords from the dataset, and replace them in the template files. The dataset is also part of the generated code (say, I generate a .c file containing a C array from a numpy array).
  3. Cross-compile/link the generated C code, and deploy into an MCU

I would like to generate no_std Rust code (instead of C), ideally using Rust (instead of Python). The generated code consists of algorithms (mostly multiplication of small dense matrices, for which I could use ndarray/nalgebra, or roll my own implementation), with some data-specific parts. The approach I have followed so far would work using Rust as well. My question is:
Is there a better way? Should I invest time learning Rust macros or other tools (does that make sense in my particular case)?

My question is similar to this one (it's from 3 years ago, so an updated perspective would be nice), but also more specific.

Rust macros are a convenient way to, from within a Rust program, ask for code generation incorporated into that program. If you have a tool that is generating the entire program, then there is much less reason to use macros.

However, some of the tools used for Rust macros may be useful for your code generation. For example, quote can build up programs from fragments, while ensuring that there are no lexical errors of the sort string concatenation can produce.

It also might be possible to fit your entire code generator into a macro — to write a macro so that src/main.rs can be just

generator::generate!("parameters.csv");

but that approach should be chosen only if you want the user-interface to your generation tool to be that. (Another place to put code generation is in the build script.)
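
As a rough sketch of the build-script option (one possible shape, not a definitive implementation): the script below renders a dataset as the source text of a Rust const array and writes it to a file. The names render_const_array, WEIGHTS and generated.rs are made up for illustration, and the values stand in for a real dataset.

```rust
// build.rs (hypothetical sketch)
use std::{env, fs, io, path::PathBuf};

/// Renders `values` as the source text of a `pub const` array item.
/// `{:?}` is used so that 1.0 is printed as `1.0` rather than `1`,
/// which keeps every element a valid float literal.
fn render_const_array(name: &str, values: &[f32]) -> String {
    let body: Vec<String> = values.iter().map(|v| format!("{:?}", v)).collect();
    format!(
        "pub const {}: [f32; {}] = [{}];\n",
        name,
        values.len(),
        body.join(", ")
    )
}

fn main() -> io::Result<()> {
    // Cargo sets OUT_DIR for build scripts; fall back to the temp dir so
    // this sketch also runs as a plain program.
    let out_dir = env::var_os("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(env::temp_dir);

    // Placeholder data; a real generator would read the dataset here.
    let source = render_const_array("WEIGHTS", &[1.0, 2.5, -0.25]);
    fs::write(out_dir.join("generated.rs"), source)
}
```

The crate itself could then pull the file in with include!(concat!(env!("OUT_DIR"), "/generated.rs")). Since the generated item is a plain const array, nothing in it depends on std.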


I would also recommend that you do not take a template source file and substitute values into it, but instead keep things modular:

  • The generator generates from scratch the absolute simplest possible code for items (fn, const, macro_rules) that are dependent on the dataset.

  • The non-data-dependent code is in a published library crate.

  • The generated package's src/main.rs is a minimal stub that puts those together,

    mod generated;  // src/generated.rs was produced by the generator
    fn main() {
        library::main(generated::SomeType);
    }
    

    where SomeType is a generated type that implements a trait whose functions are the data-specific numerical functions your generator generated. (Or in other kinds of problems, the thing to pass to library::main might be a static variable containing a data table.)

This way, as much of the code as possible is normal Rust code in a library, which can be maintained using all of the usual tooling, and problems which involve tinkering with the generator and rerunning it are kept to a minimum.
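
To make the split concrete, here is a single-file sketch in which two modules stand in for the library crate and the generated file. The trait name Kernel, the function library::run (playing the role of library::main above) and the 2x2 matrix are assumptions for illustration only.

```rust
// Stand-in for the published library crate: ordinary hand-written Rust.
mod library {
    /// Hypothetical trait whose functions are the data-specific parts.
    pub trait Kernel {
        fn apply(&self, x: [f32; 2]) -> [f32; 2];
    }

    /// Data-independent driver; generic over whatever the generator emits.
    pub fn run<K: Kernel>(kernel: K, input: [f32; 2]) -> [f32; 2] {
        kernel.apply(input)
    }
}

// Stand-in for src/generated.rs: the simplest code that captures the data.
mod generated {
    use super::library::Kernel;

    /// Carries no data itself; the dataset is baked into the impl.
    pub struct SomeType;

    impl Kernel for SomeType {
        fn apply(&self, x: [f32; 2]) -> [f32; 2] {
            // Product with the (made-up) matrix [[1, 2], [3, 4]].
            [1.0 * x[0] + 2.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]
        }
    }
}

fn main() {
    let y = library::run(generated::SomeType, [1.0, 1.0]);
    println!("{:?}", y);
}
```

Everything in the library module can be tested and type-checked without ever running the generator; only the small impl block changes from dataset to dataset.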

Awesome, thanks. What you recommend makes a lot of sense. A specific question I still have is regarding the first point

The generator generates from scratch the absolute simplest possible code for items (fn, const, macro_rules) that are dependent on the dataset.

Just for the sake of my understanding, consider how I used to do things in the C/Python scenario. To keep the compiled code size as small as possible, I basically generated code without branches (e.g., no if-else) using conditional compilation. In C, it was relatively straightforward using #ifdef. That is, depending on the special structure of the data set, I could generate a few #defines, and this would compile only the code relevant to handle the data set in question.
In Rust, this could be roughly replicated using features, right? That is, instead of the 3 points you suggested, the process would be something like this

  1. The generator generates from scratch only static data plus some minimum code, and a Cargo.toml
  2. I publish a crate that contains the full code to handle all scenarios. However, this crate relies heavily on features for conditional compilation.
  3. The generated package's Cargo.toml from 1. only uses the features of the published crate from 2. that are relevant for the data from 1.
  4. The generated package's src/main.rs is a minimal stub that puts 1 and 2 together

what would be the disadvantages of doing it this way?

It may be worth at least investigating the ease and output quality you get by using Rust traits and generics, since they seem to logically match up exactly to replacing parts of a template, and the result is a bit nicer to actually edit. They get talked about mostly in terms of being able to swap out methods, but you can also use associated constants.

If you need to generate and use fancy long byte arrays for those constants, you can use include_bytes! to include data on disk, or even write your own procedural macro to instead generate the values on the fly.
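
A minimal sketch of the associated-constant idea, assuming a hypothetical Dataset trait and a made-up 2x3 matrix; the generic function only touches slices and iterators, so it should carry over to a no_std crate.

```rust
/// Hypothetical trait: everything data-dependent is an associated constant.
trait Dataset {
    const ROWS: usize;
    const COLS: usize;
    /// Row-major matrix entries, ROWS * COLS values long.
    const MATRIX: &'static [f32];
}

/// The part a generator would emit for one particular dataset.
struct Generated;

impl Dataset for Generated {
    const ROWS: usize = 2;
    const COLS: usize = 3;
    const MATRIX: &'static [f32] = &[1.0, 0.0, 2.0, 0.0, 1.0, 3.0];
}

/// Generic matrix-vector product y = M x, written once for all datasets.
fn mat_vec<D: Dataset>(x: &[f32], y: &mut [f32]) {
    for r in 0..D::ROWS {
        let row = &D::MATRIX[r * D::COLS..(r + 1) * D::COLS];
        y[r] = row.iter().zip(x).map(|(m, v)| m * v).sum();
    }
}

fn main() {
    let mut y = [0.0f32; 2];
    mat_vec::<Generated>(&[1.0, 1.0, 1.0], &mut y);
    println!("{:?}", y);
}
```

Because ROWS and COLS are constants of the type, each instantiation of mat_vec sees fixed loop bounds, which gives the optimizer the same information a substituted template would.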

Just for the sake of my understanding, consider how I used to do things in the C/Python scenario. To keep the compiled code size as small as possible, I basically generated code without branches (e.g., no if-else) using conditional compilation. In C, it was relatively straightforward using #ifdef. That is, depending on the special structure of the data set, I could generate a few #defines, and this would compile only the code relevant to handle the data set in question.
In Rust, this could be roughly replicated using features, right?

Yes, you could use features to omit code.

However, before reaching for any kind of conditional compilation — whether it is features, macros, or code generation — to do this, you should try not worrying about it. Idiomatic Rust optimization is heavily focused on letting the compiler perform inlining and optimizing the code after inlining — so a perfectly good way to express “in this case, do nothing” is to define a function (such as one of the trait functions I proposed in my previous post) whose body is empty, and let the compiler inline that function call and turn it into zero actual machine instructions. Similarly, a type which has no fields occupies no memory and costs nothing to create.
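
A small sketch of that "do nothing" case, with hypothetical names (Bias, NoBias, ConstBias, scale_then_bias). Whether the empty call really disappears is up to the optimizer, so it is worth checking the release binary rather than assuming it.

```rust
use core::mem::size_of;

/// Hypothetical data-specific step that some datasets need and others do not.
trait Bias {
    fn add_to(&self, y: &mut [f32; 2]);
}

/// "In this case, do nothing": no fields and an empty body.
struct NoBias;

impl Bias for NoBias {
    #[inline]
    fn add_to(&self, _y: &mut [f32; 2]) {}
}

/// The case where the dataset does carry a bias vector.
struct ConstBias([f32; 2]);

impl Bias for ConstBias {
    fn add_to(&self, y: &mut [f32; 2]) {
        y[0] += self.0[0];
        y[1] += self.0[1];
    }
}

/// Generic library code; it contains no branch on which case applies.
fn scale_then_bias<B: Bias>(bias: &B, x: [f32; 2]) -> [f32; 2] {
    let mut y = [2.0 * x[0], 2.0 * x[1]];
    bias.add_to(&mut y);
    y
}

fn main() {
    println!("size of NoBias: {}", size_of::<NoBias>());
    println!("{:?}", scale_then_bias(&NoBias, [1.0, 2.0]));
    println!("{:?}", scale_then_bias(&ConstBias([0.5, 0.5]), [1.0, 2.0]));
}
```

Each choice of B gets its own monomorphized copy of scale_then_bias, so a firmware image that only ever uses NoBias contains no bias code path to begin with.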

what would be the disadvantages of doing it this way?

It is very easy to, when editing a library that uses features, make a change which causes it to not compile under some particular combination of features, and not notice. If you had heavy use of features, you would want to write a test script which builds and tests the library under many combinations of features.

If you instead write ordinary generic code, the compiler can and will check that it is valid for all possible instantiations (though of course, that does not prevent there possibly being a bug that is detectable at run time only).

I am unfortunately not yet proficient in writing code using traits/generics. But in principle, in my particular case, it could be possible to write most of my code using traits and generic types. The idea would be as follows:

  1. I have implemented my algorithm fully, and it relies on traits/generics to cover all special cases related to the data. This is a library crate.
  2. Depending on the data, my code generator outputs code that uses a specific type.
  3. The compiler does the rest, i.e., it generates only the necessary code of the algorithm/library (1.) for the particular data type in the generated code (2.)

It'll be something conceptually similar to this minimal implementation in the playground.

Am I on the right track?

To be explicit about my suggestion, you could also use something like:

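trait Data {
  // Data-dependent bytes, provided per type as an associated constant.
  const DATA: &'static [u8];
}

impl Data for i32 {
  // Embedded at compile time; the path is relative to this source file.
  const DATA: &'static [u8] = include_bytes!("i32-data-bytes");
}

fn compute<T: Data>() {
  // do_something_with is a placeholder for the real computation.
  do_something_with(T::DATA)
}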

and compute() would also be inlined wherever the compiler finds that reasonable.

Ok, I think I get what you mean. That’s quite awesome! I guess I just have to experiment a little bit using some simple scenarios and see how far I can get.

Now that you mentioned “associated” constants in traits, is this somewhat related to GATs? I don’t want to derail the thread, but will understanding GATs help me write “better” code (i.e., easier to maintain, more idiomatic, etc.) in my specific case of code generation? If so, what could be a good example in this particular case?

They're related: associated items include types, and GATs are generic associated types, e.g. Foo::Bar<T> in:

trait Foo {
  type Bar<T>;
}

although in the examples I have seen, they are usually used for lifetime parameters, e.g.

trait BorrowIterator {
  // The `where Self: 'a` bound is required by the compiler for this pattern.
  type Item<'a> where Self: 'a;

  fn next<'a>(&'a mut self) -> Self::Item<'a>;
}

is an iterator where the item type can borrow references from the iterator itself, unlike the current Iterator trait, where the Item associated type can't get at the lifetime of the self reference.
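
For a runnable illustration, here is a sketch of such a borrowing iterator over overlapping mutable windows of a buffer. WindowsMut and prefix_sums are hypothetical names, and next returns an Option here (unlike the trait above) so that iteration can end.

```rust
trait BorrowIterator {
    type Item<'a>
    where
        Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

/// Yields overlapping mutable windows, which the standard Iterator trait
/// cannot express because each item borrows from the iterator itself.
struct WindowsMut<'buf> {
    buf: &'buf mut [i32],
    start: usize,
    len: usize,
}

impl<'buf> BorrowIterator for WindowsMut<'buf> {
    type Item<'a> = &'a mut [i32]
    where
        Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        let end = self.start + self.len;
        if end > self.buf.len() {
            return None;
        }
        let window = &mut self.buf[self.start..end];
        self.start += 1;
        Some(window)
    }
}

/// Turns `data` into its running sums using windows of length 2.
fn prefix_sums(data: &mut [i32]) {
    let mut windows = WindowsMut { buf: data, start: 0, len: 2 };
    while let Some(w) = windows.next() {
        w[1] += w[0];
    }
}

fn main() {
    let mut data = [1, 2, 3, 4];
    prefix_sums(&mut data);
    println!("{:?}", data);
}
```

Each window has to be dropped before the next call to next, which is exactly the restriction the lifetime parameter on Item encodes.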

GATs are new enough (on stable) that I've not yet used them really at all, so I don't have a good feel on where you can use them other than those examples, but certainly I've heard they can be used for some really fancy stuff.

That isn't quite what I meant. I was suggesting your code generator could generate a type (possibly one carrying no data itself) and accompanying trait implementation (whose functions can have whatever bodies the generator wants).

However, that's certainly not something that is necessary. And if you can solve your original problem with only “canned” types and impls from the library (as in your playground example), you should. I'm saying that if you want to generate arbitrary code, then a good way to do that is to put it in a trait impl so that it can be used by generic library code, which can be statically checked independent of the generated code.
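
For comparison, the “canned types” route might look roughly like the sketch below: the library ships a few structure types, and the generated file only picks one and fills in numbers. Structure, Dense, Diagonal, step and MODEL are hypothetical names, and the values are placeholders.

```rust
// Stand-in for the library crate: all impls are hand-written and checked once.
mod library {
    pub trait Structure<const N: usize> {
        fn mul_vec(&self, x: &[f32; N]) -> [f32; N];
    }

    /// General case: full N x N matrix, stored row by row.
    pub struct Dense<const N: usize>(pub [[f32; N]; N]);

    /// Special case: only the diagonal is stored and multiplied.
    pub struct Diagonal<const N: usize>(pub [f32; N]);

    impl<const N: usize> Structure<N> for Dense<N> {
        fn mul_vec(&self, x: &[f32; N]) -> [f32; N] {
            let mut y = [0.0f32; N];
            for (yi, row) in y.iter_mut().zip(&self.0) {
                *yi = row.iter().zip(x).map(|(a, b)| a * b).sum();
            }
            y
        }
    }

    impl<const N: usize> Structure<N> for Diagonal<N> {
        fn mul_vec(&self, x: &[f32; N]) -> [f32; N] {
            let mut y = [0.0f32; N];
            for i in 0..N {
                y[i] = self.0[i] * x[i];
            }
            y
        }
    }

    /// The algorithm proper, generic over the structure of the data.
    pub fn step<S: Structure<N>, const N: usize>(s: &S, x: &[f32; N]) -> [f32; N] {
        s.mul_vec(x)
    }
}

// Stand-in for the generated file: a type choice plus static data, no logic.
mod generated {
    use super::library::Diagonal;

    pub static MODEL: Diagonal<2> = Diagonal([2.0, 3.0]);
}

fn main() {
    println!("{:?}", library::step(&generated::MODEL, &[1.0, 1.0]));
    let dense = library::Dense([[1.0, 2.0], [3.0, 4.0]]);
    println!("{:?}", library::step(&dense, &[1.0, 1.0]));
}
```

A program that only ever names Diagonal<2> should get just that instantiation of step compiled in, which is the effect the #ifdef approach was after.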

Thanks a lot for your input and time. It really enlightened me. I will try to implement something simple using your suggestions. I might come back to this thread in a few weeks in case I have more questions.
Cheers!
