Generating constants and optimized code at compile-time

I'm using Rust on a bare-metal microcontroller and am having trouble generating optimal code at compile-time.
I'll explain a simplified toy problem and the shortcomings of the approaches I've been able to come up with.
I'm curious to hear what y'all think.

Toy problem

I have 4 sensors (a--d), each connected to 1 pin of one of two GPIO "ports" (E or F):

Sensor a on E3
Sensor b on E5
Sensor c on F1
Sensor d on F0

(The sensors are connected to pins based on PCB layout constraints, which is why they seem arbitrary from the software side.)

A pin is set as input or output according to a bit in a gpio register.
Since I want my sensor pins to be inputs and the other pins to be outputs, I need to write:

device.gpioe.direction = 0b0010_1000;
device.gpiof.direction = 0b0000_0011;

(I'm pretending I'm on an 8-bit machine here, but I'm actually using 32-bit microcontrollers.)

Once configured as inputs, I can read the current sensor states by reading the ports:

device.gpioe.read(); //returns bits 0bxxbx_axxx
device.gpiof.read(); //returns bits 0bxxxx_xxcd

(where x is a bit I don't care about.)

I need to combine these two words into a single output word: 0bxxxx_dcba.

So, to be clear, at compile time I know the sensor -> port+pin mapping and would like to generate at compile time:

  1. the direction words for each port
  2. an optimal (e.g., smallest code size or fastest runtime) program that maps the two words read from the ports into a single "output word" with the sensor state bits in the right positions.
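For reference, here is a hand-written version of what I'd like generated for this toy board. It is only a sketch: the constant names and the plain u8 arguments are stand-ins for the real register accesses, and ports_to_output_word is just the obvious shift-and-mask version, not necessarily the optimal one.

```rust
// Hand-written target for the toy board (a = E3, b = E5, c = F1, d = F0).
// The names and the use of plain u8 values are assumptions for illustration.

/// Direction words: a 1 bit marks a sensor (input) pin.
pub const GPIOE_DIRECTION: u8 = (1 << 3) | (1 << 5);
pub const GPIOF_DIRECTION: u8 = (1 << 1) | (1 << 0);

/// Maps the raw port reads into the output word 0bxxxx_dcba.
pub fn ports_to_output_word(port_e: u8, port_f: u8) -> u8 {
    let a = (port_e >> 3) & 1;
    let b = (port_e >> 5) & 1;
    let c = (port_f >> 1) & 1;
    let d = port_f & 1;
    (d << 3) | (c << 2) | (b << 1) | a
}
```

Everything in this block is derived from the four pin assignments, which is why I'd like to write those assignments down only once.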

Solution requirements

  1. the hardware configuration (sensor -> port+pin mapping) is defined in exactly one place
  2. that configuration is swappable (with cargo feature flags or similar) so I can easily generate multiple hardware-specific binaries (one board might have 6 sensors on 2 ports, one might have 14 on 3 ports, etc.)

Const fn?

This would be nice:

struct Sensor(char);
enum Pin {
    PortE(u8),
    PortF(u8),
}
struct Assignment {
    sensor: Sensor,
    pin: Pin,
}

const ASSIGNMENTS: [Assignment; 4] = [
    Assignment {
        sensor: Sensor('a'),
        pin: Pin::PortE(3),
    },
    Assignment {
        sensor: Sensor('b'),
        pin: Pin::PortE(5),
    },
    Assignment {
        sensor: Sensor('c'),
        pin: Pin::PortF(1),
    },
    Assignment {
        sensor: Sensor('d'),
        pin: Pin::PortF(0),
    },
];

const fn port_e_direction_word() -> u8 {
    let mut word = 0;
    for a in ASSIGNMENTS {
        if let Assignment {
            pin: Pin::PortE(idx),
            ..
        } = a
        {
            word |= 1 << idx;
        }
    }
    word
}

but unfortunately:

error[E0744]: `for` is not allowed in a `const fn`
error[E0658]: `if` is not allowed in a `const fn`

Macros?

I've only used macro_rules! for basic syntactic transformations, and my understanding is that Rust macros operate purely on syntax.
If so, I don't see how they can help with my situation.

If the assignments are declared as above (a const value), then macros can't see them as values, iterate over them, etc.

An alternative could be to encode the assignments within the macro call sites:

macro_rules! generate_direction_word {
  ( $( $idx:expr ),* ) => {
    {
      let mut word = 0;
      $(
         word |= (1 << $idx);
       )*
      word
    }
  }
}

pub fn setup_ports() {
    device.gpioe.direction = generate_direction_word!(3, 5);
    device.gpiof.direction = generate_direction_word!(0, 1);
}

but I see at least two problems with this approach.

First, this impl won't meet the requirement to have the configuration in exactly one place.
(We can't invoke generate_direction_word! with literals written out in our source code, because then the ports_to_output_word macro has no access to them.)

Can we invoke a single macro

define_board_stuff!( (a, gpioe, 3), (b, gpioe, 5), (c, gpiof, 1), (d, gpiof, 0) );

and have that invoke generate_direction_word! for each port AND define the ports_to_output_word function?
Maybe it's possible, but it's not obvious to me how to write it.

The second problem I see with this macro approach is that we are relying on LLVM to notice that the work can be done at compile time.
Our first macro invocation will expand to something like:

device.gpioe.direction = {
  let mut word = 0;
  word |= (1 << 3);
  word |= (1 << 5);
  word
}

which I think LLVM will optimize at compile time into the desired

device.gpioe.direction = 0b0010_1000;

but I'm not aware of any guarantees about this kind of thing.

So even if it is possible to do all this in macros, I worry that:

  • it'll be very complex to reason about and understand later
  • there's a hidden performance cliff where a seemingly minor change in the macro syntax soup or configuration difference (33 sensors on 5 ports?) means we're now accidentally generating code that does everything at runtime.

build.rs

Using build.rs is my top candidate at the moment since it's designed to run arbitrary programs at compile time.
I can write whatever I want --- in Rust or another language --- to read hardware configuration data (straight from the PCB netlist if I want!), do whatever iteration and data processing necessary to generate constants.
For the problem of generating the optimal ports_to_output_word function, I could call out to an SMT solver to do program synthesis / superoptimization.

The big question I have about this approach is how to keep it ergonomic.

Should I just pepper my (run-time) code with things like include!("def_gpio_direction_consts.rs");?
Will it get weird having the run-time half of my code in a different place from the compile-time half?
Would it be better to try and write compile-time code within specially formatted comment-blocks (like some tests/examples) so that domain-specific concerns are grouped together in the source code?
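To make the build.rs idea concrete, here is a minimal sketch of the generator side. The ASSIGNMENTS table, the GPIO{port}_DIRECTION naming, and the gpio_consts.rs file name are all assumptions for illustration; only OUT_DIR and the include!/concat!/env! pattern are standard Cargo and Rust features.

```rust
use std::collections::BTreeMap;
use std::{env, fs, io, path::Path};

/// Hypothetical single source of truth: (sensor, port letter, pin index).
pub const ASSIGNMENTS: &[(char, char, u8)] =
    &[('a', 'E', 3), ('b', 'E', 5), ('c', 'F', 1), ('d', 'F', 0)];

/// Renders Rust source text declaring one direction constant per port.
pub fn generate(assignments: &[(char, char, u8)]) -> String {
    // BTreeMap keeps the ports sorted, so the generated file is deterministic.
    let mut dirs: BTreeMap<char, u32> = BTreeMap::new();
    for &(_sensor, port, pin) in assignments {
        *dirs.entry(port).or_insert(0) |= 1u32 << pin;
    }
    let mut out = String::new();
    for (port, word) in &dirs {
        out.push_str(&format!(
            "pub const GPIO{}_DIRECTION: u32 = {:#010b};\n",
            port, word
        ));
    }
    out
}

/// Writes the generated source into Cargo's OUT_DIR.
/// In build.rs this would be called from main, e.g.
/// `fn main() { write_generated().unwrap(); }`.
pub fn write_generated() -> io::Result<()> {
    let out_dir = env::var("OUT_DIR").expect("OUT_DIR is set by Cargo for build scripts");
    fs::write(
        Path::new(&out_dir).join("gpio_consts.rs"),
        generate(ASSIGNMENTS),
    )
}

// The run-time crate would then pull the constants in with:
// include!(concat!(env!("OUT_DIR"), "/gpio_consts.rs"));
```

Keeping generate as a pure function that returns a String means the code generation can be unit-tested without running a real build.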

Are there any projects (Rust or otherwise) that I should look at for inspiration?

Ifs and loops in const fns have been stabilized and are scheduled for the 1.46 release; you can use them on nightly now without any feature flags. I don’t think for loops work, though, because they rely on iterators; you’ll need to use while or loop and manually update your indexing variable.
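As a rough sketch of that while-loop workaround, assuming Rust 1.46 or later: the types below loosely follow the ones in the question (Sensor is flattened to a char), and direction_words, GPIOE_DIRECTION and GPIOF_DIRECTION are names made up for this example.

```rust
// Hypothetical types, loosely following the question's sketch.
#[derive(Clone, Copy)]
pub enum Pin {
    PortE(u8),
    PortF(u8),
}

pub struct Assignment {
    pub sensor: char,
    pub pin: Pin,
}

pub const ASSIGNMENTS: [Assignment; 4] = [
    Assignment { sensor: 'a', pin: Pin::PortE(3) },
    Assignment { sensor: 'b', pin: Pin::PortE(5) },
    Assignment { sensor: 'c', pin: Pin::PortF(1) },
    Assignment { sensor: 'd', pin: Pin::PortF(0) },
];

/// Returns the (port E, port F) direction words.
/// `for` is not available in a const fn, so the index is advanced by hand.
pub const fn direction_words() -> (u8, u8) {
    let mut e = 0u8;
    let mut f = 0u8;
    let mut i = 0;
    while i < ASSIGNMENTS.len() {
        match ASSIGNMENTS[i].pin {
            Pin::PortE(idx) => e |= 1 << idx,
            Pin::PortF(idx) => f |= 1 << idx,
        }
        i += 1;
    }
    (e, f)
}

// Assigning to a const forces evaluation at compile time.
pub const GPIOE_DIRECTION: u8 = direction_words().0;
pub const GPIOF_DIRECTION: u8 = direction_words().1;
```

A pin index of 8 or more would overflow the shift during constant evaluation, so a bad entry in the table should show up as a compile error rather than a wrong register value.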

The macro approach can also work; I’d approach it along these lines:


macro_rules! def_port_cfg {
    ( $( $port:ident { $( $sensor:ident ( $pin:tt => $out:tt )),*})*) =>
    {
        mod port_cfg {
            $(
                 pub mod $port {
                     pub const DIR:u32 = 0 $( |(1 << $pin))*;
                     #[inline]
                     pub fn pack(val:u8)->u32 {
                         0 $(| ((((val >> $pin) & 1) as u32 )<< $out)) *
                     }
                 }
            )*
            pub fn pack($($port: u8),*)->u32 {
                0 $(|$port::pack($port))*
            }
            pub mod unpack {
                $($(
                    pub fn $sensor(packed:u32)->bool {
                        1 == (packed >> $out) & 1
                    }
                )*)*
            }
        }
    }
}

def_port_cfg! {
    port_e { a(3 => 0), b(5 => 1) }
    port_f { c(1 => 2), d(0 => 3) }
}

(Playground — Use Tools -> Expand Macros)


This is super helpful, thanks @2e71828!

To make sure I'm following correctly, a few comments:

  1. The DIR calculation is so simple that (presumably) LLVM will always optimize it into a constant at compile-time.
    I wasn't able to find any guarantees around this, though --- isn't it possible that this macro will accidentally generate a bunch of code that makes it to runtime?

  2. The packing works by packing each port separately then ORing them together.
    Because the port-specific packing fns are annotated with #[inline], the compiler should move their code into the general pack function and thus be able to optimize pretty well.

Because it’s a const declaration, the value is guaranteed to be calculated at compile time and stored in the binary— either in a statically initialized memory location or as an immediate value at each use.
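One way to convince yourself of this is to use the constant somewhere that only accepts compile-time values, such as an array length. The sketch below uses a cut-down, hypothetical direction_word! macro rather than the full def_port_cfg! above.

```rust
// Cut-down, hypothetical macro: ORs together one bit per listed pin.
macro_rules! direction_word {
    ( $( $pin:expr ),* ) => { 0u32 $( | (1u32 << $pin) )* };
}

pub const GPIOE_DIRECTION: u32 = direction_word!(3, 5);
pub const GPIOF_DIRECTION: u32 = direction_word!(0, 1);

// Array lengths must be known at compile time and the two types must match,
// so this item only compiles if GPIOE_DIRECTION evaluates to 0b0010_1000
// during compilation.
const _: [(); 0b0010_1000] = [(); GPIOE_DIRECTION as usize];
```

If the pin list changes and the expected literal is not updated, the build fails with a mismatched-types error instead of producing a binary with a surprising register value.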

Right; #[inline] is likely not necessary here, since the annotated functions are probably simple enough to get inlined by rustc anyway. You could put it all into a monolithic pack function as well, but that prevents you from reading and decoding a single port if the situation calls for it.

You can also put the port-reading call inside the pack functions instead of sending the value in as a parameter if you prefer.


After writing this code, I had the thought that you might want the macro to generate a newtype for checking results instead of free functions; the code generated by the compiler will be effectively identical, but it’s a little easier to work with. You’ll also get a compile error if you try to read a sensor value out of a u32 that didn’t come from reading the GPIO ports:

#[repr(transparent)]
#[derive(Copy,Clone,Eq,PartialEq)]
pub struct Sensors(pub u32);

impl Sensors {
    pub fn read() -> Self { /* ... */ }
    pub fn pack(port1: u8, port2: u8, /* ... */ ) -> Self { /* ... */ }
    pub fn a(self) -> bool { /* ... */ }
    /* ... */
}
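
For the toy board, the elided bodies might be filled in roughly as follows. This is a hand-written sketch of what the macro could generate, not actual macro output: the read signature taking two closures is an assumption standing in for the real register reads, and the field is kept private here so that a Sensors value can only come from pack or read.

```rust
#[repr(transparent)]
#[derive(Copy, Clone, Eq, PartialEq, Debug)]
pub struct Sensors(u32);

impl Sensors {
    /// Hypothetical: the closures stand in for the real GPIO port reads.
    pub fn read(read_e: impl FnOnce() -> u8, read_f: impl FnOnce() -> u8) -> Self {
        Self::pack(read_e(), read_f())
    }

    /// Packs raw port values into 0b...dcba (a = E3, b = E5, c = F1, d = F0).
    pub fn pack(port_e: u8, port_f: u8) -> Self {
        let e = port_e as u32;
        let f = port_f as u32;
        Sensors(
            ((e >> 3) & 1)
                | (((e >> 5) & 1) << 1)
                | (((f >> 1) & 1) << 2)
                | ((f & 1) << 3),
        )
    }

    pub fn a(self) -> bool { self.0 & 1 != 0 }
    pub fn b(self) -> bool { (self.0 >> 1) & 1 != 0 }
    pub fn c(self) -> bool { (self.0 >> 2) & 1 != 0 }
    pub fn d(self) -> bool { (self.0 >> 3) & 1 != 0 }
}
```

With this shape, calling a() on a bare u32 does not compile, and the accessors stay in sync with the packing because both come from the same definition.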

This looks awesome, thanks for taking the time to write the code and explain everything!
I didn't know about that const guarantee nor the repr(transparent) attribute.
If you are ever in Taipei drop me a line so I can buy you a beverage = )
