Exploring promoting errors to compile-time

Here is an experiment, expressed in C++, in trying to promote errors to compile time:

#include <iostream>

// ----- A class with two mandatory configurables ------------------------------
class ready {
  ready(int f, int b) : f{f}, b{b} {}
  int f;
  int b;
public:
  void use() { std::cout << f << ' ' << b << std::endl; }

  // ----- incompletely-specified states ---------------------------------------
private:
  struct needs_foo {
    needs_foo(int b) : b{b} {}
    ready foo(int f) { return {f, b}; }
  private:
    int b;
  };

  struct needs_bar {
    needs_bar(int f) : f{f} {}
    ready bar(int b) { return {f, b}; }
  private:
    int f;
  };
  // ----- entry point to the construction interface ----------------------------
public:
  struct make {
    needs_bar foo(int f) { return {f}; }
    needs_foo bar(int b) { return {b}; }
  };
};

// ----- Client code that must call foo and bar exactly once each, before use ---
int main() {
  ready::make()
    .foo(1)
    .bar(2)
    .use();
}

The idea is that, rather than catching these problems with runtime checks, client code should fail to compile if it attempts to

  • call .use() without having called .foo() and .bar() beforehand,
  • call .foo() or .bar() more than once.

This is OK as a proof of concept, but there is an obvious problem: 2^N classes are required, where N is the number of configurables.

Can this O(2^N) cost be reduced to O(N)(-ish) in Rust? I was thinking along the lines of having one trait per configurable and implementing some constraint-arithmetic using these, but (especially after a few months away from Rust) I'm coming up short.

Any ideas? Is there some completely different approach to getting such early errors, that is more effective in Rust?

The use function could take ownership of a argument non-copy & non-clone token which is produced by the foo function.

mod a {
    #[derive(Debug)]
    pub struct Token(());

    pub fn foo() -> Token {
        Token(())
    }

    pub fn use_token(token: Token) {}
}

fn main() {
    let token = a::foo();
    a::use_token(token);
}

Since that token cannot be produced by anything other than foo, cannot be copied or cloned and cannot be used anywhere else than in use_token, the ordering is guaranteed.

I'm sorry, I don't understand how your suggestion is related what I'm looking for. Perhaps I should be more explicit.

In order to use my Thing it has to be given some meaningful value (meaningful is context-dependent, so it's difficult to talk about in this completely abstract example; but in my use cases these tend to be strategies for getting some task done) of two (in this minimal example; in general, N) things:

  1. foo
  2. bar

It doesn't matter in which order they are specified, as long as both are specified

  • exactly once
  • before .use is called.

Here is some sample client code which shows two paths that should work, and four that should give rise to compile-time errors.

At each stage you should imagine that foo/bar/use receive arguments which say something meaningful to the user.

let not_set = Thing::new()

let foo_set = a.foo(<some meaningful value>);
let bar_set = b.bar(<something that matters in a real context>);

// foo_set.foo(...); // compile-time error: setting foo more than once
// bar_set.bar(...); // compile-time error: setting bar more than once
// not_set.use(...); // compile-time error: neither foo nor bar set
// foo_set.use(...); // compile-time error: bar not set before use
// bar_set.use(...); // compile-time error: foo not set before use

let all_set_a = foo_set.bar(...); // foo already set, now setting bar
let all_set_b = bar_set.foo(...); // bar already set, now setting foo

all_set_a.use(...); // Can use because both foo and bar have been set
all_set_b.use(...); // Same again, though setting was done in reverse order

// Now that all_set_a/b have been fully configured, they can be used an arbitrary number of times
all_set_a.use(different_value)
all_set_b.use(another_value)

More succinctly, but perhaps more cryptically:

// Compile-time errors
Thing.new().use()       // foo and bar missing
Thing.new().foo().use() // bar missing
Thing.new().bar().use() // foo missing
Thing.new().foo().foo() // foo conflict
Thing.new().bar().bar() // bar conflict

// Success
Thing.new().foo().bar().use() // both foo and bar set
Thing.new().bar().foo().use() // both foo and bar set

The way I interpret what you wrote is that it allows to impose the order in which two functions are called, by passing some meaningless-to-the-user token. If that addresses my problem, then I'm not seeing how it does so.

My aim is to promote errors which inform the user that something has been forgotten, from run time to compile time, without obliging the user to go through more ceremony than before.

Do you mean the typestate pattern? That does what you want to do, if I'm understanding you correctly.

2 Likes

I read constraint-arithmetic as constant-arithmetic and came up with this (on unstable).

Here's a version that works on stable using a linear number of type parameters.

The errors aren't great for either.

What you are describing is a state machine, and state machines are notoriously impractical whenever the number of nontrivial state transitions nears saturation for a given number of states. If you have too many states such that the 2^N becomes a problem, the natural thing to do is implement some kind of interpreter that accepts transition payloads and executes them, rather than trying to model the states individually and handling the transitions separately. You can model the state of the interpreter at compile time as quinedot has, but that has limitations that depend on how complex the interpreter needs to be based on the shape of the transition criteria.

But there isn't any way to simplify without knowing more about the problem, such as what form the transition criteria must take, (which vary from problem to problem and thus probably wouldn't be simplified by features of a general purpose language.)

Yes, that's exactly what I mean. I didn't know it went by this name, so thank you. The article is interesting, and pointed out some Rusty aspects that were worthwhile (so thank you, once more), but it doesn't address my problem, which is how to avoid the exponential complexity.

Thank you, that's exactly the sort of thing I had in mind. (The unstable version is a bit more than I have time to digest properly, at the moment.)

The verbosity of the data state updates is pretty painful and O(N), but that can be mitigated with something like

struct Thing {
   state: Box<ActualData>,
   params: PhantomData<(...)>,
}

And, as you say, the error messages leave something to be desired. The only idea I have so far is to give explicit names to distinct parameter types corresponding to each of the components, so that the errors end up looking something like

method not found in `Thing<FooMissing, BarAlreadySet>`

This would raise the implementation cost of the parameter types from O(1) to O(N), which is annoying but not prohibitive.

The whole scheme does reduce the implementation cost of the methods from O(N^2) to O(N), which is what I was looking for. Thanks!

Interesting, I tend to see interpreters everywhere, but I wouldn't spontaneously have called this an interpreter. But yes, I see what you mean.

FWIW, once we get const generic strings (or use one of the stable-rust polyfills), we can get quite nice error messages.

Tweaking a bit @quinedot's stable example (but using nightly for the const generic strings), we can get errors such as:

error[E0277]: the trait bound `AlreadySet<"foo">: __::Unset` is not satisfied
  --> src/main.rs:96:27
   |
96 |     let builder = builder.set_foo("".to_string());
   |                           ^^^^^^^ the trait `__::Unset` is not implemented for `AlreadySet<"foo">`

error[E0277]: the trait bound `MissingParameter<"bar">: __::Present` is not satisfied
  --> src/main.rs:97:28
   |
97 |     println!("{}", builder.finish());
   |                            ^^^^^^ the trait `__::Present` is not implemented for `MissingParameter<"bar">`
2 Likes