Can you use a generic enum to implement a state machine?

As the title says, I'm trying to do something along the lines of

enum SM<T> {
    State0,
    State1(T),
}

impl<T> SM<T>
where
    T: std::fmt::Debug,
{
    fn advance(&mut self) {
        match *self {
            SM::State0 => {
                let new_state = SM::State1("hello".to_string()); // This is an instance of SM<String>

                *self = new_state; // And this doesn't compile
            }
            SM::State1(label) => println!("{:?}", label), // This is OK because of the T: Debug bound
        }
    }
}

The error says clearly that the assignment expected SM<T> found enum SM<String>
So I'm a bit lost here, clearly the T: Debug bound is not enough / appropriate to make the assignment valid.
As I understand it, the RHS has a concrete type which the compiler doesn't consider compatible with the enum I'm trying to assign into?
Am I just missing some syntax here or having state transitions in a state machine which uses generics is simply not possible?

impl<T> SM<T> means that this is a generic impl for all types T. Now of course not every type is String, so you can't assign a String to a place of type T. For example, consider what would happen if you instantiated a state machine of type SM<u32>. What should the compiler do when you try to assign a String to a u32?

Did you mean impl SM<String> instead?

2 Likes

Or, alternatively, do you want to be able to store arbitrary Debug types?

enum SM {
    State0,
    State1(Box<dyn Debug>)
}

impl SM {
    fn advance(&mut self) {
        match *self {
            SM::State0 => {
                let new_state = SM::State1(Box::new("hello"));
                *self = new_state;
            }
            SM::State1(label) => println!("{:?}", label),
        }
    }
}

Or:

pub enum SM {
    State0,
    State1(Box<dyn Debug>),
    
    /// Panic occurred during processing
    Poisoned,
}

impl SM {
    pub fn advance(&mut self) {
        *self = match std::mem::replace(self, SM::Poisoned) {
            SM::Poisoned      => unreachable!(),
            SM::State0        => SM::State1(Box::new("hello")),
            SM::State1(label) => {
                println!("{:?}", label);
                SM::State1(label)
            },
        }
    }
}
2 Likes

Thanks,
so IIUC in cases where the generic is some "unrepresentable" type (like e.g. a Future), where I can only speak of such type through a trait, using a Box<dyn {trait}> would be the only way right?

No. You can make the code generic, but you have to find a way that produces a value of that generic type, not just a String. (Eg. use Default::default). Playground.

2 Likes

I see, but still in a case like the following

async fn returns_some_future() {...}

enum SM<F> {
    State0,
    State1(F),
}

impl<F> SM<F>
where
    F: Future,
{
    fn advance(&mut self) {
        match *self {
            SM::State0 => {
                
                // No way to express new_state in terms of a value of the generic type,
                // since this is a type only known to the compiler
                let new_state = SM::State1(return_some_future());
                 
                *self = new_state; // Error
            }
            SM::State1(future) => ...,
        }
    }
}

I think there's no way to keep this generic since I have no other way to speak of the type returned by returns_some_future other than through the Future trait?

It's possible, but starts to get tricky fast:

use std::future::Future;

async fn returns_some_future() { unimplemented!() }

trait AsyncThunk {
    type Output;
    type Future: Future<Output=Self::Output>;
    fn start(&self)->Self::Future;
}

impl<Fun,Fut,T> AsyncThunk for Fun where
    Fun: Fn()->Fut,
    Fut: Future<Output=T>
{
    type Output = T;
    type Future = Fut;
    fn start(&self)->Fut { self() }
}

enum SM<T:AsyncThunk> {
    State0(T),
    State1(T::Future),
}

impl<T:AsyncThunk> SM<T>
{
    fn advance(&mut self) {
        match self {
            SM::State0(f) => {
                let new_state = SM::State1(f.start());
                *self = new_state;
            }
            SM::State1(future) => unimplemented!(),
        }
    }
}

fn new_sm()->SM<impl AsyncThunk> {
    SM::State0(returns_some_future)
}
1 Like

The case of Fn* traits specifically is a bit tricky (AFAICT only because of the funky syntax that makes it look like the return type is a type parameter even though it's not). But for a regular trait, the solution would be simple using an associated type. Example.

Note that this still requires you to supply an "originating" type (the type of the function item, specifically), from which you can "project" the appropriate associated type (i.e., its return type). If you don't want this, then you don't want a generic function, either.

I guess that you have a more general, higher-level misunderstanding of the purpose of generics and trait objects. Generics are useful when you want to allow the user of the code (i.e., the caller of a generic function, or the owner of a generic-typed value) to choose type parameters.

If you want to make the compiler infer types from the implementation, and you don't want to allow the caller/owner to choose any types, then generics are useless to you, no matter how hard you twist them. A specific, concrete type is never going to be assignable to a generic type parameter (nor does it work the other way around), no matter the bounds. In this case, whatever bounds you have, whether you have async or non-async functions, whether you want to debug-print values or not, you are going to have to use a concrete type, because there's nothing for the user code to "choose".

A trait object is also a concrete type, that happens to abstract away another underlying type by using dynamic dispatch. It's useful if you want to treat multiple different types under a common trait and choose at runtime, or if you can't name the type (e.g. because it's compiler-generated, such as closures or async fn return values).

3 Likes

Yes, I definitely have a lot of misunderstandings, and my C++ background is kind of making things worse when it comes to generics given the completely different handling of type parameters in C++ templates..

For this specific case though, I think I wanted a generic since I was trying to better understand how futures work by manually implementing one through a state machine that would call an inner future coming from an async fn.

Now, I think I get the gist of both your and 2e71828's example (I wish I could mark both posts as solutions) but they're still a bit over my head and I think I have lot of food for thought to digest here...

No, they are not that different. The equivalent with C++ templates doesn't compile, either, if you supply the wrong types. It's just that C++ templates are type checked only after instantiation, whereas Rust generics are typechecked before instantiation. But you can't do what you seem to want with C++ templates, either:

// Online C++ compiler to run C++ program online
#include <iostream>
#include <string>

template<typename T>
struct SM {
    T state;
};

template<typename T>
void advance(SM<T> *sm) {
    sm->state = std::string("hello");
}

int main() {
    SM<int> sm; // if this is not <string>, it won't compile
    advance(&sm);
    std::cout << sm.state << std::endl;
    return 0;
}

Okay, but then you can't make the future's type come from the body of advance(), you have to make it depend on the type of the async fn to be called, and supply the function or the wrapped future itself externally.

IOW you seem to want exactly my second example. Perhaps with the function to be called coming from somewhere else instead of the argument of advance() (example). But where it comes from is a technical detail at that point, the essence is that for a generic to make sense in this context, it has to come from the outside and not from inside the method body.

2 Likes

It's also possible that @abigagli is looking for some way to hold a specific unnameable type, an internal async block which is just an implementation detail. For that case, dyn Future is the best solution available until #![feature(type_alias_impl_trait)] is ready.

2 Likes

^^^ THIS! That's exactly what I was trying to do: the async fn being called was an internal detail of the SM.

Also, the following

now confirms my understanding of what made both of your and @2e71828 code examples work: i.e. providing the type of the generic from the outside, either as an argument to the advance method or to the tuple-struct constructor when kicking off the SM.

What I don't fully get is the formal explanation for why is that, I assume that seeing the concrete type coming from the outside, the compiler is able to infer the concrete type for the state machine such that it matches the RHS side of the *self = new_state assignment, but I'm not sure that is the correct way to explain it...

Yes, that's exactly right. To use a generic parameter, you need to:

  1. Fully define the interface that the type parameter is supposed to provide, and
  2. Write the state machine so that it works with any type that provides that interface.

With that done, it's then possible to instantiate it from the outside with the particular type you care about.

In the case of my code, I got around naming the type explicitly by using impl Trait syntax in the constructor's signature: All downstream user code needs to know is that the type used fulfils the interface that the state machine is expecting, and therefore advance() is legal to call.

Again, a generic type parameter (F in your example) may stand in for any type. Hence, if you are trying to force a concrete type (e.g. String or a future) in a place that has the type T, it won't work, because the compiler must be conservative, as you might instantiate the generic type (SM<F>) with different choices of F (i.e., not only String) which don't match the type in the function body.

If you "inject" the type from the outside, then two things happen:

  1. You are forced to write the code in a way that wherever you want to assign to a place of type F, you already have a value of type F. This is the essence. The compiler can allow let x: F = other_value_of_type_F, because then the types are guaranteed to match, regardless of the particular choice of F (this is higher-order reasoning, namely universal quantification: F is the same as itself, for every choice of F. But F may not be the same as String for every choice of F.)
  2. When instantiating your state machine, the concrete type(s) will be inferred by the compiler. Of course, instantiating a SM<F> is possible for all Fs (that satisfy the given trait bounds); unless you want to name the concrete type SM<_> itself, you don't need the type substituted for F to be nameable (because it will be inferred just fine)! This makes it work with async fn-produced Futures, closures, whatever. But there is no magic: if the type is nameable, e.g. String, then you can still be explicit and instantiate e.g. a SM<String> or a SM<Box<dyn Future>> or a SM<MyOtherType>.

The really interesting part is not the compiler inferring some type; that's only needed when the type is unnameable. The interesting part is that you had to rewrite the function body in the first place, so that it allowed the higher-order reasoning by assiging Fs to Fs instead of trying to force a String or a Future in a type-variable-shaped hole.

2 Likes

Thanks for the detailed explanation, I get the overall logic and it makes sense.

What I think I'm missing is just this little piece: you say

If you "inject" the type from the outside [...] The compiler can allow let x: F = other_value_of_type_F , because then the types are guaranteed to match, regardless of the particular choice of F

this is clear, but in the specific case of the advance method

fn advance(&mut self, f: F) {
        match *self {
            SM::State0 => {
                let new_state = SM::State1(f.returns_some_future());
                 
                *self = new_state; // Error
            }
            SM::State1(_) => {}
        }
    }

in this binding

let new_state = SM::State1(other_value_projected_from_F)

It seems like for the compiler it's enough that the argument to SM::State1 is somehow related / projected from F for new_state to be an SM<F> as well, and hence it's OK to perform the
*self = new_state assignment?

Maybe I'm nitpicking, but I want to be sure I don't build wrong models in my head..

It's not the "related to F" part that matters. I'm not sure whether it's just a language barrier, or you have some deeper misunderstanding. The above example compiles, because SM<F>'s State1 variant is defined to contain F::Output. That's the same type as the return value of F::returns_some_future(), for all possible types F.

There's no magical "connectedness" involved. It's simply that the types have to match up. If the variant must contain a type F::Output, then an expression of the exact same type can be put into it, and nothing else.

The point is that in a generic context, the concrete types are not known, so in order to typecheck the body successfully, the compiler has to reason using the aforementioned universal quantification: a type variable (or an associated type thereof) can thus only ever be considered equal to itself, and not to some concrete type. So writing concrete types won't work, but writing a type variable where the same type variable is expected will work:

  • "For all types F, F::Output is the same as F::Output" – this statement is is always true.
  • "For all types F, F::Output is the same as String" – this statement is not always true, hence type checking fails.

Another way to put it is that a kind of symbolic reasoning is required to typecheck generics. What matters is the declared relationships between types. This is exactly the same kind of problem as the following trivial algebraic question:

  • "For all natural numbers N, N + 1 == N + 1". This is always true, regardless of the particular choice of N.
  • "For all natural numbers N, N + 1 == 137". This is not true for all Ns. Even if you can pick a particular N that satisfies this assertion, the "for all natural numbers" part makes it false in general, and as soon as you start sticking in concrete constants into such an equation, you'll loose all chance of making it true in general, so you have to continue using the variables and their general
    properties, instead of inspecting specific numbers.
2 Likes

Most probably a misunderstanding...

I don't know why I couldn't see it before but I think this should be it:

  1. The externally provided argument f: F gives us a way to produce an F::Output calling f.returns_some_future
  2. If I have an F::Output I can call the SM::State1 tuple constructor
  3. The tuple constructor SM::State1 will return an SM<F> that I can assign into *state

Thanks for your patience and help.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.