Self-referential functions

is it possible to declare self-referential function as golang does below?

// golang syntax
type option func(f *Foo) option

// run lexes the input by executing state functions until
// the state is nil.
func (l *Lexer) Run() {
	for state := lexText; state != nil; {
		state = state(l)
	}
	close(l.items) // No more tokens will be delivered.
}

// Verbosity sets Foo's verbosity level to v.
func Verbosity(v int) option {
    return func(f *Foo) option {
        previous := f.verbosity
        f.verbosity = v
        return Verbosity(previous)
    }
}

source

This kind of self-referential function can also be used to implement state machine and its transition logic:

type stateFn func(*Lexer) stateFn

func lexText(l *Lexer) stateFn {
	for {
		if strings.HasPrefix(l.input[l.pos:], leftMeta) {
			if l.pos > l.start {
				l.Emit(itemText)
			}
			return lexLeftMeta // Next state.
		}
		if l.Next() == EOFRune {
			break
		}
	}
	// Correctly reached EOF.
	if l.pos > l.start {
		l.Emit(itemText)
	}
	l.Emit(itemEOF) // Useful to make EOF a token.
	return nil
}

Check this talk from Rob Pike.

The idea is to return a function object of type StateFn, and it knows what to do in the current state and how to do state transition.

Besides, any suggestions on how to implement state machines in Rust?

Yes... but it's not particularly idiomatic, in part because it's not the most performant to implement. Rust chooses to make these costs explicit, hence a bit more verbose.

The more idiomatic way to implement would be to mutate the Fn object instead, which I'll show momentarily.
But here's the first example mirroring yours:

// we have to declare a newtype instead of a type alias
// since Rust doesn't allow recursive types
pub struct Foo { verbosity: i32 }
pub struct StateFn(Box<dyn FnOnce(&mut Foo) -> Option<StateFn>>);

pub fn verbosity(v: i32) -> StateFn {
    StateFn(Box::new(|foo| {
        let previous = foo.verbosity;
        foo.verbosity = v;
        return Some(verbosity(previous));
    }))
}

Here we can have this type but there's a few new things present in the Rust implementation compared to the Go one:

  • We have to use dyn FnOnce instead of fn or Fn or similar. This is an indicator that we are using dynamic dispatch (implicit in the Go implementation) which has extra costs
  • We specify FnOnce. There are three function traits in Rust, which we won't get into. These specify how we can call the function.
  • We have to use Box in the type and Box::new in the implementation. This indicates that we are adding an extra level of indirection and putting the function on the heap
  • We add option, as Rust types are not nillable by default. In this case, the result is exactly the same, the null pointer corresponds to None.

Instead, it's more idiomatic to use one type for the state function, and mutate that function to change states. This will generally run significantly faster than the method returning function objects, and the Rust type-system helps avoid misuse by requiring uniqueness to mutate it.

We will convert it to an enum manually, which will use a compact tagged representation instead of function pointers. In real code, a macro library will do this for you (there are several state machine libraries but alas I don't know which to recommend)

// we have to declare a newtype instead of a type alias
// since Rust doesn't allow recursive types
pub struct Foo { verbosity: i32 }
pub enum StateFn {
    Verbosity(i32),
}
impl StateFn {
    // returns true if there is more work
    pub fn step(&mut self, foo: &mut Foo) -> bool {
        use StateFn::*;
        match self {
            Verbosity(v) => {
                // note this implementation can be simpler
                // std::mem::swap(&mut foo.verbosity, v)
                let previous = foo.verbosity;
                foo.verbosity = *v;
                *self = Verbosity(previous);
                true
            }
        }
    }
}

Of course in this simple case our state is only an integer, so we don't need the enum at all and could just do this:

pub fn verbosity(mut v: i32) -> impl FnMut(&mut Foo) {
  move |foo| {
     // likewise can use std::mem::swap
     let previous = foo.verbosity;
     foo.verbosity = v;
     v = previous;
  }
}
7 Likes

Thanks for the awesome reply! It works for me except

This Option<StateFn> reminds me of

struct Node {
   next: Box<Node>
}

instead, tuple struct is used here. Very nice! Thanks!

Yeah, but this enum solution will lead to a large match block (switch in go) if you have many states. And it's hard to expand it if a client wants to add more states.

The reason for using this stateless function object is to

  1. avoid a large switch block or match block.
  2. easy to expand since driver code only needs to call state handle function repetitively.

Definitely, I need to check if there are good state machine libs! If you have any suggestions, please let me know.

Thanks for the awesome answer!

Thanks for pointing me the direction, here's rust impl (playground). You can find go impl in gist. Since go has gc, its closure is more powerful to capture state.

struct State {
    val: i32,
    end: i32,
}

pub struct StateFn(Box<dyn FnOnce(&mut State) -> Option<StateFn>>);

fn add_one() -> StateFn {
    StateFn(Box::new(|state| {
        println!("{}", state.val);

        if state.val >= state.end {
            return None;
        }

        state.val += 1;
        Some(add_one())
    }))
}

fn main() {
    let mut state = State { val: 0, end: 10 };

    let mut state_fn = add_one();
    while let Some(next_state_fn) = state_fn.0(&mut state) {
        state_fn = next_state_fn
    }
}

I have to admit, rust is much harder than golang.

That's on the right track. And right, Rust gives a lot more control over the memory of your program, so as a result, things which are implicit or unsaid in Go (e.g. where something is allocated, who owns it) have to be made explicit. There are a few things in Go which are actually equivalent to the Rust version, but which are hidden. E.g. dyn Traits actually correspond almost perfectly to Go's interface types. But Go lets you use func for the interface object, whereas Rust distinguishes fn types with no closure, and dyn Fn... traits that can own/access data.

Since go has gc, its closure is more powerful to capture state.

So I didn't want to overload with too much at once, but you'll notice in my last block I used a lambda (the expression like |arg1| e, but with the move keyword in front of it). This allows you to move arguments into the lambda, which was probably what you wanted. Otherwise it will try to borrow the arguments by default. The ownership of the data is as flexible as you want, but as mentioned above Rust very much asks you to put in the effort to pick.

2 Likes

Got it! Thanks!