"match and update": anyone know a better solution to this problem?


#1

In a Rust program I have a state and I want to update some bits of it in a loop. The pattern is probably very familiar to experienced Rust programmers; even as a beginner I’ve encountered it several times. It looks like this:

struct T1 {} // not Copy or Clone

enum PgmState {
    State1(T1),
    State2(T1),
    State3,
}

struct Pgm {
    state: PgmState,
}

impl Pgm {
    fn loop_(&mut self) {
        match self.state {
            PgmState::State1(ref t1) => {
                // update self.state
            },
            PgmState::State2(ref t1) => {
                // update self.state
            },
            PgmState::State3 => {
                // update self.state
            }
        }
    }
}

This obviously doesn’t work because the match expression borrows self.state, so I can’t update it. In an earlier version my state type was actually Copy, so I could just copy self.state to the stack, inspect it, and update self.state. But the type recently stopped being Copy, so I need a different solution now.

To be able to update self.state without losing any information, I think I have to first move it to the stack and then move the new value back into self.state. During this update phase there has to be something in self.state, and I’m not sure what to put there.

Option 1: Use std::mem::uninitialized. Terrible solution, because if this code panics while self.state is uninitialized, unwinding will try to drop garbage and the program will crash in horrible ways.

Option 2: Add a new variant to PgmState as a placeholder: enum PgmState { ..., PlaceHolder }. I don’t like this either because now I have to handle this constructor everywhere else (just to panic!()).

Option 3: Refactor the code so that the “tag” bit is a new field and other fields in PgmState are now optional fields, something like:

struct T1 {} // not Copy or Clone

#[derive(Copy, Clone)]
enum PgmState { State1, State2, State3 }

struct Pgm {
    state: PgmState,
    /// Only available when PgmState is State1 or State2
    state_data: Option<T1>,
}

This is still horrible because (1) it means more panics in the rest of the code (because of the new invalid but representable states), and (2) T1 is still not Copy or Clone, so if I want to update T1 in-place this won’t work.
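For what it’s worth, point (2) may be less of a problem than it looks. A minimal sketch, assuming T1 has a hypothetical counter field (not in the original type) so there is something to update: Option::as_mut gives a mutable borrow for in-place updates, and Option::take moves the value out and leaves None behind, so neither needs Copy or Clone. The method name step is also made up for this sketch.

```rust
struct T1 {
    counter: u32, // hypothetical field, only here so there is something to update
}

#[derive(Copy, Clone, Debug, PartialEq)]
enum PgmState { State1, State2, State3 }

struct Pgm {
    state: PgmState,
    /// Only available when PgmState is State1 or State2
    state_data: Option<T1>,
}

impl Pgm {
    fn step(&mut self) {
        // PgmState is Copy here, so matching on it does not hold a borrow of self.
        match self.state {
            PgmState::State1 => {
                // Update T1 in place through a mutable borrow.
                if let Some(t1) = self.state_data.as_mut() {
                    t1.counter += 1;
                }
                self.state = PgmState::State2;
            }
            PgmState::State2 => {
                // Move T1 out of the Option, leaving None behind.
                let _t1: Option<T1> = self.state_data.take();
                self.state = PgmState::State3;
            }
            PgmState::State3 => {}
        }
    }
}

fn main() {
    let mut p = Pgm { state: PgmState::State1, state_data: Some(T1 { counter: 0 }) };
    p.step();
    assert_eq!(p.state, PgmState::State2);
    assert_eq!(p.state_data.as_ref().map(|t| t.counter), Some(1));
    p.step();
    assert_eq!(p.state, PgmState::State3);
    assert!(p.state_data.is_none());
}
```

This does nothing about point (1): the tag and the Option can still disagree, and the if let above silently skips that case.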

Does anyone know any other options? I think I’ll go with (2), but I don’t like introducing panics in every other match on this state.


#2

One possible solution is to go the functional route, if this is sufficiently efficient for your use case:

impl Pgm {
    fn loop_(&mut self) {
        *self = match self.state {
            PgmState::State1(ref t1) =>
                Pgm { state: PgmState::State1(T1 {}), ..*self },
            PgmState::State2(ref t1) =>
                Pgm { state: PgmState::State2(T1 {}), ..*self },
            PgmState::State3 =>
                Pgm { state: PgmState::State3, ..*self },
        };
    }
}

#3

@leonardo, does that compile to something similar to this code? Or is there a separate Pgm created, then copied into self?

impl Pgm {
    fn loop_(&mut self) {
        self.state = match self.state {
            PgmState::State1(ref t1) => PgmState::State1(T1 {}),
            PgmState::State2(ref t1) => PgmState::State2(T1 {}),
            PgmState::State3 => PgmState::State3,
        };
    }
}

#4

I don’t know, take a look at the generated assembly. It depends on how smart LLVM (or the compiler in general) is. If the compiler is smart enough, it will update just one field…


#5

But for that to work I need to copy T1 (at least manually, by reconstructing it). Am I missing anything? Also, for some reason this doesn’t work:

        PgmState::State1(_) =>
            Pgm { ..*self },

See it here.


#6

The debug targets differ by a couple of instructions, while the release targets have identical machine code. https://is.gd/gqKfWh https://is.gd/97OrBC

@osa1, I think T1 has to implement Clone if you want to add it to a new state object. There are no guarantees that T1 occupies the same memory region in each PgmState variant. So when you create a new PgmState, you’ll have to initialize the space it reserved for T1 with a copy/clone of the previous state’s T1. I used Clone in the examples I linked.

Update: just realized my examples may use a zero-length data-type, so I tried again with String data in T1, same result.


#7

I don’t understand why I have to copy/clone; I could just move it from its old place to the new place, right? Any solutions that require copy/clone won’t work unfortunately.


#8

Use core::mem::replace(). See https://github.com/rust-unofficial/patterns/blob/master/idioms/mem-replace.md
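A minimal sketch of that idea applied to the enum from the first post. It assumes State3 is an acceptable value to leave in self.state while the old state is moved out; that choice is an assumption of this sketch, not something the original code requires.

```rust
use std::mem;

struct T1 {} // not Copy or Clone

enum PgmState {
    State1(T1),
    State2(T1),
    State3,
}

struct Pgm {
    state: PgmState,
}

impl Pgm {
    fn loop_(&mut self) {
        // Move the old state out by value; State3 is the temporary stand-in
        // (an assumption of this sketch).
        let old = mem::replace(&mut self.state, PgmState::State3);

        // `old` is owned here, so T1 can be moved into the new variant
        // without Copy or Clone.
        self.state = match old {
            PgmState::State1(t1) => PgmState::State2(t1),
            PgmState::State2(_t1) => PgmState::State3,
            PgmState::State3 => PgmState::State3,
        };
    }
}

fn main() {
    let mut p = Pgm { state: PgmState::State1(T1 {}) };
    p.loop_();
    assert!(matches!(p.state, PgmState::State2(_)));
    p.loop_();
    assert!(matches!(p.state, PgmState::State3));
}
```

The trade-off is that if a match arm panics, self.state is left as State3, which is valid but may not be what you want. A dedicated placeholder variant makes that case visible instead.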


#9

Sorry, I was thinking of @leonardo’s example where PgmState's variants hold the T1 value. In your example, T1 value is stored separately in state_data, so you don’t need to bother moving it if you leave it there.


#10

#[derive(Debug)]
struct T1 {} // not Copy or Clone

#[derive(Debug)]
enum PgmState {
    State1(T1),
    State2(T1),
    State3,
    Poison,
}

#[derive(Debug)]
struct Pgm {
    state: PgmState,
}

impl Pgm {
    fn loop_(&mut self) {
        let new_state =
            match std::mem::replace(&mut self.state, PgmState::Poison) {
                PgmState::State1(t1) => PgmState::State2(t1),
                PgmState::State2(_) => PgmState::State3,
                PgmState::State3 => PgmState::State3,
                PgmState::Poison => PgmState::Poison,
            };
        
        // Plain assignment drops the Poison placeholder; a second
        // mem::replace would only return it to be discarded.
        self.state = new_state;
    }
}

fn main() {
    let mut p = Pgm { state: PgmState::State1(T1 {}) };
    println!("{:?}", &p);
    p.loop_();
    println!("{:?}", &p);
    p.loop_();
    println!("{:?}", &p);
}

#11

Simplified to eliminate the unnecessary inner PgmState type:

#[derive(Debug)]
struct T1 {} // not Copy or Clone

#[derive(Debug)]
enum Pgm {
    State1(T1),
    State2(T1),
    State3,
    Poison,
}

impl Pgm {
    fn loop_(&mut self) {
        let new_state =
            match std::mem::replace(self, Pgm::Poison) {
                Pgm::State1(t1) => Pgm::State2(t1),
                Pgm::State2(_) => Pgm::State3,
                Pgm::State3 => Pgm::State3,
                Pgm::Poison => Pgm::Poison,
            };
        
        // Plain assignment drops the Poison placeholder; a second
        // mem::replace would only return it to be discarded.
        *self = new_state;
    }
}

fn main() {
    let mut p = Pgm::State1(T1 {});
    println!("{:?}", &p);
    p.loop_();
    println!("{:?}", &p);
    p.loop_();
    println!("{:?}", &p);
}

#12

I thought std::mem::replace interacts badly with Drop?


#13

It doesn’t interact with Drop at all. replace returns the previous value at the location. If that value is not adopted by some other owner before it exits scope, it’ll get dropped as normal.
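A small sketch to illustrate, using a hypothetical Noisy type (made up for this example) that bumps a shared counter in its Drop impl. replace itself drops nothing; the old value is dropped when whoever received it lets go of it.

```rust
use std::cell::Cell;
use std::mem;
use std::rc::Rc;

// Hypothetical helper type: increments a shared counter when dropped.
struct Noisy {
    drops: Rc<Cell<u32>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the drop count observed after each step.
fn drop_counts() -> (u32, u32, u32) {
    let drops = Rc::new(Cell::new(0));
    let mut slot = Noisy { drops: Rc::clone(&drops) };

    // The old value is handed back to the caller, not dropped by replace.
    let old = mem::replace(&mut slot, Noisy { drops: Rc::clone(&drops) });
    let after_replace = drops.get();

    // Dropping the returned value runs its destructor once.
    drop(old);
    let after_old = drops.get();

    // The replacement is dropped like any other value.
    drop(slot);
    let after_slot = drops.get();

    (after_replace, after_old, after_slot)
}

fn main() {
    assert_eq!(drop_counts(), (0, 1, 2));
}
```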


#14

I missed that when I read the docs. The Rust team thought of everything :)


#15

The situation seems bad at the moment:

https://github.com/rust-lang/rust/issues/42657