Representing a data structure that has optional components

Sorry if the title is unclear, this is a complicated question.
I'm creating a chess library in Rust (for fun). I want to represent a move as a type, but I want to be able to represent different types of moves (just the origin and target square, normal algebraic notation (only origin piece if possible), reversible notation, etc.). I have come up with three ways of doing so, all not ideal:

  1. Make a single type with a constant amount of information.
  • This is not ideal, since I would not be able to represent and convert moves with less/more information without providing a board to fill in the missing data.
  1. Make a single type with Options for all optional fields.
  • This is not ideal, since there's no way to know (AFAIK) at compile-time if a conversion between two move types is possible without providing a board to fill in the missing data.
  1. Make a type for every possible move type
  • This works and satisfies my requirements, but takes a lot of work and would require a really complex network of Into/Froms

Is there a solution to this problem I didn't think of, or is this really the only solution? Are my requirements too unrealistic? Do I need to look at this problem from a different angle?
Thanks in advance!

I don't really get the problem. If some parts of your type are optional, wrap them in an Option. If they are obligatory, then don't. And if you have totally different representation, then those should be separate types.

Giving more details, specific examples, and concrete problems you encountered would make your question usefully answerable.

Isn't it in fact the case that you need the board to know if an arbitrary move is lacking, sufficient, or over-specified?

Sorry, what do you mean?

For example: a standard algebraic notation move will look something like this:

struct AlgebraicMove {
    origin_piece: PieceType,
    origin_rank: Option<u8>,
    origin_row: Option<u8>,
    capture: bool,
    target_square: Square,
}

If I have a move that's structured like this:

struct MinimalMove {
    origin_rank: u8,
    origin_row: u8,
    target_square: Square,
}

MinimalMove -> AlgebraicMove will need the board to properly convert it, because there's not enough information. There are also cases where this doesn't need the board, and in that case I want to know that at compile-time if possible. This isn't possible when using a Options for every optional field (like the origin_piece), because you can't know at compile-time if that optional field is filled.

Just going off of a skim of the definition of the chess notations you mentioned, but how can you convert from Rxd6 to Rd2xBd6 if you don't know where the rook was located and what pieces was on d6, for example? Or know that Rxd6 is unambiguous, in the opposite direction.

This will be done by providing a board state that contains that information into the conversion function. This is useful if I have a sequence of moves from the starting position, and want to enrich those moves to that other format. I can do that by simulating a game, and converting every move using the board state at that point in the game, and then collecting that into another sequence of moves.

This is defined here: Algebraic notation (chess) - Wikipedia

If you recognize the board is necessary, I don't understand why you consider needing a board non-ideal in points 1 and 2.


Maybe you what you want is some sort of builder pattern, where you read in moves with a variable amount of information (Options) and then, having read an entire [prefix of a] game, you iterate the moves with the board state and convert the types to one with full information. During this process you could record which data is superfluous in the context of the board state, which would inform what notations were sensible.

(Or instead of entire games/prefixes, you have a game state which can convert a move with partial information into a full one serially.)

It's a dynamic approach, but I doubt there's a tractable compile-time approach given the combinatorial nature of chess.

The board is not always necessary:
Rxd6 -> Rd6,
RxBd6 -> Rxd6,
etc.

I want converting between them without providing the board to be also possible.

Sorry, what do you mean?

I know there's at least one solution, and that's making a separate type for each of them, making an IntoWithBoardState trait and implementing a crap ton of traits for each of them.

The Rd2xBd6 -> Rxd6 conversion is not valid if you don't know the board state. For instance, if there is another rook at e6 then this is not allowed.

1 Like

You're absolutely right, my bad!

That's impossible without introducing the ability to have ambiguous moves in the context of the game, e.g. potentially making it impossible to unambiguously deserialize a sequence of moves into a game.

If you're fine with that possibility, then sure, you can always discard information.

I made a mistake in the second example, see my edit and @tczajka's comment.

Something roughly like this.

If you have a type for each set of definitely-included-information, you won't be able to have a Vec of moves representing a game without type erasure for example, because a game may need more than one set of information over the course of the game (as there may be some moves that would be ambiguous without more than the minimal information demanded by the notation).

If you compute which information is implied by the context of the game, you can dynamically know the minimal necessary information to display for a given move and notation.

I feel converting notation should update some property of the move or game (e.g. an enum), as opposed to be a type.

1 Like

I don't see any practical use for having a separate type for moves without the origin square (like Rxd6). Some moves can't be represented that way, so what's the point?

I would go with option 1: have one type that contains all the information. You can then have functions like algebraic_notation, reversible_notation etc that convert it to a string using different notations.

Short algebraic notation is only meant to be meaningful in the context of a specific position, so I don't see having to provide the position during parsing as a disadvantage.

2 Likes

You're misunderstanding me, and that's probably my fault :smile:.
The origin square will be represented by either the row and column, or the piece (and optionally row and column for disambiguation)

It sounds like you're trying to represent fallibility of conversions, so why not go with TryFrom when looking to expand without board state information?

1 Like

An alternative for this is to use a struct with generic types and start with unit structs () for all fields:

struct AlgebraicMove<P, Rk, Rw, C, T> 
where
    P: MaybePiece,
    /// etc.
{
    origin_piece: P,
    origin_rank: Rk,
    origin_row: Rw,
    capture: C,
    target_square: T,
}

impl AlgebraicMove<(), (), (), (), ()> {
    fn new() -> Self {
        // starts with all unit structs
    }
}

trait MaybePiece {}
impl MaybePiece for () {}

// Lots more, but can be a state machine / stateful builder

You can then make a supertrait for filled in values

enum Piece { Pawn, Biship, ... }
trait HasPiece: MaybePiece {}
impl MaybePiece for Piece {}
impl HasPiece for Piece {}

What you're talking about is not different types of moves. They are same moves, just written down using different notations. You mentioned three notations for moves. Are there more than three notations you're considering, since you're talking about a "complex network" of conversions?

It's also not clear why you want to represent different notations for moves differently. Can you provide some example user code that would make use of this?

If the type is supposed to represent the concept of a chess move, then its representation should probably be independent of how it is written down, similarly to how a u32 is the same value regardless of whether it was originally written in decimal or in hexadecimal notation. It's only when converting to or from a string that a user typically cares about the notation.

4 Likes

Here's is an implementation which encodes the optionality into the type system, thus giving compile-time guarantees that fields are set and giving set fields as immediately unwrappable Copy values.
playground

It's a lot of boilerplate (and I wonder if a proc macro could handle some of it) but it allows for things like:

impl<P, C> Into<MinimalMove> for ChessMove<P, Rank, Row, C, Square>
where
    P: MaybePiece + Copy,
    C: MaybeCapture + Copy,
{
    fn into(&self) -> MinimalMove {
        MinimalMove {
            origin_rank: self.origin_rank,
            origin_row: self.origin_row,
            target_square: self.target_square,
        }
    }
}