Aaaarg, I wish I saw this thread sooner. I am familiar with Zobrist hashing, and have this to say.
I cannot see any reason why somebody would want to implement Zobrist hashing through the Hasher API. At all. There is no point, because if you have any legitimate reason to be using Zobrist hashing, then you would never want to be placing these items into a HashSet or HashMap.
Let me explain for the uninitiated, because the Wikipedia page kind of gives the wrong idea. Wikipedia shows this example:
white_pawn := 1
white_rook := 2
black_king := 12
# fill a table of random numbers/bitstrings
table := a 2-d array of size 64×12
for i from 1 to 64: # loop over the board, represented as a linear array
for j from 1 to 12: # loop over the pieces
table[i][j] = random_bitstring()
h := 0
for i from 1 to 64: # loop over the board positions
if board[i] != empty:
j := the piece at board[i], as listed in the constant indices, above
h := h XOR table[i][j]
What’s misleading about this snippet is that the function
hash(board) is not something you should be frequently calling, if ever. Every time you call
hash(board) it completely defeats the purpose of the Zobrist hash.
The following might be a more accurate description:
# same as before
# same as before
# This is the "hash" function from before, but with a more accurate name.
# Most legitimate use cases of Zobrist hashing do not even need this at all,
# as you can often get away by declaring the initial state of the board
# (at the beginning of whatever algorithm uses Zobrist) to have a hash of 0.
h := 0
for pos from 1 to 64:
h := zobrist_update(h, pos, board[pos])
# Called to update the hash in response to "toggling" (inserting or removing)
# the existence of a given piece at a given position.
function zobrist_update(hash, position, piece):
if piece != empty:
hash := hash XOR table[position][piece]
# Helper function that updates both the board and hash
function board_set_piece(board, hash, position, new_piece):
old_piece := board[position]
board[position] := new_piece
hash := zobrist_update(hash, position, old_piece) # remove old piece
hash := zobrist_update(hash, position, new_piece) # insert new piece
return (board, hash)
The whole entire point of a Zobrist hash is that it can be computed incrementally as changes are made to an object. It provides a super speedy heuristic for answering questions like “is our algorithm stuck in a loop?” by allowing you to compare a board’s current state to previous states without O(size) time per comparison or even O(size) setup per state visited.
I used zobrist hashes in a simulation on a large atomic network as part of an effort to make sure that not a single part of my code performed
O(network_size) work per step of the simulation. I never stored copies of old states; I only stored the “deltas”, i.e. which atoms were changed, even sometimes applying them in reverse if I needed to inspect a previous state. These same deltas were used to update a zobrist hash which was used to detect whenever the simulation got stuck in a cycle.
Storing items in a HashSet or HashMap using Zobrist hashes (if you even could do it!) would completely defeat the purpose because HashSet and HashMap still need to check that two items are equal to protect against hash collisions. This O(n) equality test will invalidate any performance gains you would have gotten from having incrementally computed hashes.