Square word solution finder

Square Word is a word game similar to Wordle, but it's played on a 5x5 grid, and all rows and columns must be actual words. I've created a program to find all possible Square Word solutions, but it takes days to run. I'd love to get some feedback on runtime speed and code quality, as this is my first 'big' Rust program.

This is the first step in finding optimal guesses for an unknown solution.

Here's the code: GitHub - roboteng/square-word-solver

That's not going to be easy to improve. The naive brute-force solution is exponential in the number of squares, and 26^(5 * 5) is not exactly a small number, even if you apparently reduce it somewhat by filtering valid words. Since the problem somewhat resembles discrete tomography, I'd expect it to be NP-hard. I'm not sure how much it is easier compared to a general constraint satisfaction problem, but it's likely that you'll need some fairly advanced heuristics and combinatorial optimization to reduce runtimes to something bearable.

You should first and foremost run Clippy on your code, that should get you most of the way in terms of low-level code quality improvements. Some comments I have, in no particular order of importance:

  • You are doing a lot of matching-then-panicking on Results. Consider using .expect() instead, or better yet, in library code, propagate errors with the ? operator.
  • In function five_letter_words(), it's unnecessary to convert str slices to owned Strings. You could just return a Vec<&str>.
  • In methods Solution::columns() and Solutions::is_unique(), you are repeating the allocate-empty-collection-and-push pattern. You should be using Iterator::collect() instead.
  • In the words field of WordList, the Box around the nested WordList is unnecessary; a HashMap already stores its values behind indirection, so you can make the field a plain HashMap<char, WordList>.
  • In find_solutions(), you are using a lot of Arc, putting a Vec inside an Arc (which seems like a weird double indirection), and you are effectively cloning the whole Vec at the end. This could likely be improved to use far less indirection.
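To illustrate a couple of the points above in code: this is only a sketch, since I haven't looked at the exact signatures in your repo — `five_letter_words` and `load` here are guesses at the shape of your functions, not your actual code.

```rust
use std::io;

// Borrowing from the input avoids allocating one String per word.
fn five_letter_words(input: &str) -> Vec<&str> {
    input.lines().filter(|w| w.len() == 5).collect()
}

// Propagate errors with `?` instead of match-then-panic; the caller decides
// how to handle an unreadable file.
fn load(path: &str) -> io::Result<Vec<String>> {
    let text = std::fs::read_to_string(path)?;
    Ok(five_letter_words(&text).into_iter().map(String::from).collect())
}
```

Note how `collect()` replaces the allocate-empty-collection-and-push pattern in one expression.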
  1. square-word-solver/lib.rs at main · roboteng/square-word-solver · GitHub is probably killing your performance

  2. you should represent rows/columns as [u8; 5] instead of Vec<u8> to avoid the heap allocation

  3. I'd also represent partial solutions / solutions as [[u8; 5]; 5] for better memory locality

  4. lastly, the name of the game should be to kill branches as quickly as possible, so if you already have

# # . . .
# # . . .
# # . . .
# # . . .
# # . . .

instead of filling in the next column (and dealing with N^3) it probably makes sense to look at rows (restricted to words matching the first two letters already placed, dealing with (N/AB)^5) [and you can prune this even further]; basically you should build a dictionary of prefixes of words, and use it to prune as aggressively as possible
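As a concrete sketch of that prefix dictionary (illustrative only, not code from the repo): the simplest version is a set containing every prefix of every word, so checking whether a partial column can still become a word is one lookup.

```rust
use std::collections::HashSet;

// Build a set of every prefix of every word; checking whether a partial
// column is still viable is then a single hash lookup.
fn prefix_set(words: &[&str]) -> HashSet<String> {
    let mut set = HashSet::new();
    for w in words {
        for i in 1..=w.len() {
            set.insert(w[..i].to_string());
        }
    }
    set
}

// A partial grid survives only if every partially filled column is a
// prefix of some word.
fn columns_viable(prefixes: &HashSet<String>, cols: &[String]) -> bool {
    cols.iter().all(|c| c.is_empty() || prefixes.contains(c))
}
```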


Thanks for the tips! I've already made some of the easier changes, but I'll try some of the others, like arrays instead of Vecs, and see how that performs.

I agree with @zeroexcuses that the goal should be killing branches aggressively. Here I suggest a way to do that.

Let's consider how you could proceed by choosing one word per column. Once you choose a word for the first column, the only words eligible for the first row are those that start with the first letter of the chosen word, the words eligible for the second row are those that start with the second letter of the chosen word, and so on.

If you have a sorted list of all the words, each list of eligible words is a contiguous sublist. Instead of precomputing prefix lists, you could use binary search to shrink the eligibility lists. In this way, you can store a slice per row as the eligibility list, with no allocation. (If an eligibility list shrinks to zero, there are no solutions for the partially filled square.)
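The narrowing step could look like this (a sketch; `narrow` is a hypothetical helper, and it assumes the word list is sorted):

```rust
// Given a sorted word list and the prefix fixed so far, narrow an
// eligibility slice with two binary searches; the result borrows the
// original list, so no allocation happens.
fn narrow<'a>(words: &'a [&'a str], prefix: &str) -> &'a [&'a str] {
    // First index whose word is >= the prefix...
    let start = words.partition_point(|w| *w < prefix);
    // ...then, within that tail, the end of the run sharing the prefix.
    let end = words[start..].partition_point(|w| w.starts_with(prefix)) + start;
    &words[start..end]
}
```

An empty result slice is exactly the "no solutions for this partial square" signal.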

If you filled the top row instead, you could store and shorten eligibility lists for each column in the same way.

So the next step is to combine these approaches, and at each step, choose to either fill the first unfilled column or fill the first unfilled row -- whichever has the smallest number of eligible words remaining.

Donald Knuth wrote a paper about the cubed version of this problem: "5 x 5 x 5 Word Cubes by Computer" (Word Ways 26 (1993); Selected Papers on Fun & Games (chapter 37)). He also discusses the square case; his dictionary of 5,757 five-letter words yielded 541,968 squares, 280 of which contained the same word more than once. (Edit: The solution counts were for symmetric squares only.)

He stated that he used Dancing Links for the problem. Dancing Links is linked-list based [1] and not necessarily a great fit for Rust, but we can consider more generally how Dancing Links works: it solves exact cover problems.

Exploration of an exact cover approach

How can we rework this problem as an exact cover problem? That is, you have a binary matrix, and you want to select a set of rows such that every column contains exactly one 1.

The columns here could be every unique combination of

  • Axis (horizontal or vertical)
  • Cell number (25 possibilities in your case)
  • Letter of the alphabet

Every row represents a word placed horizontally or vertically in a specific row or column; thus every word from the dictionary will get 10 rows.

A horizontally placed word would have 1s in the columns

  • Horizontal + occupied cell + letter in that cell
  • Vertical + occupied cell + (every letter not in that cell)

And vice-versa for vertically placed words. By this design, words that cross are forced to have the same letter in order to create an exact cover of the columns for that cell, and any two words in the same position always collide.

To make the description more concrete, below is an example solution to a 3x3 square over the alphabet "acent". The solution is

horizontal                                            vertical
1     2     3     4     5     6     7     8     9     1     2     3     4     5     6     7     8     9
acent acent acent acent acent acent acent acent acent acent acent acent acent acent acent acent acent acent
 x    x         x                                     x xxx              xxxx             xxxx
                  x      x      x                            xxxx             x xxx             xx xx
                                        x   x      x              xxxx              xx xx             xxx x
x xxx              xxxx             xxxx               x    x         x
       xxxx             x xxx             xx xx                         x      x      x
            xxxx              xx xx             xxx x                                         x   x      x
xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx

The design can be expanded to cubes in a straightforward fashion.
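A sketch of how that column encoding might be computed (the helper names `col` and `horizontal_row` are made up for illustration; letters are indices into the alphabet, `n` is the grid side, `k` the alphabet size):

```rust
// Column index for (axis, cell, letter); axis 0 = horizontal, 1 = vertical.
fn col(axis: usize, cell: usize, letter: usize, n: usize, k: usize) -> usize {
    (axis * n * n + cell) * k + letter
}

// Exact-cover columns covered by placing `word` horizontally in grid row `r`:
// for each occupied cell, the (horizontal, cell, letter) column, plus every
// (vertical, cell, other-letter) column, so a crossing vertical word is
// forced to supply the same letter to complete the cover.
fn horizontal_row(word: &[usize], r: usize, n: usize, k: usize) -> Vec<usize> {
    let mut cols = Vec::new();
    for (c, &letter) in word.iter().enumerate() {
        let cell = r * n + c;
        cols.push(col(0, cell, letter, n, k));
        for other in (0..k).filter(|&o| o != letter) {
            cols.push(col(1, cell, other, n, k));
        }
    }
    cols
}
```

Each occupied cell contributes 1 + (k - 1) = k columns, matching the row patterns in the table above.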

To then solve the exact cover problem, you can use a backtracking reduction algorithm.

  • If the table is empty, you're at a solution; visit the solution and return. Else,
  • Choose the column with the least number of 1s
    • A column with no 1s indicates a dead end; return
  • For each row that has a 1 in the chosen column
    • Include the row in the partial solution
    • Build a reduced matrix
      • For every column covered by the row, remove any row that covers that column (collisions)
      • Remove those columns as well (or otherwise mark them as done)
    • Recurse the algorithm on the reduced matrix
    • Remove the row from the partial solution and continue looping on the non-reduced matrix

An empty table indicates a solution to the problem; a column with no 1s indicates a dead-end to backtrack out of.
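The backtracking reduction above can be sketched over a dense boolean matrix like this (real Dancing Links avoids all the rescanning this version does; this is only to show the shape of the algorithm):

```rust
// Minimal Algorithm X sketch: pick the active column with fewest 1s, try
// each row covering it, deactivate conflicting rows and covered columns,
// recurse, then restore.
fn solve(
    matrix: &[Vec<bool>],
    active_rows: &mut Vec<bool>,
    active_cols: &mut Vec<bool>,
    partial: &mut Vec<usize>,
    solutions: &mut Vec<Vec<usize>>,
) {
    // Choose the active column with the fewest active 1s.
    let best = (0..active_cols.len()).filter(|&c| active_cols[c]).min_by_key(|&c| {
        (0..matrix.len()).filter(|&r| active_rows[r] && matrix[r][c]).count()
    });
    let Some(c) = best else {
        solutions.push(partial.clone()); // no columns left: a solution
        return;
    };
    // A column with no active 1s yields no candidates: dead end, backtrack.
    let candidates: Vec<usize> =
        (0..matrix.len()).filter(|&r| active_rows[r] && matrix[r][c]).collect();
    for r in candidates {
        partial.push(r);
        // Columns covered by r, and rows colliding with r on any of them.
        let cols: Vec<usize> =
            (0..active_cols.len()).filter(|&j| active_cols[j] && matrix[r][j]).collect();
        let rows: Vec<usize> = (0..matrix.len())
            .filter(|&i| active_rows[i] && cols.iter().any(|&j| matrix[i][j]))
            .collect();
        for &j in &cols { active_cols[j] = false; }
        for &i in &rows { active_rows[i] = false; }
        solve(matrix, active_rows, active_cols, partial, solutions);
        for &j in &cols { active_cols[j] = true; }
        for &i in &rows { active_rows[i] = true; }
        partial.pop();
    }
}
```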

The exact cover approach as outlined can be considered choosing an empty cell, and then iterating over either

  • All horizontal words with a specific letter and all vertical words without that letter, or
  • All vertical words with a specific letter and all horizontal words without that letter

The cell is chosen based on having the smallest number of words to iterate over. It's therefore close to a generalization of the prefix approach above, wherein you could choose any unfilled row or column, as opposed to just the first unfilled row or column.

  1. although I see a version using sparse sets exists now, hmmm ↩︎


A correction: The 541,968 and 280 numbers are for symmetric squares only. The number of general solutions will be much higher.

Also, something I just thought of for the prefix approach: when you pick the first word -- let's say you place it in the first row -- you can limit the first-column word to candidates lexicographically greater than or equal to the first word. This will cut the number of squares you find roughly in half, as you will eliminate most squares that are just mirrors across the diagonal; every square you find represents two solutions (unless it is completely symmetric across the diagonal).

(If you're filtering out squares with duplicate words, it can be strictly greater than.)

So, I've got a function that should be able to prune as quickly as possible. The WordList struct has a contains method. I believe this has O(1) time complexity, since it's just a handful of HashMap lookups. (Strictly it's O(n), where n is the length of the word, but that's at most 5 here.) It can also check an incomplete word, since it doesn't have to walk the whole tree. That seems faster to me than a binary search, which would be O(log(n)), where n is the number of words -- but I could be wrong.

My thought in implementing it that way is that if any column's partial word isn't contained in the tree, I can stop looking down that branch of the solution space and try the next one. Thoughts?
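To make the discussion concrete, here's a minimal trie along those lines (illustrative; the names and the actual WordList in the repo may differ):

```rust
use std::collections::HashMap;

// A trie in the spirit of the WordList struct described above.
#[derive(Default)]
struct Trie {
    children: HashMap<char, Trie>,
}

impl Trie {
    fn insert(&mut self, word: &str) {
        let mut node = self;
        for ch in word.chars() {
            node = node.children.entry(ch).or_default();
        }
    }

    // True if `partial` is a prefix of at least one inserted word; one
    // HashMap lookup per letter, so O(len) with len <= 5 here.
    fn contains_prefix(&self, partial: &str) -> bool {
        let mut node = self;
        for ch in partial.chars() {
            match node.children.get(&ch) {
                Some(next) => node = next,
                None => return false,
            }
        }
        true
    }
}
```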

Are you using a bit vec?

27^5 = 14_348_907, which, stored as a bitvec, is under 2 MB and would likely fit in L2 cache.

Why 27 and not 26? So we can have queries of the form r#s#c, i.e. "is there any word that starts with r, has s as its middle letter, and ends in c?" (Answer: yes, rustc.)
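A sketch of that bit-set (hypothetical helpers; digit 0 plays the role of '#'): for each word we set the bit of all 2^5 patterns obtainable by wildcarding positions, so a query like r#s#c becomes a single bit test.

```rust
// Base-27 index space: digit 0 is the wildcard '#', digits 1..=26 are 'a'..='z'.
const SIZE: usize = 27 * 27 * 27 * 27 * 27; // 14_348_907 bits, just under 2 MB

fn index(pattern: &str) -> usize {
    pattern.chars().fold(0, |acc, ch| {
        let digit = if ch == '#' { 0 } else { ch as usize - 'a' as usize + 1 };
        acc * 27 + digit
    })
}

// For each word, set the bit of every pattern obtained by replacing any
// subset of its positions with the wildcard digit.
fn build(words: &[&str]) -> Vec<u64> {
    let mut bits = vec![0u64; SIZE / 64 + 1];
    for w in words {
        let digits: Vec<usize> =
            w.chars().map(|ch| ch as usize - 'a' as usize + 1).collect();
        for mask in 0..(1 << digits.len()) {
            let idx = digits.iter().enumerate().fold(0, |acc, (i, &d)| {
                acc * 27 + if mask & (1 << i) != 0 { 0 } else { d }
            });
            bits[idx / 64] |= 1u64 << (idx % 64);
        }
    }
    bits
}

fn query(bits: &[u64], pattern: &str) -> bool {
    let idx = index(pattern);
    bits[idx / 64] & (1u64 << (idx % 64)) != 0
}
```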

The list shortening will be O(log(k)), where k is the number of eligible words, so O(log(words)) in the worst case, yes (though that's only in the teens). The ranges for prefixes could be pre-calculated to improve this, but it also sounds like your way isn't limited to prefixes, which could pay off.

With the prefix method, you know the candidate count for each row and column, so you can still short-circuit as soon as any of them runs out of candidates, even if it isn't the next one you'd fill. If I understand your question correctly, "a column isn't contained in the tree" is the same idea.

I threw together a quick version of what I suggested (not further optimized, single-threaded), and it took 30 minutes to enumerate 1,787,056 squares on a 15-year-old desktop; an even cruder modification for the symmetric case found the 541,968 squares in about 4.5 seconds (using Knuth's 5,757-word list in both cases; aside from enumerating, all it does is keep count). I can share it if you'd like. Simple as it is, I'm certain it can be much improved without much effort (even just making it parallel might get you into the one-minute range on a modern processor). It might at least give you a baseline.

If you don't need to enumerate, but only need to count or generate randomly (say), there are probably even better approaches, like ZDDs. (At a guess, I haven't tried it.) ZDDs can also be used to store the word list and perform queries like r#s#c.
