Empty crossword solver in Python

I'm given a matrix containing a blueprint of a crossword puzzle - unfilled, of course. The goal is to fill the whole puzzle - it's a task from Checkio, and I've been struggling with this for quite some time now.

From what I understand of complexity, there's no perfect algorithm for this problem. Still, there has to be the best way to do this, right? I've tried some different things, and results were not that good with increasing number of words in the crossword and/or dictionary.

So, some of the things I've tried:

simple brute forcing. Did not work at all, as it has been ignoring and overwriting intersections.
brute forcing while keeping all the relevant data - worked as expected with a specific dictionary, turned to hell with a moderately big one even with word-length optimization. Figures.
blind intersection filling - the idea where I thought it would be better not to bother with the intersecting words, instead focusing on the letters. Like start with As and check if you can fill the whole crossword with these restrictions. If it did not work for some word, increment one of the letters and try the whole thing again. Results were abysmal as you can expect.
recursive exploring - worked perfectly on more simple blueprints, but fell flat with more complex ones. There was an issue with simple loops which was resolved simply enough, but I did not find a reasonable solution for the situation where the path splits and then rejoins several further splits later (so there's nothing left to solve for the second branch, but it doesn't know that).
minimizing intersections - haven't tested this yet, but it looks promising. The idea is that I find the shortest list of words containing all intersections... that also don't intersect with each other. Then I can just use a generator for each of those words, and then check if the depending words with those intersections exist. If they don't, I just grab the next word from a generator.

And this is where I'm currently at. I decided to ask about this here as it's already at that point where I think it took more time than it should have, and even then my latest idea may not even be the proper way to do it.

So, what is the proper way to do it?

Edit: Input is a list of strings representing the crossword and a list of strings representing the dictionary. Output is a list of strings representing the filled crossword.

And example of a crossword:

['...XXXXXX', 
 '.XXX.X...', 
 '.....X.XX', 
 'XXXX.X...', 
 'XX...X.XX', 
 'XX.XXX.X.', 
 'X......X.', 
 'XX.X.XXX.', 
 'XXXX.....']

The output would be a similar list with filled letters instead of dots.

Note that the 'dictionary' is just that, a small English dictionary and not a list of words fitted as answers for this puzzle.

So, what is the proper way to do it?

I don't know if it is optimal, but I would be using the principles of Floodfill.

Data structures:

Crossword words and their intersections. Sort them by the number of words in the dictionary for the corresponding word length. This will most likely mean that you will start with one of the longest words.

Dictionary accessible by word length.

If the dictionary is large it would be beneficial to be able to quickly find words of a certain length with a specific n:th letter, where n corresponds to an intersection position.

Note that for each crossword word, any two words that fit and have the same letters in all intersections are equivalent. Thus, it is possible to select a subset from the dictionary for each crossword word. The subset is the set of equivalence classes. So for each crossword word you can create a subset of the dictionary that contains at most [the number of letters in the alphabet] to the power of [the number of intersections]. This subset would constitute the equivalence classes that might fit a particular crossword word.

Algorithm:

Take the first/next unsolved crossword word. Assign it the first/next word that fits.
Take the first/next intersection. Assign the other crossword word the first word that fits.
If there are no more intersections to follow onwards, go back to the intersection you came from and continue with the next intersection.
If there is no word in the dictionary that fits, backtrack one intersection and search for the next word that fits.

Recommended topics

Hot tags