Recursive backtracking in Java for solving a crossword
Asked Answered
M

2

4

I need to solve a crossword given the initial grid and the words (words can be used more than once or not at all).

The initial grid looks like that:

++_+++
+____+
___+__
+_++_+
+____+
++_+++

Here is an example word list:

pain
nice
pal
id

The task is to fill the placeholders (horizontal or vertical having length > 1) like that:

++p+++
+pain+
pal+id
+i++c+
+nice+
++d+++

Any correct solution is acceptable, and it's guaranteed that there's a solution.


In order to start to solve the problem, I store the grid in 2-dim. char array and I store the words by their length in the list of sets: List<Set<String>> words, so that e.g. the words of length 4 could be accessed by words.get(4)

Then I extract the location of all placeholders from the grid and add them to the list (stack) of placeholders:

class Placeholder {
    int x, y; //coordinates 
    int l; // the length
    boolean h; //horizontal or not
    public Placeholder(int x, int y, int l, boolean h) {
        this.x = x;
        this.y = y;
        this.l = l;
        this.h = h;
    }
}

The main part of the algorithm is the solve() method:

char[][] solve (char[][] c, Stack<Placeholder> placeholders) {
    if (placeholders.isEmpty())
        return c;

    Placeholder pl = placeholders.pop();
    for (String word : words.get(pl.l)) {
        char[][] possibleC = fill(c, word, pl); // description below
        if (possibleC != null) {
            char[][] ret = solve(possibleC, placeholders);
            if (ret != null)
                return ret;
        }
    }
    return null;
}

Function fill(c, word, pl) just returns a new crossword with the current word written on the current placeholder pl. If word is incompatible with pl, then function returns null.

char[][] fill (char[][] c, String word, Placeholder pl) {

    if (pl.h) {
        for (int i = pl.x; i < pl.x + pl.l; i++)
            if (c[pl.y][i] != '_' && c[pl.y][i] != word.charAt(i - pl.x))
                return null;
        for (int i = pl.x; i < pl.x + pl.l; i++)
            c[pl.y][i] = word.charAt(i - pl.x);
        return c;

    } else {
        for (int i = pl.y; i < pl.y + pl.l; i++)
            if (c[i][pl.x] != '_' && c[i][pl.x] != word.charAt(i - pl.y))
                return null;
        for (int i = pl.y; i < pl.y + pl.l; i++)
            c[i][pl.x] = word.charAt(i - pl.y);
        return c;
    }
}

Here is the full code on Rextester.


The problem is that my backtracking algorithm doesn't work well. Let's say this is my initial grid:

++++++
+____+
++++_+
++++_+
++++_+
++++++

And this is the list of words:

pain
nice

My algorithm will put the word pain vertically, but then when realizing that it was a wrong choice it will backtrack, but by that time the initial grid will be already changed and the number of placeholders will be reduced. How do you think the algorithm can be fixed?

Marcela answered 21/6, 2017 at 21:11 Comment(2)
Possible duplicate of Solving crosswordsBondswoman
Hahahahaha, That is what I call crossfire :DDDMarcela
I
4

This can be solved in 2 ways:

  • Create a deep copy of the matrix at the start of fill, modify and return that (leaving the original intact).

    Given that you already pass around the matrix, this wouldn't require any other changes.

    This is simple but fairly inefficient as it requires copying the matrix every time you try to fill in a word.

  • Create an unfill method, which reverts the changes made in fill, to be called at the end of each for loop iteration.

    for (String word : words.get(pl.l)) {
        if (fill(c, word, pl)) {
            ...
            unfill(c, word, pl);
        }
    }
    

    Note: I changed fill a bit as per my note below.

    Of course just trying to erase all letter may erase letters of other placed words. To fix this, we can keep a count of how many words each letter is a part of.

    More specifically, have a int[][] counts (which will also need to be passed around or be otherwise accessible) and whenever you update c[x][y], also increment counts[x][y]. To revert a placement, decrease the count of each letter in that placement by 1 and only remove letters with a count of 0.

    This is somewhat more complex, but much more efficient than the above approach.

    In terms of code, you might put something like this in fill:
    (in the first part, the second is similar)

    for (int i = pl.x; i < pl.x + pl.l; i++)
        counts[pl.y][i]++;
    

    And unfill would look something like this: (again for just the first part)

    for (int i = pl.x; i < pl.x + pl.l; i++)
        counts[pl.y][i]--;
    for (int i = pl.x; i < pl.x + pl.l; i++)
        if (counts[pl.y][i] == 0)
            c[pl.y][i] = '_';
    // can also just use a single loop with "if (--counts[pl.y][i] == 0)"
    

Note that, if going for the second approach above, it might make more sense to simply have fill return a boolean (true if successful) and just pass c down to the recursive call of solve. unfill can return void, since it can't fail, unless you have a bug.

There is only a single array that you're passing around in your code, all you're doing is changing its name.

See also Is Java "pass-by-reference" or "pass-by-value"?

Incurrence answered 21/6, 2017 at 22:40 Comment(7)
Do you suggest to put c = unfill(c, word, pl); at the end of for loop?Marcela
@Marcela It shouldn't matter either way. unfill doesn't even have to return anything. There's only 1 array, passing it to a function doesn't make a copy of it. The "best" approach would be to have fill return a boolean and unfill return void. See edit to answer.Incurrence
I did everything as you suggested, please look at rextester.com/NFDH24126 . You will find the input there. In the output the program leaves one placeholder unsolved.Marcela
Seems like placeholders.add(pl); should be added after unfill, but it doesn't help :/Marcela
@Marcela See fixed code here. You need to actually check the return value of fill, and push to placeholders at the end of the loop.Incurrence
@Marcela You code also seems to be missing a check to ensure the word and the placeholder are the same length in fill.Incurrence
Thanks for everything! Actually it's not possible, as I am only looping through the words of length equal to the length of placeholder. Now everything is good except time limit :D But I think it's doable if I change the stack to smth else and also look to the adjacent placeholders instead of trivially popping.Marcela
S
1

You identified it yourself:

it will backtrack, but by that time the initial grid will be already changed

That grid should be a local matrix, not a global one. That way, when you back up with a return of null, the grid from the parent call is still intact, ready to try the next word in the for loop.

Your termination logic is correct: when you find a solution, immediately pass that grid back up the stack.

Snigger answered 21/6, 2017 at 21:57 Comment(4)
Could you please add the code? Which matrix were you talking about? I think possibleC matrix is a local oneMarcela
I also believe it's local -- but let's work on debugging technique: what is the "initial grid" you said was already changed? Have you printed out your master grid at appropriate points to track data flow?Snigger
Please look at the last example, where there are only 2 placeholders. "Initial grid" is when there are no letters on the grid. Then my algorithm chooses a vertical placeholder, then wrongly fills it with the word pain, then recurses and there tries to fill the horizontal placeholder with nice and fails, then it backtracks and tries to fill the vertical placeholder (1st one), but now it cannot (because there's a bug) as the initial grid is spoiled with the word painMarcela
Right. Dukeling just gave you a good answer, just what I was trying to lead you toward.Snigger

© 2022 - 2024 — McMap. All rights reserved.