Sudoku Puzzle with boxes containing square numbers

Asked 18/3, 2020 at 2:50 Answered 5/11, 2020 at 3:31

Two days ago, I was given a sudoku problem that I tried to solve with Python 3. I've been informed that a solution does exist, but I'm not certain if there exists multiple solutions.

The problem is as following: A 9x9 grid of sudoku is completely empty. It does however contain colored boxes, and inside these boxes, the sum of the numbers has to be a square number. Other than that, normal sudoku rules apply.

The issue here is not solving a sudoku puzzle, but rather generating a viable puzzle, that satisfies the rules of the colored boxes.

My strategy

Using numpy arrays, I have divided the grid into 81 indices, which can be rearranged to a 9x9 grid.

import numpy as np
print(np.array([i for i in range(81)]).reshape((9, 9)))

->
[[ 0  1  2  3  4  5  6  7  8]
 [ 9 10 11 12 13 14 15 16 17]
 [18 19 20 21 22 23 24 25 26]
 [27 28 29 30 31 32 33 34 35]
 [36 37 38 39 40 41 42 43 44]
 [45 46 47 48 49 50 51 52 53]
 [54 55 56 57 58 59 60 61 62]
 [63 64 65 66 67 68 69 70 71]
 [72 73 74 75 76 77 78 79 80]]

Here is a list containing all the blocks of indices.

boxes = [[44, 43, 42, 53],[46, 47, 38],[61, 60],[69, 70],[71, 62],
         [0, 9, 18],[1, 10, 11, 20],[2, 3, 12],[4, 13, 14],[5, 6],
         [7, 8],[17, 26, 35],[21, 22, 23],[15, 16, 24, 25, 34],
         [27, 36, 37],[19, 28, 29],[45, 54],[55, 56],[63, 64, 65],
         [72, 73, 74],[57, 66, 75 ],[58, 59, 67, 68],[76, 77],[78, 79, 80]]

As you can see from the picture, or from the array above, the boxes are arranged into blocks of 2, 3, 4, or 5 (8 twos, 12 threes, 3 fours, 1 fiver). I've also noticed that a box can contain multiple numbers without breaking any rules of sudoku, but only 2 of one number is possible. Given that information, the biggest possible square would be 36, as 9+9+8+7+6 = 39, and thus no sum of a block could ever reach 49. To find out if the sum of a list contains a square number, I've made the following function:

def isSquare(array):
    if np.sum(array) in [i**2 for i in range(1,7)]:
        return True
    else:
        return False

To find out if a list contain the correct amount of duplicates, that is, more than one duplicate of only one number, I've made the following function:

def twice(array):
    counter = [0]*9
    for i in range(len(array)):
        counter[array[i]-1]+=1
        if 3 in counter:
            return False
    if counter.count(2)>1:
        return False
    return True

Now, given the digits 1-9, there are limited ways solutions to a list, if the list has to sum into a square number. Using itertools, I could find the solutions, dividing them into an array, where index 0 contains blocks of twos, index 1 contains blocks of threes, and so on.

from itertools combinations_with_replacement
solutions = []
for k in range(2, 6):
    solutions.append([list(i) for i in combinations_with_replacement(np.arange(1, 10), k) if 
    isSquare(i) and twice(i)])

However, any permutation of these lists are viable solutions to the "square problem". Using itertools again, the total amount of possible boxes (without the sudoku rules) sums to 8782.

from itertools import permutations

def find_squares():
    solutions = []
    for k in range(2, 6):
        solutions.append([list(i) for i in combinations_with_replacement(np.arange(1, 10), k) if 
            isSquare(i) and twice(i)])
    s = []
    for item in solutions:
        d=[]
        for arr in item:
            for k in permutations(arr):
                d.append(list(k))
        s.append(d)
    return s # 4-dimensional array, max 2 of each

solutions = find_squares()

total = sum([len(i) for i in solutions])
print(total)
-> 8782

This should be enough to implement functionality that decides if a board is legal, that is, rows, columns and boxes only contains one each of the digits 1-9. My implementation:

def legal_row(arr):
    for k in range(len(arr)):
        values = []
        for i in range(len(arr[k])):
            if (arr[k][i] != 0):
                if (arr[k][i] in values):
                    return False
                else:
                    values.append(arr[k][i])
    return True

def legal_column(arr):
    return legal_row(np.array(arr, dtype=int).T)


def legal_box(arr):
    return legal_row(arr.reshape(3,3,3,3).swapaxes(1,2).reshape(9,9))


def legal(arr):
    return (legal_row(arr) and legal_column(arr) and legal_box(arr))

Difficulties with runtime

A straightforward approach would be to check every single combination of every single block. I have dones this, and produced several viable problems, however the complexity of my algorithm makes this take far too long time.

Instead, I tried to randomize some of the properties: The order of the blocks and the order of the solutions. Using this, I limited the number of tries, and checked if a solution was viable:

attempts = 1000
correct = 0
possibleBoards = []
for i in range(1, attempts+1):
    board = np.zeros((9, 9), dtype=int)
    score = 0
    shapes = boxes
    np.random.shuffle(shapes)
    for block in shapes:
        new_board = board
        new_1d = board.reshape(81)
        all_sols = solutions[len(block)-2]
        np.random.shuffle(all_sols)
        for sols in all_sols:
            #print(len(sols))
            new_1d[block] = sols
            new_board = new_1d.reshape((9, 9))
            if legal(new_board):
                board = new_board
                score+=1
                break
    confirm = board.reshape(81)
    #solve(board) # Using my solve function, not important here
    # Note that without it, correct would always be 0 as the middle of the puzzle has no boxes
    confirm = board.reshape(81)
    if (i%1000==0 or i==1):
        print("Attempt",i)
    if 0 not in confirm:
        correct+=1
        print(correct)
        possibleBoards.append(board)

In the code above, the variable score refers to how many blocks the algorithm could find during an attempt. The variable correct refers to how many of the generated sudoku boards could be completed. If you are interested in how well it did in 700 attempts, here are some stats (This is a historgram, the x-axis represents the scores, and the y-axis represents how many of each score was present during these 700 attempts).

What I need help with

I am struggling to find a feasible way to find a solution to this problem, that can actually run in a finite amount of time. I would greatly appreciate any tips regarding making some of my code faster or better, any ideas of a different approach to the problem, any solutions to the problem, or some useful tips about Python/Numpy relevant to this problem.

Teacup answered 18/3, 2020 at 2:50 Comment(4)

isSquare(a): return math.sqrt(sum(a)) % 1.0 == 0 or so. Pretty sure that's way faster than recomputing [i**2 for i in range(1,7)] every time and doing a linear search in it. Also, in already returns a boolean. No need for the if. – Rewrite 18/3, 2020 at 3:10

Keep in mind that the solutions = find_squares() part takes less than a second, it is the last part, the implementation where I have to find out if the answer is correct that is slow. – Teacup 18/3, 2020 at 3:13

FYI (also for anyone else who reads this) you can watch an explanation of the puzzle here: youtube.com/watch?v=myGqOF6blPI and there's a link in the description to play online. It's a fantastic puzzle and that channel is great. I actually solved this puzzle yesterday so I was surprised to see this! – Brucebrucellosis 24/3, 2020 at 19:50

Wow that's great, I really enjoyed watching him solving this. Thanks for the tip! – Teacup 17/11, 2020 at 21:1

This is where I would use a SMT solver. They are a lot more powerful than people give credit for. If the best algorithm you can think of is essentially bruteforce, try a solver instead. Simply listing your constraints and running it gives your unique answer in a couple seconds:

The code used (and reference image for coordinates):

import z3

letters = "ABCDEFGHI"
numbers = "123456789"
boxes = """
A1 A2 A3
B1 B2 C2 C3
C1 D1 D2
E1 E2 F2
F1 G1
H1 I1
G2 H2 G3 H3 H4
I2 I3 I4
B3 B4 C4
D3 E3 F3
A4 A5 B5
C5 B6 C6
G5 H5 I5 I6
A6 A7
B7 C7
D7 D8 D9
E7 E8 F7 F8
G7 H7
I7 I8
A8 B8 C8
G8 H8
A9 B9 C9
E9 F9
G9 H9 I9
"""
positions = [letter + number
             for letter in letters
             for number in numbers]
S = {pos: z3.Int(pos) for pos in positions}

solver = z3.Solver()

# Every symbol must be a number from 1-9.
for symbol in S.values():
    solver.add(z3.Or([symbol == i for i in range(1, 10)]))

# Every row value must be unique.
for row in numbers:
    solver.add(z3.Distinct([S[col + row] for col in letters]))

# Every column value must be unique.
for col in letters:
    solver.add(z3.Distinct([S[col + row] for row in numbers]))

# Every block must contain every value.
for i in range(3):
    for j in range(3):
        solver.add(z3.Distinct([S[letters[m + i * 3] + numbers[n + j * 3]]
                                for m in range(3)
                                for n in range(3)]))

# Colored boxes.
for box in boxes.split("\n"):
    box = box.strip()
    if not box: continue
    boxsum = z3.Sum([S[pos] for pos in box.split()])
    solver.add(z3.Or([boxsum == 1, boxsum == 4, boxsum == 9,
                      boxsum == 16, boxsum == 25, boxsum == 36]))

# Print solutions.
while solver.check() == z3.sat:
    model = solver.model()
    for row in numbers:
        print("".join(model.evaluate(S[col+row]).as_string()
                    for col in letters))
    print()

    # Prevent next solution from being equivalent.
    solver.add(z3.Or([S[col+row] != model.evaluate(S[col+row])
                      for col in letters
                      for row in numbers]))

Mucro answered 18/3, 2020 at 4:17 Comment(3)

That looks promising! I would never have thought of that. Do you have any documentation for the installation of z3? Also, does this mean that there is only one single solution? – Teacup 18/3, 2020 at 4:24

@Teacup python -m pip install z3-solver should get Z3. After editing the code it now prints all satisfiable solutions. – Mucro 18/3, 2020 at 4:29

@Teacup After fixing a bug it now iterates over all satisfiable solutions. However, it only finds one, so the solution is unique. – Mucro 18/3, 2020 at 4:35

A list called Boxes is created with 9 elements, each being another list. These 9 lists correspond to each of the 9 boxes, and each of the lists contains tuples as the elements with the row and column indices for each square in that box. Explicitly entering the values in a similar way to the following would have had the same effect (but would have been a waste of time):

# The boxes list is created, with the row and column index of each square in each box
Boxes = [
    [(3*i+k+1, 3*j+l+1) for k in range(3) for l in range(3)]
    for i in range(3) for j in range(3) ]

Weatherman answered 5/11, 2020 at 3:31 Comment(1)

Hey! Your code snippet does not run. That's interesting, but if I had an explicit formula for creating the boxes, I would probably have the solution as well... – Teacup 17/11, 2020 at 21:0

Recommended topics

Hot tags