Two days ago, I was given a sudoku problem that I tried to solve with Python 3. I've been informed that a solution does exist, but I'm not certain if there exists multiple solutions.
The problem is as following: A 9x9 grid of sudoku is completely empty. It does however contain colored boxes, and inside these boxes, the sum of the numbers has to be a square number. Other than that, normal sudoku rules apply.
The issue here is not solving a sudoku puzzle, but rather generating a viable puzzle, that satisfies the rules of the colored boxes.
My strategy
Using numpy arrays, I have divided the grid into 81 indices, which can be rearranged to a 9x9 grid.
import numpy as np
print(np.array([i for i in range(81)]).reshape((9, 9)))
->
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
Here is a list containing all the blocks of indices.
boxes = [[44, 43, 42, 53],[46, 47, 38],[61, 60],[69, 70],[71, 62],
[0, 9, 18],[1, 10, 11, 20],[2, 3, 12],[4, 13, 14],[5, 6],
[7, 8],[17, 26, 35],[21, 22, 23],[15, 16, 24, 25, 34],
[27, 36, 37],[19, 28, 29],[45, 54],[55, 56],[63, 64, 65],
[72, 73, 74],[57, 66, 75 ],[58, 59, 67, 68],[76, 77],[78, 79, 80]]
As you can see from the picture, or from the array above, the boxes are arranged into blocks of 2, 3, 4, or 5 (8 twos, 12 threes, 3 fours, 1 fiver). I've also noticed that a box can contain multiple numbers without breaking any rules of sudoku, but only 2 of one number is possible. Given that information, the biggest possible square would be 36, as 9+9+8+7+6 = 39, and thus no sum of a block could ever reach 49. To find out if the sum of a list contains a square number, I've made the following function:
def isSquare(array):
if np.sum(array) in [i**2 for i in range(1,7)]:
return True
else:
return False
To find out if a list contain the correct amount of duplicates, that is, more than one duplicate of only one number, I've made the following function:
def twice(array):
counter = [0]*9
for i in range(len(array)):
counter[array[i]-1]+=1
if 3 in counter:
return False
if counter.count(2)>1:
return False
return True
Now, given the digits 1-9, there are limited ways solutions to a list, if the list has to sum into a square number. Using itertools, I could find the solutions, dividing them into an array, where index 0 contains blocks of twos, index 1 contains blocks of threes, and so on.
from itertools combinations_with_replacement
solutions = []
for k in range(2, 6):
solutions.append([list(i) for i in combinations_with_replacement(np.arange(1, 10), k) if
isSquare(i) and twice(i)])
However, any permutation of these lists are viable solutions to the "square problem". Using itertools again, the total amount of possible boxes (without the sudoku rules) sums to 8782.
from itertools import permutations
def find_squares():
solutions = []
for k in range(2, 6):
solutions.append([list(i) for i in combinations_with_replacement(np.arange(1, 10), k) if
isSquare(i) and twice(i)])
s = []
for item in solutions:
d=[]
for arr in item:
for k in permutations(arr):
d.append(list(k))
s.append(d)
return s # 4-dimensional array, max 2 of each
solutions = find_squares()
total = sum([len(i) for i in solutions])
print(total)
-> 8782
This should be enough to implement functionality that decides if a board is legal, that is, rows, columns and boxes only contains one each of the digits 1-9. My implementation:
def legal_row(arr):
for k in range(len(arr)):
values = []
for i in range(len(arr[k])):
if (arr[k][i] != 0):
if (arr[k][i] in values):
return False
else:
values.append(arr[k][i])
return True
def legal_column(arr):
return legal_row(np.array(arr, dtype=int).T)
def legal_box(arr):
return legal_row(arr.reshape(3,3,3,3).swapaxes(1,2).reshape(9,9))
def legal(arr):
return (legal_row(arr) and legal_column(arr) and legal_box(arr))
Difficulties with runtime
A straightforward approach would be to check every single combination of every single block. I have dones this, and produced several viable problems, however the complexity of my algorithm makes this take far too long time.
Instead, I tried to randomize some of the properties: The order of the blocks and the order of the solutions. Using this, I limited the number of tries, and checked if a solution was viable:
attempts = 1000
correct = 0
possibleBoards = []
for i in range(1, attempts+1):
board = np.zeros((9, 9), dtype=int)
score = 0
shapes = boxes
np.random.shuffle(shapes)
for block in shapes:
new_board = board
new_1d = board.reshape(81)
all_sols = solutions[len(block)-2]
np.random.shuffle(all_sols)
for sols in all_sols:
#print(len(sols))
new_1d[block] = sols
new_board = new_1d.reshape((9, 9))
if legal(new_board):
board = new_board
score+=1
break
confirm = board.reshape(81)
#solve(board) # Using my solve function, not important here
# Note that without it, correct would always be 0 as the middle of the puzzle has no boxes
confirm = board.reshape(81)
if (i%1000==0 or i==1):
print("Attempt",i)
if 0 not in confirm:
correct+=1
print(correct)
possibleBoards.append(board)
In the code above, the variable score refers to how many blocks the algorithm could find during an attempt. The variable correct refers to how many of the generated sudoku boards could be completed. If you are interested in how well it did in 700 attempts, here are some stats (This is a historgram, the x-axis represents the scores, and the y-axis represents how many of each score was present during these 700 attempts).
What I need help with
I am struggling to find a feasible way to find a solution to this problem, that can actually run in a finite amount of time. I would greatly appreciate any tips regarding making some of my code faster or better, any ideas of a different approach to the problem, any solutions to the problem, or some useful tips about Python/Numpy relevant to this problem.
isSquare(a): return math.sqrt(sum(a)) % 1.0 == 0
or so. Pretty sure that's way faster than recomputing[i**2 for i in range(1,7)]
every time and doing a linear search in it. Also,in
already returns a boolean. No need for theif
. – Rewrite