How big are your boards?
If your boards are fairly small then you can solve the game exactly with dynamic programming. In Python:
n,m = 6,6
init = frozenset((x,y) for x in range(n) for y in range(m))
def moves(board):
return [frozenset([(x,y) for (x,y) in board if x < px or y < py]) for (px,py) in board]
@memoize
def wins(board):
if not board: return True
return any(not wins(move) for move in moves(board))
The function wins(board) computes whether the board is a winning position. The board representation is a set of tuples (x,y) indicating whether piece (x,y) is still on the board. The function moves computes list of boards reachable in one move.
The logic behind the wins function works like this. If the board is empty on our move the other player must have eaten the last piece, so we won. If the board is not empty then we can win if there is any
move we can do such that the resulting position is a losing position (i.e. not winning i.e. not wins(move)
), because then we got the other player into a losing position.
You'll also need the memoize helper function which caches the results:
def memoize(f):
cache = dict()
def memof(x):
try: return cache[x]
except:
cache[x] = f(x)
return cache[x]
return memof
By caching we only compute who is the winner for a given position once, even if this position is reachable in multiple ways. For example the position of a single row of chocolate can be obtained if the first player eats all remaining rows in his first move, but it can also be obtained through many other series of moves. It would be wasteful to compute who wins on the single row board again and again, so we cache the result. This improves the asymptotic performance from something like O((n*m)^(n+m))
to O((n+m)!/(n!m!))
, a vast improvement though still slow for large boards.
And here is a debugging printing function for convenience:
def show(board):
for x in range(n):
print '|' + ''.join('x ' if (x,y) in board else ' ' for y in range(m))
This code is still fairly slow because the code isn't optimized in any way (and this is Python...). If you write it in C or Java efficiently you can probably improve performance over 100 fold. You should easily be able to handle 10x10 boards, and you can probably handle up to 15x15 boards. You should also use a different board representation, for example a bit board. Perhaps you would even be able to speed it up 1000x if you make use of multiple processors.
Here is a derivation from minimax
We'll start with minimax:
def minimax(board, depth):
if depth > maxdepth: return heuristic(board)
else:
alpha = -1
for move in moves(board):
alpha = max(alpha, -minimax(move, depth-1))
return alpha
We can remove the depth checking to do a full search:
def minimax(board):
if game_ended(board): return heuristic(board)
else:
alpha = -1
for move in moves(board):
alpha = max(alpha, -minimax(move))
return alpha
Because the game ended, heuristic will return either -1 or 1, depending on which player won. If we represent -1 as false and 1 as true, then max(a,b)
becomes a or b
, and -a
becomes not a
:
def minimax(board):
if game_ended(board): return heuristic(board)
else:
alpha = False
for move in moves(board):
alpha = alpha or not minimax(move)
return alpha
You can see this is equivalent to:
def minimax(board):
if not board: return True
return any([not minimax(move) for move in moves(board)])
If we had instead started with minimax with alpha-beta pruning:
def alphabeta(board, alpha, beta):
if game_ended(board): return heuristic(board)
else:
for move in moves(board):
alpha = max(alpha, -alphabeta(move, -beta, -alpha))
if alpha >= beta: break
return alpha
// start the search:
alphabeta(initial_board, -1, 1)
The search starts out with alpha = -1 and beta = 1. As soon as alpha becomes 1, the loop breaks. So we can assume that alpha stays -1 and beta stays 1 in the recursive calls. So the code is equivalent to this:
def alphabeta(board, alpha, beta):
if game_ended(board): return heuristic(board)
else:
for move in moves(board):
alpha = max(alpha, -alphabeta(move, -1, 1))
if alpha == 1: break
return alpha
// start the search:
alphabeta(initial_board, -1, 1)
So we can simply remove the parameters, as they are always passed in as the same values:
def alphabeta(board):
if game_ended(board): return heuristic(board)
else:
alpha = -1
for move in moves(board):
alpha = max(alpha, -alphabeta(move))
if alpha == 1: break
return alpha
// start the search:
alphabeta(initial_board)
We can again do the switch from -1 and 1 to booleans:
def alphabeta(board):
if game_ended(board): return heuristic(board)
else:
alpha = False
for move in moves(board):
alpha = alpha or not alphabeta(move))
if alpha: break
return alpha
So you can see this is equivalent to using any with a generator which stops the iteration as soon as it has found a True value instead of always computing the whole list of children:
def alphabeta(board):
if not board: return True
return any(not alphabeta(move) for move in moves(board))
Note that here we have any(not alphabeta(move) for move in moves(board))
instead of any([not minimax(move) for move in moves(board)])
. This speeds up the search by about a factor of 10 for reasonably sized boards. Not because the first form is faster, but because it allows us to skip entire rest of the loop including the recursive calls as soon as we have found a value that's True.
So there you have it, the wins function was just alphabeta search in disguise. The next trick we used for wins is to memoize it. In game programming this would be called using "transposition tables". So the wins function is doing alphabeta search with transposition tables. Of course it's simpler to write down this algorithm directly instead of going through this derivation ;)