Java Minimax Alpha-Beta Pruning Recursion Return
I am trying to implement minimax with alpha-beta pruning for a checkers game in Java. My minimax algorithm works perfectly. My code runs with the alpha-beta code in place. Unfortunately, when I play 1000 games vs the standard minimax algorithm, the alpha-beta algorithm always comes out behind by 50 games or so.

Since alpha-beta pruning should not reduce the quality of the moves, only the time it takes to find them, something has to be wrong. However, I have taken out pen and paper, drawn hypothetical leaf-node values, and used my algorithm to check whether it calculates the correct best move, and there don't appear to be any logic errors. I used the tree from this video: Alpha-Beta Pruning to trace my algorithm. It logically should make all of the same choices, and therefore be a functioning implementation.

I have also put print statements into the code (removed here to reduce clutter), and it appears that values are being returned correctly and that pruning does happen. Despite my best efforts I have been unable to find where the logic error lies. This is my third attempt at implementing this, and all of them have had the same issue.

I can't post the full code here, it's much too long, so I have included the methods relevant to the error. I'm not certain, but I suspect the problem lies in the non-recursive move() method; I can't find a logical error in it, though, so I'd just be thrashing around in it, probably making things worse without rhyme or reason.

Is there a trick to recovering multiple integer values from recursive calls in a for loop? It works fine with both my minimax and negamax implementations, but alpha-beta pruning seems to produce some strange results.

@Override
public GameState move(GameState state) 
{
    int alpha = -INFINITY;
    int beta = INFINITY;
    int bestScore = -Integer.MAX_VALUE;
    GameTreeNode gameTreeRoot = new GameTreeNode(state);
    GameState bestMove = null;
    for(GameTreeNode child: gameTreeRoot.getChildren())
    {
        if(bestMove == null)
        {
            bestMove = child.getState();
        }
        alpha = Math.max(alpha, miniMax(child, plyDepth - 1, alpha, beta));
        if(alpha > bestScore)
        {
            bestMove = child.getState();
            bestScore = alpha;
        }
    }
    return bestMove;
}

private int miniMax(GameTreeNode currentNode, int depth, int alpha, int beta) 
{
    if(depth <= 0 || terminalNode(currentNode.getState())) 
    {
        return getHeuristic(currentNode.getState());
    }
    if(currentNode.getState().getCurrentPlayer().equals(selfColor))
    {
        for(GameTreeNode child: currentNode.getChildren())
        {
            alpha = Math.max(alpha, miniMax(child, depth - 1, alpha, beta));

            if(alpha >= beta)
            {
                return beta;
            }
        }
        return alpha;
    }
    else
    {
        for(GameTreeNode child: currentNode.getChildren())
        {
            beta = Math.min(beta, miniMax(child, depth - 1, alpha, beta));

            if(alpha >= beta)
            {
                return alpha;
            }
        }
        return beta;
    }
}
//Checks to see if the node is terminal
private boolean terminalNode(GameState state)
{
    if(state.getStatus().equals(win) || state.getStatus().equals(lose) || state.getStatus().equals(draw))
    {
        return true;
    }
    else
    {
        return false;
    }
}
Qintar answered 16/3, 2013 at 9:20 Comment(4)
Checkers has a standard starting position and both minimax and minimax with alpha-beta pruning are deterministic algorithms, so every game should play out identically unless you've introduced randomness somewhere. Perhaps this randomness is producing the divergence in outcomes.Tem
Minimax and minimax with alpha-beta pruning are by definition supposed to produce identical results; alpha-beta just gives you the result somewhat faster, with "somewhat" being determined by how good your move-ordering heuristic is. So the way to test your alpha-beta implementation is to run minimax with and without it over a large set of positions and verify that identical results are produced for both versions.Tem
@Kyle I realized it's actually because my minimax algorithm returns a random move from among the equal best moves and my alpha-beta pruning algorithm just returns the first best move considered (because of the way alpha is passed my implementation can't find equal moves). At the start a move to the side of the board scores the same at ply 3, but is actually worse, but it's the first one considered by alpha-beta pruning and therefore is returned. So picking a random move from among the best moves is better than just picking the first one in this case. Thanks for the help.Qintar
@sage88: If you have found the solution to this question, you could answer it by yourself, if you like.Deroo
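The tie-breaking fix sage88 describes in the comments above — picking at random among equal-best root moves instead of always taking the first one considered — can be sketched as follows. This is a minimal illustration, not the asker's actual code: the `scores` array stands in for `miniMax()` results on the root's children, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Minimal sketch of random tie-breaking among equal-best moves.
// scores[i] stands in for the minimax value of the i-th root child.
class TieBreak {
    static int pickBestIndex(int[] scores, Random rng) {
        int bestScore = Integer.MIN_VALUE;
        List<Integer> best = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            if (scores[i] > bestScore) {
                bestScore = scores[i]; // strictly better: reset the tie list
                best.clear();
                best.add(i);
            } else if (scores[i] == bestScore) {
                best.add(i);           // equal best: keep as a candidate
            }
        }
        // Choose uniformly among the tied best moves.
        return best.get(rng.nextInt(best.size()));
    }
}
```

Note that, as sage88 observed, a root loop that passes a shared alpha down cannot see ties at all (later equal moves fail low against alpha), so this kind of tie collection only works if each root child's true score is recovered.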
You already fixed your problem, but the problem you encountered is pretty common: whenever you build part of the algorithm for an AI agent, you have to test it properly. Once your minimax algorithm is correct, you can generate many random trees and check whether both versions return the same results. For example, in Python you can do it this way:

from random import randint

class Node():
    def __init__(self, data, children):
        self.data = data
        self.children = children

def generateTree(depth, branching):
    total = branching**depth
    values = [randint(-100, 100) for _ in range(total)]
    level = [Node(values[i], []) for i in range(total)]

    for _ in range(depth):
        total //= branching
        level = [Node(None, level[i * branching: (i+1) * branching]) for i in range(total)]

    return level[0], values

Now you can generate many random trees and compare the results.

tree, values = generateTree(depth, branching)
print(negamax(tree, depth, 1) == alpha_beta_negamax(tree, depth, float('-inf'), float('inf'), 1))

Do not forget that minimax and alpha-beta return just the best value, whereas what you are interested in during a real game is a move. It is straightforward to modify them to return a move as well, but it is up to the developer to decide how the move is chosen, because there can be many moves that lead to the best value (you can return the first one, the last one, or, most commonly, collect all the best moves and return a random one).

In your case the problem was the randomness of the returned moves, so a good approach during testing is to fix the randomness.
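In the asker's Java setting, fixing the randomness just means giving `java.util.Random` a fixed seed during tests. A minimal, hypothetical sketch (the class and method names are illustrative, not from the original program):

```java
import java.util.Random;

// Sketch: a seeded Random makes "random" tie-breaking reproducible,
// so plain minimax and alpha-beta can be compared move for move.
class SeededDraws {
    static int[] draw(long seed, int n) {
        Random rng = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = rng.nextInt(100); // same seed => same sequence
        }
        return out;
    }
}
```

Two runs with the same seed produce identical sequences, so any divergence between the two search implementations is then attributable to the search itself.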

Diarmid answered 16/10, 2015 at 2:19 Comment(0)
I noticed you said you found the problem, but shouldn't the minimax alpha-beta pruning be:

if it is MAX's turn to move
  for child in children
     result = alphaBetaMinimax(child, alpha, beta)
     if result > alpha
        alpha = result
        if node is root
           bestMove = operator of child
     if alpha >= beta
        return alpha
  return alpha

if it is MIN's turn to move
  for child in children
     result = alphaBetaMinimax(child, alpha, beta)
     if result < beta
        beta = result
        if node is root
           bestMove = operator of child
     if beta <= alpha
        return beta
  return beta

you wrote:

  if alpha >= beta
    return beta
return alpha
Ammadas answered 2/9, 2013 at 13:27 Comment(2)
No, you return beta there because it's the cutoff. If alpha exceeds it then you don't want to consider it because the other player would never let you make that move. See the wiki article on alpha beta pruning for more information on this en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning . And I know this is correct code as it's been run against 40 or so other minimax-esque algorithms and placed 2nd overall.Qintar
Nevertheless it's incorrect to return alpha from a min node. A min node always returns its final beta for consideration by its parent max node as a new alpha.Longer
On March 16, 2013, sage88 asked:

Is there a trick to recovering multiple integer values from recursive calls in a for loop? It works fine with both my minimax and negamax implementations, but alpha-beta pruning seems to produce some strange results.

In alpha beta pruning, the only output value of interest is a node's score: the final value of beta in a min node is considered for the alpha value of its parent max node; likewise, the final value of alpha in a max node is considered for the beta value of its parent min node. Therefore:

The answer to your question is the algorithm itself, as it's the most relevant trick.

That said, there are two errors in your implementation: 1) As Adrian Blackburn originally pointed out, it's incorrectly returning alpha from a min node and vice-versa, thereby skewing its accuracy; 2) It's giving up pruning opportunities by prematurely considering the parent alpha or beta in the current node's value. This version fixes the return values and maximizes pruning:

private int miniMax(GameTreeNode currentNode, int depth, int alpha, int beta) {
    if (depth <= 0 || terminalNode(currentNode.getState())) {
        return getHeuristic(currentNode.getState());
    }
    if (currentNode.getState().getCurrentPlayer().equals(selfColor)) {
        int currentAlpha = -INFINITY;
        for (GameTreeNode child : currentNode.getChildren()) {
            currentAlpha = Math.max(currentAlpha, miniMax(child, depth - 1, alpha, beta));
            alpha = Math.max(alpha, currentAlpha);
            if (alpha >= beta) {
                return alpha;
            }
        }
        return currentAlpha;
    }
    int currentBeta = INFINITY;
    for (GameTreeNode child : currentNode.getChildren()) {
        currentBeta = Math.min(currentBeta, miniMax(child, depth - 1, alpha, beta));
        beta = Math.min(beta, currentBeta);
        if (beta <= alpha) {
            return beta;
        }
    }
    return currentBeta;
}

Thanks for contributing a fun and interesting question :)

For more fun, here's a clarification of your move() method, removing a redundant call to Math.max():

@Override
public GameState move(GameState state) {
    GameState bestMove = null;
    int bestScore = -INFINITY;
    GameTreeNode gameTreeRoot = new GameTreeNode(state);
    for (GameTreeNode child : gameTreeRoot.getChildren()) {
        int alpha = miniMax(child, plyDepth - 1, bestScore, INFINITY);
        if (alpha > bestScore || bestMove == null) {
            bestMove = child.getState();
            bestScore = alpha;
        }
    }
    return bestMove;
}

Finally (even more fun), just a suggestion: a method-name change to clarify the intent of terminalNode(). I would also move this into GameState so it could be called with no parameters:

private boolean isTerminal(GameState state) {
    //return Is.any(state.getStatus(), win, lose, draw);
    return state.getStatus().equals(win)
        || state.getStatus().equals(lose)
        || state.getStatus().equals(draw);
}
Longer answered 12/12, 2014 at 0:34 Comment(2)
Hey thanks for posting this. This is a really old project, I'll have to dig it up and take a look.Qintar
Sure thing, it was fun. I wanted to see if I could provide an acceptable answer to your question after all this time :)Longer
To just answer your question

Is there a trick to recovering multiple integer values from recursive calls in a for loop?

Yes, in Java you would need to pass an object into the recursive function call, then modify the contents of that object. After the function returns you will be able to access the modified values.

Eg.

class ToBeReturned {
    int returnValue1;
    int returnValue2;
    int returnValue3;
}
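A brief usage sketch of that pattern — the holder class and the recursive walk below are hypothetical examples, not the asker's code:

```java
// Sketch: pass a mutable holder into the recursion; after the call
// returns, the caller reads several ints back out of it.
class SearchStats {
    int bestScore = Integer.MIN_VALUE;
    int nodesVisited = 0;
}

class HolderDemo {
    // Hypothetical recursive walk over leaf values, filling the holder.
    static void walk(int[] leaves, int index, SearchStats out) {
        if (index == leaves.length) {
            return;
        }
        out.nodesVisited++;
        out.bestScore = Math.max(out.bestScore, leaves[index]);
        walk(leaves, index + 1, out);
    }
}
```

Because Java passes object references by value, every level of the recursion mutates the same holder, so the caller sees all accumulated fields after the outermost call returns.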
Loathsome answered 27/5, 2014 at 12:58 Comment(0)
To achieve the best pruning results you should implement some kind of move ordering. In chess it is usually captures or checks; those kinds of moves tend to change the evaluation the most, and so they have the greatest impact on pruning. In checkers it might be capturing the opponent's pieces or promoting your own pieces on the eighth rank (sorry, I do not know the terms used).
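A minimal sketch of what move ordering could look like in the asker's Java setting — here the "moves" are reduced to cheap heuristic scores (an illustrative assumption; a real implementation would sort `GameTreeNode` children by such a score before the alpha-beta loop):

```java
import java.util.Arrays;
import java.util.Comparator;

// Sketch: sort candidate moves so the most forcing ones (captures,
// promotions) are searched first, tightening alpha/beta early.
class MoveOrdering {
    // heuristicScores[i] is a cheap static score for the i-th move;
    // higher means more promising.
    static int[] ordered(int[] heuristicScores) {
        Integer[] boxed = Arrays.stream(heuristicScores).boxed().toArray(Integer[]::new);
        Arrays.sort(boxed, Comparator.reverseOrder()); // best first
        return Arrays.stream(boxed).mapToInt(Integer::intValue).toArray();
    }
}
```

Searching the highest-scoring moves first raises alpha (or lowers beta) sooner, so later siblings are cut off earlier and the tree shrinks.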

Fernand answered 8/12, 2014 at 12:50 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.