Problems with Alpha Beta for 2048
Asked Answered
B

1

6

I'm writing an AI for the game 2048 using Python. It's going a lot slower than I expected. I set the depth limit to just 5 and it still took several seconds to get an answer. At first I thought my implementations of all the functions were crap, but I figured out the real reason why. There are way more leaves on the search tree than there even possibly should be.

Here is a typical result (I counted the leaves, branches, and number of expansions):

111640 leaves, 543296 branches, 120936 expansions
Branching factor: 4.49242574585
Expected max leaves = 4.49242574585^5 = 1829.80385192 leaves

and another, for good measure:

99072 leaves, 488876 branches, 107292 expansions
Branching factor: 4.55650001864
Expected max leaves = 4.55650001864^5 = 1964.06963743 leaves

As you can see, there are way more leaves on the search tree than how many there would be if I used naive minimax. What is going on here? My algorithm is posted below:

# Generate constants
import sys
posInfinity = sys.float_info.max
negInfinity = -sys.float_info.max

# Returns the direction of the best move given current state and depth limit
def bestMove(grid, depthLimit):
    global limit
    limit = depthLimit
    moveValues = {}
    # Match each move to its minimax value
    for move in Utils2048.validMoves(grid):
        gridCopy = [row[:] for row in grid]
        Utils2048.slide(gridCopy, move)
        moveValues[move] = minValue(grid, negInfinity, posInfinity, 1)
    # Return move that have maximum value
    return max(moveValues, key = moveValues.get)

# Returns the maximum utility when the player moves
def maxValue(grid, a, b, depth):
    successors = Utils2048.maxSuccessors(grid)
    if len(successors) == 0 or limit < depth:
        return Evaluator.evaluate(grid)
    value = negInfinity
    for successor in successors:
        value = max(value, minValue(successor, a, b, depth + 1))
        if value >= b:
            return value
        a = max(a, value)
    return value
# Returns the minimum utility when the computer moves
def minValue(grid, a, b, depth):
    successors = Utils2048.minSuccessors(grid)
    if len(successors) == 0 or limit < depth:
        return Evaluator.evaluate(grid)
    value = posInfinity
    for successor in successors:
        value = min(value, maxValue(successor, a, b, depth + 1))
        if value <= a:
            return value
        b = min(b, value)
    return value

Someone please help me out. I looked over this code several times and I can't pin down what's wrong.

Bookworm answered 17/6, 2014 at 4:59 Comment(7)
you need to provide an example call to it ... but try profiling it by following the directions here vrplumber.com/programming/runsnakerun to see if you can identify the bottlenecksFete
Never mind the large number of leaves, how can the branching factor be larger than 4 if there are only 4 moves in the game of 2048?Presignify
Please show us how you calculate the leaves, branches and expansions and we may help you.Dihedral
Here is a paper that disccuss the game, maybe it could be useful cs.put.poznan.pl/mszubert/pub/szubert2014cig.pdfHogwash
The branching factor is higher than 4 because this is adversarial with one player doing swipes ( 4 branches ) and another player dropping new tiles which can have the values 2 or 4 in one of 4 slots ( 8 branches ). So, for 1 look-ahead (which includes calculating both parties), the branching factor is 32. The above code uses depth (half look-aheads) of 4, which is 32*32 = 1024, precisely.Vitrine
@occams-blade , What output do you get for simple known unit tests? For example, if you have a matrix of just 2 tiles each with a value of 4 on the bottom row does your algorithm produce the correct swipes for depth = 1 and depth = 2? does it have the correct branching factor? What if you log all the moves+counter moves for unit test? Is it actually calculating to the expected depth, or are you off-by-one because maybe limit < depth should be limit <= depth.Vitrine
Did you solve that problem in between? What do you think about posting a community wiki answer? - if you didn't solve it: Please make the code follow PEP8 and add docstrings which describe the expected input / output (e.g. numpydoc style)Arpent
G
0

Apparently, you are comparing the value with b(beta) and a(alpha). This comparison in your code is as follows:

def maxValue(grid, a, b, depth):
    .....
    .....
        if value >= b:
            return value
        a = max(a, value)
    return value

and

def minValue(grid, a, b, depth):
    .....
    .....
        if value <= a:
            return value
        b = min(b, value)
    return value

However, the condition for alpha-beta pruning is that anytime alpha grows more than beta, i.e., alpha > beta, we don't need to traverse the search tree.

Therefore, it should be:

def maxValue(grid, a, b, depth):
    ....
    .....
        a = max(a, value)
        if a > b:
            return value

    return value

and

def minValue(grid, a, b, depth):
    .....
    .....
        b = min(b, value)
        if b < a:
            return value

    return value

This is the edge case you were missing as a(alpha) and b(beta) is not always necessarily equal to value.

Griseofulvin answered 13/7, 2019 at 11:23 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.