Minimax algorithm with TicTacToe not working properly
Asked Answered
A

1

1

I already posted a similar question here in this forum but since the old post got a little long and I rewrote my algorithm I'm starting this new post. The old post can be found here.


So I'm simply trying to implement a minimax algorithm for my TicTacToe game, except it turned out to be quite hard and even after days of trying to find the mistake, I cannot find it. you can find my code below. First, I have a few definitions, typedefs and declarations:

typedef signed char s8;
typedef unsigned char u8;
typedef s8 score;

#define STATE_00    getBoardState(0, 0)
#define STATE_01    getBoardState(0, 1)
#define STATE_02    getBoardState(0, 2)
#define STATE_10    getBoardState(1, 0)
#define STATE_11    getBoardState(1, 1)
#define STATE_12    getBoardState(1, 2)
#define STATE_20    getBoardState(2, 0)
#define STATE_21    getBoardState(2, 1)
#define STATE_22    getBoardState(2, 2)

typedef enum {
    EPlayerX = 1,
    EPlayerO
} EPlayer;

typedef enum {
    EMinimizing = 0,
    EMaximizing
} EMinMax;

static u8 g_boardState[3][3] = {
    {0, 0, 0,},
    {0, 0, 0,},
    {0, 0, 0,},
};

These are followed by some functions:

u8 getBoardState(u8 row, u8 column);

EPlayer isWon(void)
{
    EPlayer winningBoards[8][3] = {
        {STATE_00, STATE_01, STATE_02},
        {STATE_10, STATE_11, STATE_12},
        {STATE_20, STATE_21, STATE_22},
        {STATE_00, STATE_10, STATE_20},
        {STATE_01, STATE_11, STATE_21},
        {STATE_02, STATE_12, STATE_22},
        {STATE_00, STATE_11, STATE_22},
        {STATE_20, STATE_11, STATE_02},
    };
    u8 i;
    for(i=0; i<8; i++){
        if( (winningBoards[i][0] != 0) &&
            (winningBoards[i][0] == winningBoards[i][1]) &&
            (winningBoards[i][0] == winningBoards[i][2])){
                return winningBoards[i][0];
        }
    }
    return 0;
}

u8 getBoardState(u8 row, u8 column)
{
    return g_boardState[row][column];
}

void setBoardState(u8 row, u8 column, u8 state)
{
    g_boardState[row][column] = state;
}

u8 isDraw(void)
{
    u8 i, j;
    for(i=0; i<3; i++){
        for(j=0; j<3; j++){
            if(getBoardState(i, j) == 0){
                return 0;
            }
        }
    }
    return 1;
}

void dumpTable(score table[3][3])
{
    int i, j;
    for(i=0; i<3; i++) {
        printf("\n");
        for(j=0; j<3; j++){
            printf("%6i ", table[i][j]);
        }
    }
    printf("\n");
}

EPlayer playerSwitch(EPlayer player)
{
    if(player == EPlayerO) return EPlayerX;
    if(player == EPlayerX) return EPlayerO;
    return 0;
}

EMinMax modeSwitch(EMinMax mode)
{
    if(mode == EMaximizing) return EMinimizing;
    if(mode == EMinimizing) return EMaximizing;
    return 0;
}

Then there is the actual minimax algorithm here called scoring:

score scoring(EMinMax mode, EPlayer player, u8 depth)
{
    score thisScore, tempScore;
    if(mode == EMaximizing){
        thisScore = -20;
        if(isWon()) return 15-depth;
    }
    if(mode == EMinimizing){
        thisScore = 20;
        if(isWon()) return depth-15;
    }
    if(isDraw()){
        return 0;
    }

    u8 i, j;
    for(i=0; i<3; i++){
        for(j=0; j<3; j++){
            if(getBoardState(i, j) == 0){
                setBoardState(i, j, player);
                tempScore = scoring(modeSwitch(mode), playerSwitch(player), depth+1);
                if((mode == EMaximizing) && (tempScore > thisScore)){
                    thisScore = tempScore;
                }
                if((mode == EMinimizing) && (tempScore < thisScore)){
                    thisScore = tempScore;
                }
                setBoardState(i, j, 0);
            }
        }
    }

    return thisScore;
}

And finally a function printing the scores in a table as well as the main:

void printSocredBoards(EPlayer player)
{   
    score thisScore[3][3] = {
        {STATE_00, STATE_01, STATE_02},
        {STATE_10, STATE_11, STATE_12},
        {STATE_20, STATE_21, STATE_22},
    };
    int i, j;
    if((isWon() == 0) && (isDraw() == 0)){
        for(i=0; i<3; i++){
            for(j=0; j<3; j++){
                if(getBoardState(i, j) == 0){
                    setBoardState(i, j, player);
                    thisScore[i][j] = scoring(EMaximizing, playerSwitch(player), 0);
                    setBoardState(i, j, 0);
                }
            }
        }
    }
    dumpTable(thisScore);
}

int main(int argc, char **argv)
{

    printSocredBoards(EPlayerO);

    return 0;
}

As far as I know, this algorithm should work fine, however it gives me a nonsensical output:

 7      7      7 
 7      0      7 
 7      7      7 

What am I missing? Thanks in advance for any help.

Augie answered 27/7, 2016 at 15:18 Comment(8)
what is the definition of getBoardState?Deserved
@Deserved it says that a few lines below.Augie
What were you expecting?Vastitude
@Vastitude I expected that the corners should have a higher score. But it's not just with the initial board. If I put some moves in the g_boardState, it's not better. Oftentimes, it does not recommend me to block for example.Augie
Why do you call playerSwitch in the function printSocredBoards ?Vastitude
I would expect something like if(isWon() == player) rather than if(isWon()) isWon() returns non-zero (true) if somebody won.Doublecross
@Vastitude because with the printScoredBoards I set every position on the board to EPlayerO one after the other. Thus I have to then give my scoring function the opposite player since it is no longer O's turn.Augie
@chux player couldn't have the same value as isWon() though because the player previously has been switched. It is also easier that way since scoring is not dependent of what player started.Augie
C
1

I believe that the problem lies with this chunk of code from scoring where your cases are flipped from the correct return values:

if(mode == EMaximizing){
    thisScore = -20;
    if(isWon()) return 15-depth;
}
if(mode == EMinimizing){
    thisScore = 20;
    if(isWon()) return depth-15;
}

Intuitively, the problem is that when scoring reaches this point in the code, the call to isWon is evaluating the result of the previous piece placement, which was made with the other choice for mode.

For example, when scoring is called with EMaximizing and the board state is already won, then that implies that the player who is EMinimizing wins in this state and the returned score should reflect this (i.e. it should be negative). Since depth reaches a maximum of 8 your return value when mode == EMaximizing is always positive, which is not what you want.

When the cases are reversed, your program outputs all zeroes, which seems more sensible as perfect players should always draw. I also tested the code with the following line added to the top of printScoredBoards to hard-code the first play to the top-left corner:

setBoardState(0, 0, playerSwitch(player));

This yields the following:

 0     10     10 
10      0     10 
10     10     10 

Correctly identifying both that the second player cannot choose the top-left and will lose if they play anything other than the center as their opening move.

Circularize answered 28/7, 2016 at 0:23 Comment(4)
Shouldn't the score of the state you are describing be: X -10 -10 -10 0 -10 -10 -10 -10Augie
Also, if I input the following state: O 0 0 0 X 0 0 0 0, shoudn't I get a score that suggests to put the next O in the bottom-right corner rather one that outputs all 0s?Augie
Or was I just misunderstanding the purpose of this algorithm? Is it actually describing the outcome of a game if a player puts this mark on a position and both play perfectly from then on?Augie
For each possible play, you are computing the end result from optimal play (by both players). In the scenario you described above where O played first in upper left, then X played in center, there is no choice for O that subsequently leads to anything other than a draw (assuming optimal play), no?Circularize

© 2022 - 2024 — McMap. All rights reserved.