Return bestMove in minimax algorithm for tictactoe
Asked Answered
M

3

6

I have tried to code the minimax algorithm for tic-tac-toe given in Russel Norvig's book on Artificial Intelligence. It had everything except that the way to return the bestMove to the user. I am trying hard to return the bestMove, but cannot decide when to choose the bestMove. Help, anyone?

moveT MiniMax(stateT state)
{
    moveT bestMove;

    max_move(state,bestMove);

    return bestMove;

}

int max_move(stateT state,int & bestMove)
{
    int v = -10000;
    if(GameIsOver(state))
    {
        return EvaluateStaticPosition(state);

    }

    vector<moveT> moveList;
    GenerateMoveList(state, moveList);
    int nMoves = moveList.size();

    for(int i = 0 ; i < nMoves ; i++)
    {
        moveT move = moveList[i];
        MakeMove(state, move);

        int curValue = min_move(state,bestMove);

            if(curValue > v)
            {
              v = curValue;
              bestMove = move;
            }
        RetractMove(state, move);

    }

    return v;

}

int min_move(stateT state, int &bestMove)
{
    int v = 10000;
    if(GameIsOver(state))
    {
      return EvaluateStaticPosition(state);

    }
    vector<moveT> moveList;
    GenerateMoveList(state, moveList);

    int nMoves = moveList.size();

    for(int i = 0 ; i < nMoves; i++)
    {
        moveT move = moveList[i];
        MakeMove(state, move);

        int curValue = max_move(state,depth+1,bestMove);

            if(curValue < v)
            {
              curValue = v;
            }
        RetractMove(state, move);

    }
    return v;
}

P.S.: There are other pseudocode for finding the minmax value. However, they are focused on tic-tac-toe only, I am trying to extend it to other games. Thanks.

Update : The whole code can be found here : http://ideone.com/XPswCl

Ml answered 23/11, 2012 at 4:20 Comment(5)
Is the code you posted above up-to-date? Because it doesn't look like it should compile. In min_move you call max_move with three arguments, but max_move can only take two arguments.Chambray
@Kevin: Oops , it is now updated . I tried to limit the depth at some point .Ml
Thanks for updating, but the incorrect line is still there: int curValue = max_move(state,depth+1,bestMove); This is causing me to worry; it makes me suspect that the code you're posting is not the code you're compiling. That makes it doubly challenging for potential answerers to find the problem. We'll identify bugs in the posted code that aren't there in the real code, and we won't be able to find bugs in the real code if they aren't in the posted code.Chambray
See the update , please , sorry for the confusion ; the whole code is here : ideone.com/XPswClMl
Thanks for posting the whole code. For the benefit of other answerers, here is an example of the computer playing imperfectly: Choose moves 5 and then 7. The computer should put its second piece in the top right slot to block your diagonal win, but it instead chooses the top left slot.Chambray
C
7

In the simplest version of minimax, the first player wishes to maximize his score, and the second player wishes to minimize the first player's score. Since both first and second player only care about the first player's score, EvaluateStaticPosition should return a value indicating how good the board state is for the first player. Whose turn it is is not relevant.

int EvaluateStaticPosition(stateT state)
{
        if(CheckForWin(state, FIRST_PLAYER))
        {
                return WINNING_POSITION;
        } 
        if(CheckForWin(state, Opponent(FIRST_PLAYER)))
        {
                return LOSING_POSITION;
        } 
        return NEUTRAL_POSITION;
}

Now, when you want the move that's best for the first player, call MaxMove. When you want the move that's best for the second player, call MinMove.

moveT MiniMax(stateT state)
{
    moveT bestMove;
    int i = 0;
    if (state.whoseTurn == FIRST_PLAYER){
        i = MaxMove(state, bestMove);
    }
    else{
        i = MinMove(state,bestMove);
    }
    cout<<"i is "<<i<<endl;
    return bestMove;
}

Finally, you have some problems inside of MinMove and MaxMove. when you assign curRating in either one, you shouldn't pass in bestMove as the second argument to MaxMove or MinMove. It will then put the opponent's best move into bestMove, which doesn't make sense. Instead, declare an opponentsBestMove object and pass that as the second argument. (You won't actually be using the object or even looking at its value afterwards, but that's ok). With that change, you never assign anything to bestMove within MinMove, so you should do so inside the if(curRating < v) block.

int MaxMove(stateT state, moveT &bestMove)
{
        if(GameIsOver(state))
        {
            return EvaluateStaticPosition(state);
        }
        vector<moveT> moveList;
        GenerateMoveList(state, moveList);
        int nMoves = moveList.size();
        int v = -1000;
        for(int i = 0 ;i<nMoves; i++)
        {
                moveT move = moveList[i];
                MakeMove(state, move);
                moveT opponentsBestMove;
                int curRating = MinMove(state, opponentsBestMove);
                if (curRating > v)
                {
                        v = curRating;
                        bestMove = move;
                }
                RetractMove(state, move);
        }
        return v;

}
int MinMove(stateT state,  moveT &bestMove)
{
        if(GameIsOver(state))
        {
                return EvaluateStaticPosition(state);
        }
        vector<moveT>moveList;
        GenerateMoveList(state, moveList);
        int nMoves = moveList.size();
        int v = 1000;
        for(int i = 0 ; i<nMoves; i++)
        {
                moveT move = moveList[i];
                MakeMove(state , move);
                moveT opponentsBestMove;
                int curRating = MaxMove(state,opponentsBestMove);
                if(curRating < v)
                {
                        v = curRating;
                        bestMove = move;
                }
                RetractMove(state, move);
        }
        return v;
}

At this point you should have an unbeatable AI!

The final position looks like this:

 O | O | X
---+---+---
 X | X | O
---+---+---
 O | X | X

Cat's game.

An alternative method takes advantage of the fact that tic-tac-toe is a zero-sum game. In other words, at the end of the game, the sum of the scores of the players will equal zero. For a two player game, this means that one player's score will always be the negative of the other player's. This is convenient for us, since minimizing the other player's score is then identical to maximizing one's own score. So instead of one player maximizing his score and one player minimizing the other player's score, we can just have both players attempt to maximize their own score.

Change EvaluateStaticPosition back to its original form, so that it gives a score based on how good the board state is for the current player.

int EvaluateStaticPosition(stateT state)
{
        if(CheckForWin(state, state.whoseTurn))
        {
                return WINNING_POSITION;
        }
        if(CheckForWin(state, Opponent(state.whoseTurn)))
        {
                return LOSING_POSITION;
        }
        return NEUTRAL_POSITION;
}

Delete MinMove, since we only care about maximizing. Rewrite MaxMove so that it chooses the move that gives the opponent the worst possible score. The score for the best move is the negative of the other player's worst score.

int MaxMove(stateT state, moveT &bestMove)
{
        if(GameIsOver(state))
        {
                return EvaluateStaticPosition(state);
        }
        vector<moveT> moveList;
        GenerateMoveList(state, moveList);
        int nMoves = moveList.size();
        int v = -1000;
        for(int i = 0 ;i<nMoves; i++)
        {
                moveT move = moveList[i];
                MakeMove(state, move);
                moveT opponentsBestMove;
                int curRating = -MaxMove(state, opponentsBestMove);
                if (curRating > v)
                {
                        v = curRating;
                        bestMove = move;
                }
                RetractMove(state, move);
        }
        return v;

}

Since MaxMove is used for both players, we no longer need to distinguish among players in the MiniMax function.

moveT MiniMax(stateT state)
{
    moveT bestMove;
    int i = 0;
    i = MaxMove(state, bestMove);
    cout<<"i is "<<i<<endl;
    return bestMove;
}
Chambray answered 28/12, 2012 at 14:34 Comment(3)
If I am not mistaken , did you put bestMove = move out in the MinMove intentionally when curRating < v .Ml
Yes, I did intentionally put bestMove = move in the MinMove function, inside the curRating < v block. That is the logical place for it; if a move's rating is the lowest found so far, then that move is the best move. At least until another even lower rating is found.Chambray
Man that works , I wish I could give you a hug ; wherever you are ,thanks . Let me test further , wait .Ml
B
4

Well, it looks like MiniMax correctly chooses it for you, just call it with an initial state and a depth. (Unless the first player according to the state is the second player, then you should call min_move in MiniMax.)

EDIT: yes, I overlooked something, bestMove currently does not make much sense. In the program within max_move you change the loop like this:

for(int i = 0 ; i < nMoves ; i++)
{
    moveT move = moveList[i];
    MakeMove(state, move);

    int new_value = min_move(state, depth+1);
    if(new_value > v)
    {
      v=new_value;
    }
    RetractMove(state, move);

}

After that you can think about you what bestMove means? My idea is that you are interested in finding one of the "best possible" series of moves for tic-tac-toe. For that you need a vector or even better a stack. But that also means having std::stack<int>* best_moves as the last parameter.

For the stack implementation, in min_move you return the next moves and if their value is the best, you will push your move on the top of the best_moves stack. Of course at the end of the game you just return the empty stack. It takes an OOP approach to pull it off properly, I'll do it when I have some time.

If all you need is merely the best next move then I suggest you change the return types of min_move and max_moe to some struct like this:

struct Value_move{
  int value;
  moveT best_move;
};

Then the new implementation of max_move looks like the following:

const int MOVE_INVALID = -12345;
const int MOVE_NOTHING = -12346;

Value_move max_move(stateT state, int depth)
{
    Value_move best;
    best.value = -10000; best.best_move = MOVE_INVALID;

    if(GameIsOver(state))
    {
        best.value = EvaluateStaticPosition(state);
        best.best_move = MOVE_NOTHING;
        return best;
    }

    vector<moveT> moveList;
    GenerateMoveList(state, moveList);
    int nMoves = moveList.size();

    for(int i = 0 ; i < nMoves ; i++)
    {
        moveT move = moveList[i];
        MakeMove(state, move);
        Value_move curr = min_move(state, depth+1);
        if(curr.value > best.value)
        {
            best.value = curr.value;
            best.best_move = move;
        }
        RetractMove(state, move);

    }

    return v;

}

You only need to pick up the best_move field in the returned struct in the MiniMax function.

REMARK:
You have to admit though this does not resemble a c++ program in many aspects but rather a c program. Otherwise, all the functions in CapitalCamelCase should be class methods, you should pass states by (const) ref instead of value -- but this whole code makes sense only if the status is really a pointer behind a typedef.

Beverlybevers answered 23/11, 2012 at 5:46 Comment(4)
true, corrected, sorry, +1. I hope we agree on bad practice for a c++ program here, though. This code is somewhere in-between. It also started to use references at some places. :)Beverlybevers
@BarnabasSzabolcs I am concerned about curValue > v , is it rightly placed in the for loop . curValue is uninitialized ,it is not possible for it to be greater than v , the maximum value obtained so far , say +10 . How, can I change the code so that the 'curValue' represents v = max(v,min_move(state, depth+1,bestMove)) and also a way for to store and compare the best value obtained so far with curValue . I'm little bit fuzzy here .Ml
I tried now to tidy up the language and make it more to the point, I am interested in your feedback, please let me know if something is not clear, maybe I can improve and explain it better.Beverlybevers
Thanks @BarnabasSzabolcs : my problem is solved , with a little fuss only .Ml
W
0

Your code finds the correct value but then overwrites it by passing the same reference down.

int curValue = min_move(state,bestMove);

should become

moveT nextMove; // No need to actually do anything with this value
int curValue = min_move(state,nextMove);

You also need to make the same kind of change in your min_move function.

NB: in min_move your code calls max_move with more arguments than you've defined for the function.

Walley answered 27/12, 2012 at 6:42 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.