Twist on tic tac toe
Asked Answered
D

2

6

i'm writing a tictactoe program but its not your traditional tictactoe

First of all the board is 4x4 and the way to win is to get 3 of a kind and 1 of your opponents in a row, column, or diagonal. So the following would be a win for "O" via first column:

O|_|X|_
O|X|_|_
O| |_|_
X|_|_|_

I'm trying to implement a minimax algorithm in order to give the program a "hard" mode that can't be beaten.

My problem is that i cannot hope to create a tree with all the possible game states, and therefore i have to come up with some kind of function that evaluates the game states that i can generate.

I guess my question is, how can i come up with such a function?

Discontinuous answered 21/10, 2012 at 1:58 Comment(3)
the first step is to identify/formulate a winning strategy(use words to describe the decision process needed to guarantee a win).Sporocyst
"3 of a kind and 1 of your opponents..." So the O player could win with a row like OOXO, not just OOOX or XOOO? Also, is the use of minimax a requirement for solving the problem or would you welcome other approaches?Ashlieashlin
any approach really would work, i just wanted to try using minimax, but i've already spent 4 hours and i haven't really gotten anywhere. Now i'm just using a bunch of if statements : \. And yes you are correct in your thinking of the different combinations for winning the game.Discontinuous
H
4

This game is definitely small enough for brute force.

You can just enumerate all the states. There are 16 squares, with 3 possible values for each square (X, O, or empty).

3^16 = 43046721, about 43 million.

This means a table with one byte to describe the winnability of each state would only be 43 megabytes.

I'd create a function mapping each state to an index between one and 43 million (you only need the states, not the possible orders of play), basically representing it as a number in base-3, and allowing you to create the state from an index.

Pick 4 winnability values each state could take - winnable for O, winnable for X, not winnable, and unknown.

Allocate a buffer of length 43046721 to store the winnability of each game state.

Iterate through all the index numbers, marking the won states. Then go through and iteratively fill out the winnability of each of the rest of the states, if known (check all successor states, based on who's turn it is). This will take at most 16 iterations over the set of indices, so I don't see any reason that brute force wouldn't work here.

There are optimizations, like taking advantage of symmetry, taking advantage of the fact that all states with n pieces down are succeeded by states with n+1 pieces, etc, but I don't think you need those at first.

Haematocryal answered 21/10, 2012 at 7:35 Comment(2)
Some time ago your implementation would need 30 floppy disks. Do you really think this is the right approach? I think OP would learn much more by investigating some time (not only 4 hours) in game tree search.Uncharitable
I think studying game tree search is a great thing to do, but I also think this is the easiest and most straight-forward way to solve this problem, and that implementing something like this, and then comparing to various optimizations and more intelligent search strategies is a great way to learn about game search.Haematocryal
N
2

A Heuristic function for game is one that evaluates a given state of the game. In here - the state is basically composed of two parts: (1) The board itself. (2) Whose turn it is.

Some possible heuristic function:

  1. Maximum number of X's (or O's, according to the evaluated player) in line/column/diagonal
  2. Number of "Almost winning" stricts (with one missing move) - can be affective to maximize number of winning possibilities

I assume one can think of more heuristics.
You can combine your different heuristics into one "big" heuristic function as follows:

a_1 * h_1(state) + a_2 * h_2(state) + ... + a_n * h_n(state)

The tricky part will be to learn the scores for a_1,...,a_n - this can be done in various ways - one of them is monte carlo learning - which basically means: create a various agents with various a_1,..,a_n values, run a tournament between them, and when the tournament is done - adjust the weights according to the winners - and repeat the process while you still have time (it is an anytime algorithm).
Once you are done, use the learned weights for your final agent.

P.S. Number of possible games is ~ 16! (need to determine the order of squares chosen - it chooses how the rest of the game will end) - ask yourself if it is "small" enough to be developed within your constraints - or is it too much and a heuristical solution is indeed needed.

Nidifugous answered 21/10, 2012 at 7:43 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.