Genetic algorithm for a card game (Dominion)

I have a working F# program that runs Dominion, a card game. I would like to use a genetic algorithm to determine optimal strategies for playing. However, I don't know much about AI or genetic algorithms. Can you point me towards some good literature to get started?

A strategy for playing consists of a reaction to a given hand. In each turn, a bot is dealt a hand of cards. It can choose to play action cards, or buy new cards, based on what it has been dealt. The goal is to end the game with as many victory point cards as possible.

A hardcoded approach could look something like:

def play(hand, totalDeck):
    # hypothetical helpers: playCard, coinTotal, buy
    if "Smithy" in hand:
        playCard("Smithy")
    if coinTotal(hand) >= PROVINCE_COST:
        buy("Province")
    if totalDeck.count("Smithy") > 0.3 * len(totalDeck):
        buy("Copper")  # i.e. "buy coins"

I was thinking of describing a strategy in terms of a vector of target portions of the total deck for each card:

[Smithy, Province, Copper, ...]
[.3, .2, .1, ...]

Then to mutate a bot, I could just change that vector around, and see if the mutated version does better. The fitness function would be the average score playing Dominion against a variety of other bots. (One bot's score is dependent on who it is playing against, but hopefully by playing many games against many bots this can even out.)
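The mutation and fitness steps described above could be sketched like this (a minimal illustration only: the Gaussian noise, the renormalisation step, and the `play_game` hook into the existing simulator are my own assumptions, not part of the question):

```python
import random

def mutate(strategy, sigma=0.05):
    """Perturb each target deck fraction with Gaussian noise, clamp to
    [0, 1], and renormalise so the fractions still sum to 1."""
    noisy = [max(0.0, min(1.0, p + random.gauss(0, sigma))) for p in strategy]
    total = sum(noisy)
    return [p / total for p in noisy] if total > 0 else strategy

def fitness(strategy, opponents, play_game, games_per_opponent=100):
    """Average score over many games against many opponents, so that
    opponent-dependent noise evens out. `play_game(strategy, opponent)`
    is a hypothetical hook into the existing Dominion simulator."""
    scores = [play_game(strategy, opp)
              for opp in opponents
              for _ in range(games_per_opponent)]
    return sum(scores) / len(scores)
```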

Does this make sense? Am I headed down the right path?

Hefner answered 3/6, 2012 at 3:54 Comment(3)
Sorry, but IMO that's a really bad description of your problem. I don't even know why you want to "combine" two bots, or which parts of the bots you want to combine. I assume action cards are a dynamic property that changes during play. Please state the problem more clearly in terms of an objective function and your decision variables. I assume you want to train some parameters of a generic bot that you wrote; maybe you can elaborate on this a little more. In what programming language did you write the card-game simulator?Quianaquibble
I agree that I wasn't phrasing the problem very well. I've tried again; how does this look?Hefner
definitely worth spending a little extra timeQuianaquibble

Dominion's a great game, but it will be hard to optimize using a genetic algorithm: the inputs differ between games (the card-sets used), the optimal strategy changes over the course of a game, and the optimal play for any given situation will only slowly emerge from a genetic search (intuitively, based on my pretty-good understanding of both GAs and the game).

A better approach to Dominion, I think, would be either a straight heuristic (rule-based) approach or, very interestingly, Monte Carlo search (see, for instance, http://cacm.acm.org/magazines/2012/3/146245-the-grand-challenge-of-computer-go/fulltext). Monte Carlo search is appealing precisely because:

  • It's easy to generate a random-but-legal sequence of moves in Dominion.
  • It's at least straightforward to judge the "value" of such a sequence (increase in VP).
  • A priori construction of "best play" rules is hard (that's what makes the game so good).
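A flat Monte Carlo decision for a single buy could be sketched as follows (a sketch only: `legal_buys` and `random_playout` are hypothetical hooks into the existing simulator, and plain averaging of random playouts is the simplest variant of the idea, not full Monte Carlo tree search):

```python
def monte_carlo_buy(state, legal_buys, random_playout, n_playouts=200):
    """For each legal buy, finish the game many times with random-but-legal
    play and keep the buy with the best average victory-point result.
    `random_playout(state, buy)` is a hypothetical hook that plays one
    random game to the end and returns the resulting VP count."""
    best_buy, best_value = None, float("-inf")
    for buy in legal_buys:
        avg = sum(random_playout(state, buy) for _ in range(n_playouts)) / n_playouts
        if avg > best_value:
            best_buy, best_value = buy, avg
    return best_buy
```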

It's a very good challenge -- you should blog your experiences.

Parchment answered 4/6, 2012 at 18:54 Comment(0)

Where do you draw the other bots from? Do you keep them static? If so, the trained bot won't become 'good' at the game per se, just good at exploiting the dummy bot. If not, then the other bots evolve too, and the win percentage will not be a good indicator of quality unless some other constraints apply. Always realize that with a population full of bots with perfect skill, their performance against each other will appear mediocre!

You could take a co-evolutionary approach:

  • Mutate all bots in a sufficiently large population.
  • Let them compete against each other repeatedly in a round-robin tournament, e.g. 100 times.
  • Eliminate some of the worst-performing bots.
  • Keep a few of the best bots unchanged (elitism).
  • Refill the rest of the population with mutations and crossovers of the good bots.
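One generation of this co-evolutionary loop could be sketched like this (a minimal illustration under stated assumptions: `play_match`, `mutate`, and `crossover` are hypothetical hooks, and the choice to breed only from the better half is mine):

```python
import random

def round_robin_scores(population, play_match, rounds=100):
    """Total wins per bot over repeated round-robin matches.
    `play_match(a, b)` is a hypothetical hook returning 0 or 1 (the winner)."""
    wins = [0] * len(population)
    for _ in range(rounds):
        for i in range(len(population)):
            for j in range(i + 1, len(population)):
                winner = i if play_match(population[i], population[j]) == 0 else j
                wins[winner] += 1
    return wins

def next_generation(population, wins, mutate, crossover, n_elite=2):
    """Keep the best bots unchanged (elitism), drop the worst, and refill
    the population with mutated crossovers of good parents."""
    order = sorted(range(len(population)), key=lambda i: -wins[i])
    elite = [population[i] for i in order[:n_elite]]
    parents = [population[i] for i in order[:len(order) // 2]]
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(len(population) - n_elite)]
    return elite + children
```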

Or you could train against a fixed benchmark:

  • Make a bot with a handcrafted policy that appears good, using your knowledge of the game.
  • Alternatively, have human players (yourself?) provide the moves. This can be a good source of training experience for your bot, but unless you have access to a large database of (expert) human moves, it is very slow.
  • Train against your benchmark.
  • Select the best performers, mutate, etc.
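The fixed-benchmark variant can be reduced to a simple hill-climbing loop (a sketch only: `mutate` and `win_rate` are hypothetical hooks, and a (1+1)-style accept-if-better rule is the simplest possible selection scheme):

```python
def train_against_benchmark(initial_bot, benchmark, mutate, win_rate,
                            generations=100):
    """Repeatedly mutate the current best bot and keep the mutant only if
    it scores better against the fixed benchmark bot.
    `win_rate(bot, benchmark)` is a hypothetical hook returning a score
    in [0, 1] estimated over many games."""
    best, best_score = initial_bot, win_rate(initial_bot, benchmark)
    for _ in range(generations):
        candidate = mutate(best)
        score = win_rate(candidate, benchmark)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```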

Presage answered 3/6, 2012 at 15:27 Comment(0)

I think the vector will not really lead to good results unless the game is very linear. I would suggest the following rule-based approach:

In each turn you hold a set of cards, and you want to determine which card to play, or, if you play no card, to draw a new one instead. What you need is a priority function that tells you which card is best to play. Such a priority function can be described by genetic programming: you always play the card with the highest priority, unless no card has a priority above a set threshold (e.g. 0), in which case you draw a new card. Genetic programming can be used to evolve that priority function.

Since you're using F#, it might be a good idea to try this approach with HeuristicLab, which is written in C#. You can implement your game as a problem there and let the evaluation function run a simulation of it. Write a grammar and an interpreter for your rule. Define a number of parameters that allow your priority rule to make meaningful decisions, e.g. some general game information (number of cards played, Provinces left, etc.) and some card-related information (the impact of playing that card in terms of victory points, etc.). These are the input variables to your interpreter, which calculates a priority value; the card with the best priority value is the one you choose.

To evaluate a strategy, define e.g. 10 random solutions when you create the problem, and in the evaluation let each candidate compete against those random bots. Measure by how much you beat each of the random bots: the more often and the more decisively you win, the better the fitness. You can then also add an analyzer to your problem that replaces the problem's random solutions with the best performers, so these opponents also evolve over time.
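The priority-based card choice itself is tool-independent and can be sketched in a few lines (an illustration only: `features` and `priority` are hypothetical hooks, and the priority function is exactly the part genetic programming would evolve):

```python
def choose_card(hand, features, priority, threshold=0.0):
    """Play the card whose evolved priority is highest; if no card scores
    above the threshold, return None, meaning draw a new card instead.
    `features(card)` extracts the input variables for the interpreter and
    `priority(feature_vector)` is the evolved priority function."""
    best_card, best_score = None, threshold
    for card in hand:
        score = priority(features(card))
        if score > best_score:
            best_card, best_score = card, score
    return best_card
```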

You can also use the target-portion idea if you like. In that case your encoding would be a RealVector, which you can likewise optimize with HeuristicLab (using an Evolution Strategy, Particle Swarm Optimization, or a Genetic Algorithm).

Quianaquibble answered 3/6, 2012 at 20:26 Comment(0)