Alpha-beta can be seen as an instance of minimax, where min
and max
are instantiated using a well-chosen lattice. Full gist.
We represent games as a tree, where each internal node is a position in the game, waiting for a designated player to pick a move to a child node, and each leaf is a final position with its score, or value.
-- | At every step, either the game ended with a value/score,
-- or one of the players is to play.
data GameF a r = Value a | Play Player (NonEmpty r)
deriving Functor
type Game a = Fix (GameF a)
-- | One player wants to maximize the score,
-- the other wants to minimize the score.
data Player = Mini | Maxi
minimax
will work on any lattice, defined by the following class:
class Lattice l where
inf, sup :: l -> l -> l
The Lattice
class is more general than Ord
: and Ord
instance is a Lattice
with decidable equality (Eq
). If we could redefine Ord
, then it would be appropriate to add Lattice
as a superclass. But here a newtype will have to do:
-- The Lattice induced by an Ord
newtype Order a = Order { unOrder :: a }
deriving (Eq, Ord)
instance Ord a => Lattice (Order a) where
inf = min
sup = max
Here's minimax. It is parameterized by an embedding leaf :: a -> l
of final values to the chosen lattice. One player maximizes the embedded value, the other player minimizes it.
-- | Generalized minimax
gminimax :: Lattice l => (a -> l) -> Game a -> l
gminimax leaf = cata minimaxF where
minimaxF (Value x) = leaf x
minimaxF (Play p xs) = foldr1 (lopti p) xs
lopti :: Lattice l => Player -> l -> l -> l
lopti Mini = inf
lopti Maxi = sup
The "regular" minimax uses the scores of the game directly as the lattice:
minimax :: Ord a => Game a -> a
minimax = unOrder . gminimax Order
For alpha-beta pruning, the idea is that we can keep track of some bounds on the optimal score, and this allows us to short-circuit the search. So the search is to be parameterized by that interval (alpha, beta)
. This leads us to a lattice of functions Interval a -> a
:
newtype Pruning a = Pruning { unPruning :: Interval a -> a }
An interval can be represented by (Maybe a, Maybe a)
to allow either side to be unbounded. But we shall use better named types for clarity, and also to leverage a different Ord
instance on each side:
type Interval a = (WithBot a, WithTop a)
data WithBot a = Bot | NoBot a deriving (Eq, Ord)
data WithTop a = NoTop a | Top deriving (Eq, Ord)
We will require that we can only construct Pruning f
if f
satisfies clamp i (f i) = clamp i (f (Bot, Top))
, where clamp
is defined below. That way, f
is a search algorithm which may shortcircuit if it learns that its result lies outside of the interval, without having to find the exact result.
clamp :: Ord a => Interval a -> a -> a
clamp (l, r) = clampBot l . clampTop r
clampBot :: Ord a => WithBot a -> a -> a
clampBot Bot x = x
clampBot (NoBot y) x = max y x
clampTop :: Ord a => WithTop a -> a -> a
clampTop Top x = x
clampTop (NoTop y) x = min y x
Functions form a lattice by pointwise lifting. And when we consider only functions satisfying clamp i (f i) = clamp i (f (Bot, Top))
and equate them modulo a suitable equivalence relation (Pruning f = Pruning g
if clamp <*> f = clamp <*> g
), a short-circuiting definition of the lattice becomes possible.
The inf
of two functions l
and r
, given an interval i = (alpha, beta)
, first runs l (alpha, beta)
to obtain a value vl
.
If vl <= alpha
, then it must be clamp i vl == alpha == clamp i (min vl (r i))
so we can stop and return vl
without looking at r
. Otherwise, we run r
, knowing that the final result is not going to be more than vl
so we can also update the upper bound passed to r
. sup
is defined symmetrically.
instance Ord a => Lattice (Pruning a) where
inf l r = Pruning \(alpha, beta) ->
let vl = unPruning l (alpha, beta) in
if NoBot vl <= alpha then vl else min vl (unPruning r (alpha, min (NoTop vl) beta))
sup l r = Pruning \(alpha, beta) ->
let vl = unPruning l (alpha, beta) in
if beta <= NoTop vl then vl else max vl (unPruning r (max (NoBot vl) alpha, beta))
Thus we obtain alpha-beta as an instance of minimax. Once the lattice above is defined, we only need some simple wrapping and unwrapping.
alphabeta :: Ord a => Game a -> a
alphabeta = runPruning . gminimax constPruning
constPruning :: a -> Pruning a
constPruning = Pruning . const
runPruning :: Pruning a -> a
runPruning f = unPruning f (Bot, Top)
If all goes well, alphabeta
and minimax
should have the same result:
main :: IO ()
main = quickCheck \g -> minimax g === alphabeta (g :: Game Int)
Bool
represent the current player? – Immersionism