Genetic Programming in Haskell

If you're looking for something akin to S-expressions, this is fairly easily modeled in Haskell. Say, for example, we want to represent simple algebraic equations with variables, such as

x + 5 / (y * 2 - z)

This can be represented by a simple AST in Haskell, in particular we can implement it as

data Expr
    = Lit Double        -- Literal numbers
    | Var Char          -- Variables have single letter names
    | Add Expr Expr     -- We can add things together
    | Sub Expr Expr     -- And subtract them
    | Mul Expr Expr     -- Why not multiply, too?
    | Div Expr Expr     -- And divide
    deriving (Eq)

This would let us express the equation above as

Add (Var 'x') (Div (Lit 5) (Sub (Mul (Var 'y') (Lit 2)) (Var 'z')))

But this is rather clunky to write and difficult to read. Let's start by working some Show magic so that it gets pretty printed:

instance Show Expr where
    showsPrec n (Lit x)   = showParen (n > 10) $ showsPrec 11 x
    showsPrec n (Var x)   = showParen (n > 10) $ showChar x
    showsPrec n (Add x y) = showParen (n >  6) $ showsPrec 7 x . showString " + " . showsPrec 7 y
    showsPrec n (Sub x y) = showParen (n >  6) $ showsPrec 7 x . showString " - " . showsPrec 7 y
    showsPrec n (Mul x y) = showParen (n >  7) $ showsPrec 8 x . showString " * " . showsPrec 8 y
    showsPrec n (Div x y) = showParen (n >  7) $ showsPrec 8 x . showString " / " . showsPrec 8 y

If you don't understand everything going on here, that's ok, it's a lot of complication made easy by some built in functions for efficiently building strings with parentheses in them properly and all that fun stuff. It's pretty much copied out of the docs in Text.Show. Now if we print out our expression from above, it'll look like

x + 5.0 / (y * 2.0 - z)

Now for simplifying building these expressions. Since they're pretty much numeric, we can implement Num (except for abs and signum) and Fractional:

instance Num Expr where
    fromInteger = Lit . fromInteger
    (+) = Add
    (-) = Sub
    (*) = Mul
    abs = undefined
    signum = undefined

instance Fractional Expr where
    (/) = Div
    fromRational = Lit . fromRational

Now we can input out expression from above as

Var 'x' + 5 / (Var 'y' * 2 - Var 'z')

This is at least much easier to visually parse, even if we have to specify the variables manually.

Now that we have pretty input and output, let's focus on evaluating these expressions. Since we have variables in here, we'll need some sort of environment that associates a variable with a value

import Data.Map (Map)
import qualified Data.Map as M

type Env = Map Char Double

And now it's just basic pattern matching (along with a helper function)

import Control.Applicative

binOp :: (Double -> Double -> Double) -> Expr -> Expr -> Env -> Maybe Double
binOp op x y vars = op <$> evalExpr x vars <*> evalExpr y vars

evalExpr :: Expr -> Env -> Maybe Double
evalExpr (Lit x)   = const $ Just x
evalExpr (Var x)   = M.lookup x
evalExpr (Add x y) = binOp (+) x y
evalExpr (Sub x y) = binOp (-) x y
evalExpr (Mul x y) = binOp (*) x y
evalExpr (Div x y) = binOp (/) x y

Now we can use evalExpr to evaluate an expression in our mini-language with variable substitution. All the error handling if there's an undefined variable is done by the Maybe monad, and we were even able to cut down on repetition by making the environment argument implicit. This is all pretty standard for a simple expression DSL.

So for the fun bit, generating random expressions and (eventually) mutations. For this, we'll need System.Random. We want to be able to generate expressions of varying complexity, so we'll express it in rough depth. Since it's random, there is a chance that we'll get shorter or deeper trees than specified. This will probably be something that you'll want to tweak and tune to get your desired results. First, because I have the foresight of having written this code already, let's define two helpers for generating a random literal and a random variable:

randomLit, randomVar :: IO Expr
randomLit = Lit <$> randomRIO (-100, 100)
randomVar = Var <$> randomRIO ('x',  'z')

Nothing exciting here, we get doubles between -100 and 100, and up to 3 variables. Now we can generate large expression trees.

generateExpr :: Int -> IO Expr
-- When the depth is 1, return a literal or a variable randomly
generateExpr 1 = do
    isLit <- randomIO
    if isLit
        then randomLit
        else randomVar
-- Otherwise, generate a tree using helper
generateExpr n = randomRIO (0, 100) >>= helper
    where
        helper :: Int -> IO Expr
        helper prob
            -- 20% chance that it's a literal
            | prob < 20  = randomLit
            -- 10% chance that it's a variable
            | prob < 30  = randomVar
            -- 15% chance of Add
            | prob < 45  = (+) <$> generateExpr (n - 1) <*> generateExpr (n - 1)
            -- 15% chance of Sub
            | prob < 60  = (-) <$> generateExpr (n - 1) <*> generateExpr (n - 1)
            -- 15% chance of Mul
            | prob < 75  = (*) <$> generateExpr (n - 1) <*> generateExpr (n - 1)
            -- 15% chance of Div
            | prob < 90  = (/) <$> generateExpr (n - 1) <*> generateExpr (n - 1)
            -- 10% chance that we generate a possibly taller tree
            | otherwise = generateExpr (n + 1)

The bulk of this function is just specifying the probabilities that a given node will be generated, and then generating the left and right nodes for each operator. We even got to use the normal arithmetic operators since we overloaded Num, how handy!

Now, remember that we can still pattern match on the constructors of this Expr type for other operations, such as replacing nodes, swapping them, or measuring the depth. For this, you just have to treat it as any other binary tree type in Haskell, except it has 2 leaf constructors and 4 node constructors. As for mutations, you'll have to write code that traverses this structure and chooses when to swap out nodes and what to swap them out with. It'll live within the IO monad since you'll be generating random values, but it shouldn't be too difficult. This structure should be pretty easy to extend as need be, such as if you wanted to add trig functions and exponentiation, you'd just need more constructors, more expressions in evalExpr, and the appropriate clauses in helper, along with some probability adjustments.

You can get the full code for this example here. Hope this helps you see how to formulate something like S-expressions in Haskell.

Recommended topics

Hot tags