Trying to apply CPS to an interpreter
Asked Answered
C

1

12

I'm trying to use CPS to simplify control-flow implementation in my Python interpreter. Specifically, when implementing return/break/continue, I have to store state and unwind manually, which is tedious. I've read that it's extraordinarily tricky to implement exception handling in this way. What I want is for each eval function to be able to direct control flow to either the next instruction, or to a different instruction entirely.

Some people with more experience than me suggested looking into CPS as a way to deal with this properly. I really like how it simplifies control flow in the interpreter, but I'm not sure how much I need to actually do in order to accomplish this.

  1. Do I need to run a CPS transform on the AST? Should I lower this AST into a lower-level IR that is smaller and then transform that?

  2. Do I need to update the evaluator to accept the success continuation everywhere? (I'm assuming so).

I think I generally understand the CPS transform: the goal is to thread the continuation through the entire AST, including all expressions.

I'm also a bit confused where the Cont monad fits in here, as the host language is Haskell.

Edit: here's a condensed version of the AST in question. It is a 1-1 mapping of Python statements, expressions, and built-in values.

data Statement
    = Assignment Expression Expression
    | Expression Expression
    | Break
    | While Expression [Statement]

data Expression
    | Attribute Expression String
    | Constant Value

data Value
    = String String
    | Int Integer
    | None

To evaluate statements, I use eval:

eval (Assignment (Variable var) expr) = do
    value <- evalExpr expr
    updateSymbol var value

eval (Expression e) = do
    _ <- evalExpr e
    return ()

To evaluate expressions, I use evalExpr:

evalExpr (Attribute target name) = do
    receiver <- evalExpr target
    attribute <- getAttr name receiver
    case attribute of
        Just v  -> return v
        Nothing -> fail $ "No attribute " ++ name

evalExpr (Constant c) = return c

What motivated the whole thing was the shenanigans required for implementing break. The break definition is reasonable, but what it does to the while definition is a bit much:

eval (Break) = do
    env <- get
    when (loopLevel env <= 0) (fail "Can only break in a loop!")
    put env { flow = Breaking }

eval (While condition block) = do
    setup
    loop
    cleanup

    where
        setup = do
            env <- get
            let level = loopLevel env
            put env { loopLevel = level + 1 }

        loop = do
            env <- get
            result <- evalExpr condition
            when (isTruthy result && flow env == Next) $ do
                evalBlock block

                -- Pretty ugly! Eat continue.
                updatedEnv <- get
                when (flow updatedEnv == Continuing) $ put updatedEnv { flow = Next }

                loop

        cleanup = do
            env <- get
            let level = loopLevel env
            put env { loopLevel = level - 1 }

            case flow env of
                Breaking    -> put env { flow = Next }
                Continuing  -> put env { flow = Next }
                _           -> return ()

I am sure there are more simplifications that can be done here, but the core problem is one of stuffing state somewhere and manually winding out. I'm hoping that CPS will let me stuff book-keeping (like loop exit points) into state and just use those when I need them.

I dislike the split between statements and expressions and worry it might make the CPS transform more work.

Crinkle answered 18/8, 2014 at 14:35 Comment(9)
I can't write up an example right now, but here's a link to a writeup of some code from a talk Erik Meijer gave on this topic: gist.github.com/cluno/b8d49b9de848025aed20Gregory
That is an excellent talk; he moves quickly through the proof of CPS though. I have been studying it for about a week now. I feel like I'm still trying to get a conceptual foothold on this currently.Crinkle
The basic idea is to transform your normal interpreter which might be like i :: AST -> Val into something like iCPS :: AST -> (Val -> r) -> r -> r where (Val -> r) is the success continuation (since you must successfully produce a Val to run it and r is a failure continuation (because you don't need a Val to run it). Then you write i = iCPS id (error "uh oh") or something like that. There's your CPS transform.Gregory
I think generally you'd be better off for gathering answers here if you write up a minimal example of the kind of AST you're writing which you'd like to interpret with CPS semantics.Gregory
Could you cut the AST down to the essentials of your question? I've voted to close the question as too broad based on the unedited version, but I'd be happy to retract that with a suitably narrowed version. The AST helps, but it's quite big. Also it might help if you add an existing non-CPS evaluator, so that your question becomes something like "how can I add break, continue and return to this evaluator with CPS".Genaro
Good idea, I have added some code to illustrate what my current approach is.Crinkle
You'll want to include a loop-breaker or try-catch block in order to motivate the CPS transform. So far the AST only has direct returns.Gregory
Done. Hope it gets the point across; namely that the control flow operators are hard to isolate from one another.Crinkle
Finished code appears to be at github.com/mattgreen/hythonEbony
E
10

This finally gave me a good excuse to try using ContT!

Here's one possible way of doing this: store (in a Reader wrapped in ContT) the continuation of exiting the current (innermost) loop:

newtype M r a = M{ unM :: ContT r (ReaderT (M r ()) (StateT (Map Id Value) IO)) a }
              deriving ( Functor, Applicative, Monad
                       , MonadReader (M r ()), MonadCont, MonadState (Map Id Value)
                       , MonadIO
                       )

runM :: M a a -> IO a
runM m = evalStateT (runReaderT (runContT (unM m) return) (error "not in a loop")) M.empty

withBreakHere :: M r () -> M r ()
withBreakHere act = callCC $ \break -> local (const $ break ()) act

break :: M r ()
break = join ask

(I've also added IO for easy printing in my toy interpreter, and State (Map Id Value) for variables).

Using this setup, you can write Break and While as:

eval Break = break
eval (While condition block) = withBreakHere $ fix $ \loop -> do
    result <- evalExpr condition
    unless (isTruthy result)
      break
    evalBlock block
    loop

Here's the full code for reference:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Interp where

import Prelude hiding (break)
import Control.Applicative
import Control.Monad.Cont
import Control.Monad.State
import Control.Monad.Reader
import Data.Function
import Data.Map (Map)
import qualified Data.Map as M
import Data.Maybe

type Id = String

data Statement
    = Print Expression
    | Assign Id Expression
    | Break
    | While Expression [Statement]
    | If Expression [Statement]
    deriving Show

data Expression
    = Var Id
    | Constant Value
    | Add Expression Expression
    | Not Expression
    deriving Show

data Value
    = String String
    | Int Integer
    | None
    deriving Show

data Env = Env{ loopLevel :: Int
              , flow :: Flow
              }

data Flow
    = Breaking
    | Continuing
    | Next
    deriving Eq

newtype M r a = M{ unM :: ContT r (ReaderT (M r ()) (StateT (Map Id Value) IO)) a }
              deriving ( Functor, Applicative, Monad
                       , MonadReader (M r ()), MonadCont, MonadState (Map Id Value)
                       , MonadIO
                       )

runM :: M a a -> IO a
runM m = evalStateT (runReaderT (runContT (unM m) return) (error "not in a loop")) M.empty

withBreakHere :: M r () -> M r ()
withBreakHere act = callCC $ \break -> local (const $ break ()) act

break :: M r ()
break = join ask

evalExpr :: Expression -> M r Value
evalExpr (Constant val) = return val
evalExpr (Var v) = gets $ fromMaybe err . M.lookup v
  where
    err = error $ unwords ["Variable not in scope:", show v]
evalExpr (Add e1 e2) = do
    Int val1 <- evalExpr e1
    Int val2 <- evalExpr e2
    return $ Int $ val1 + val2
evalExpr (Not e) = do
    val <- evalExpr e
    return $ if isTruthy val then None else Int 1

isTruthy (String s) = not $ null s
isTruthy (Int n) = n /= 0
isTruthy None = False

evalBlock = mapM_ eval

eval :: Statement -> M r ()
eval (Assign v e) = do
    val <- evalExpr e
    modify $ M.insert v val
eval (Print e) = do
    val <- evalExpr e
    liftIO $ print val
eval (If cond block) = do
    val <- evalExpr cond
    when (isTruthy val) $
      evalBlock block
eval Break = break
eval (While condition block) = withBreakHere $ fix $ \loop -> do
    result <- evalExpr condition
    unless (isTruthy result)
      break
    evalBlock block
    loop

and here's a neat test example:

prog = [ Assign "i" $ Constant $ Int 10
       , While (Var "i") [ Print (Var "i")
                         , Assign "i" (Add (Var "i") (Constant $ Int (-1)))
                         , Assign "j" $ Constant $ Int 10
                         , While (Var "j") [ Print (Var "j")
                                           , Assign "j" (Add (Var "j") (Constant $ Int (-1)))
                                           , If (Not (Add (Var "j") (Constant $ Int (-4)))) [ Break ]
                                           ]
                         ]
       , Print $ Constant $ String "Done"
       ]

which is

i = 10
while i:
  print i
  i = i - 1
  j = 10
  while j:
    print j
    j = j - 1
    if j == 4:
      break

so it will print

10 10 9 8 7 6 5
 9 10 9 8 7 6 5
 8 10 9 8 7 6 5
...
 1 10 9 8 7 6 5
Ebony answered 19/8, 2014 at 11:33 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.