Tying the Knot with a State monad

S

5

40

I'm working on a Haskell project that involves tying a big knot: I'm parsing a serialized representation of a graph, where each node is at some offset into the file, and may reference another node by its offset. So I need to build up a map from offsets to nodes while parsing, which I can feed back to myself in a do rec block.

I have this working, and kinda-sorta-reasonably abstracted into a StateT-esque monad transformer:

{-# LANGUAGE DoRec, GeneralizedNewtypeDeriving #-}

import qualified Control.Monad.State as S

data Knot s = Knot { past :: s, future :: s }

newtype RecStateT s m a = RecStateT (S.StateT (Knot s) m a) deriving
  ( Alternative
  , Applicative
  , Functor
  , Monad
  , MonadCont
  , MonadError e
  , MonadFix
  , MonadIO
  , MonadPlus
  , MonadReader r
  , MonadTrans
  , MonadWriter w )

runRecStateT :: RecStateT s m a -> Knot s -> m (a, Knot s)
runRecStateT (RecStateT st) = S.runStateT st

tie :: MonadFix m => RecStateT s m a -> s -> m (a, s)
tie m s = do
  rec (a, Knot s' _) <- runRecStateT m (Knot s s')
  return (a, s')

get :: Monad m => RecStateT s m (Knot s)
get = RecStateT S.get

put :: Monad m => s -> RecStateT s m ()
put s = RecStateT $ S.modify $ \ ~(Knot _ s') -> Knot s s'

The tie function is where the magic happens: the call to runRecStateT produces a value and a state, which I feed it as its own future. Note that get allows you to read from both the past and future states, but put only allows you to modify the "present."

Question 1: Does this seem like a decent way to implement this knot-tying pattern in general? Or better still, has somebody implemented a general solution to this, that I overlooked when snooping through Hackage? I beat my head against the Cont monad for a while, since it seemed possibly more elegant (see similar post from Dan Burton), but I just couldn't work it out.

Totally subjective Question 2: I'm not totally thrilled with the way my calling code ends up looking:

do
  Knot past future <- get
  let {- ... -} = past
      {- ... -} = future
      node = {- ... -}
  put $ {- ... -}
  return node

Implementation details here omitted, obviously, the important point being that I have to get the past and future state, pattern-match them inside a let binding (or explicitly make the previous pattern lazy) to extract whatever I care about, then build my node, update my state and finally return the node. Seems unnecessarily verbose, and I particularly dislike how easy it is to accidentally make the pattern that extracts the past and future states strict. So, can anybody think of a nicer interface?

Sleeping answered 16/6, 2012 at 3:34 Comment(1)

Even though they're building up the same thing, you could use a different type for past and future state. Then you wouldn't be able to accidentally mix them up. That's what I would do anyway. – Soane 19/6, 2012 at 0:23

I

8

I've been playing around with stuff, and I think I've come up with something... interesting. I call it the "Seer" monad, and it provides (aside from Monad operations) two primitive operations:

see  :: Monoid s => Seer s s
send :: Monoid s => s -> Seer s ()

and a run operation:

runSeer :: Monoid s => Seer s a -> a

The way this monad works is that see allows a seer to see everything, and send allows a seer to "send" information to all other seers for them to see. Whenever any seer performs the see operation, they are able to see all of the information that has been sent, and all of the information that will be sent. In other words, within a given run, see will always produce the same result no matter where or when you call it. Another way of saying it is that see is how you get a working reference to the "tied" knot.

This is actually very similar to just using fix, except that all of the sub-parts are added incrementally and implicitly, rather than explicitly. Obviously, seers will not work correctly in the presence of a paradox, and sufficient laziness is required. For example, see >>= send may cause an explosion of information, trapping you in a time loop.

A dumb example:

import Control.Seer
import qualified Data.Map as M
import Data.Map (Map, (!))

bar :: Seer (Map Int Char) String
bar = do
  m <- see
  send (M.singleton 1 $ succ (m ! 2))
  send (M.singleton 2 'c')
  return [m ! 1, m ! 2]

As I said, I've just been toying around, so I have no idea if this is any better than what you've got, or if it's any good at all! But it's nifty, and relevant, and if your "knot" state is a Monoid, then it just might be useful to you. Fair warning: I built Seer by using a Tardis.

https://github.com/DanBurton/tardis/blob/master/Control/Seer.hs

Ivanivana answered 19/6, 2012 at 1:39 Comment(6)

Interesting indeed, nifty, and +1 for using a tardis! I was toying with the idea of letting the state be a monoid, which indeed simplifies the interface (that's a good thing!) But I decided that was too restrictive: one can't observe aspects of the past state, only append to the future. I figured that won't fly for me, because I need to keep track of my file offset as I parse... But now it occurs to me that I can layer a StateT for my transient state on top of something akin to Seer. Seems like that might be a better design! – Sleeping 19/6, 2012 at 2:4

Holy crap. I think RWST might actually be exactly what I need. The reader component would be the "future" (per @RomanCheplyaka's comment), the writer would be the "present" (per your seer) and the state would be anything transient (like current file offset)... – Sleeping 19/6, 2012 at 2:44

@Sleeping interesting. Be sure to blog about it once you've got something figured out! I'm sure it would receive a warm welcome at reddit.com/r/haskell . – Ivanivana 19/6, 2012 at 7:2

I updated my code to use RWST. It's now called OracleT :) – Sleeping 19/6, 2012 at 19:46

@Sleeping It turns out you don't actually need a Tardis; I've reimplemented SeerT s a as a newtype around ReaderT s (WriterT s m) a. see and send are just lifted versions of ask and tell, respectively. Then runSeerT just uses mfix to pipe the writer's final result straight back into the reader. github.com/DanBurton/tardis/blob/master/Control/SeerRW.hs – Ivanivana 20/6, 2012 at 8:47

I finally got around to blogging a simplified version of all this: mergeconflict.com/?p=57. Post it wherever you like! – Sleeping 4/7, 2012 at 2:45

D

8

I wrote up an article on this topic at entitled Assembly: Circular Programming with Recursive do where I describe two methods for building an assembler using knot tying. Like your problem, an assembler has to be able to resolve address of labels that may occur later in the file.

Devlin answered 18/6, 2012 at 20:12 Comment(1)

Hah, small world. I've probably read that article ten times at this point, and based my idea pretty much entirely on it :) – Sleeping 18/6, 2012 at 23:29

D

8

Regarding the implementation, I would make it a composition of a Reader monad (for the future) and a State monad (for past/present). The reason is that you set your future only once (in tie) and then don't change it.

{-# LANGUAGE DoRec, GeneralizedNewtypeDeriving #-}

import Control.Monad.State
import Control.Monad.Reader
import Control.Applicative

newtype RecStateT s m a = RecStateT (StateT s (ReaderT s m) a) deriving
  ( Alternative
  , Applicative
  , Functor
  , Monad
  , MonadPlus
  )

tie :: MonadFix m => RecStateT s m a -> s -> m (a, s)
tie (RecStateT m) s = do
  rec (a, s') <- flip runReaderT s' $ flip runStateT s m
  return (a, s')

getPast :: Monad m => RecStateT s m s
getPast = RecStateT get

getFuture :: Monad m => RecStateT s m s
getFuture = RecStateT ask

putPresent :: Monad m => s -> RecStateT s m ()
putPresent = RecStateT . put

Regarding your second question, it'd help to know your dataflow (i.e. to have a minimal example of your code). It's not true that strict patterns always lead to loops. It's true that you need to be careful so as not to create a non-producing loop, but the exact restrictions depend on what and how you're building.

Duluth answered 18/6, 2012 at 20:21 Comment(2)

Interesting. Is there any advantage to using Reader, though, when the API I expose is essentially identical? The code as it stands is on github. – Sleeping 19/6, 2012 at 0:13

Yes, there is. By using Reader, you statically inforce the invariant that the future cannot be changed. This doesn't matter too much, as it is only your small library code that is forced to be correct by this and the optimization opportunities are small if they exist. However, it is good form and makes the code simpler. – Charwoman 19/6, 2012 at 0:29

I

8

I've been playing around with stuff, and I think I've come up with something... interesting. I call it the "Seer" monad, and it provides (aside from Monad operations) two primitive operations:

see  :: Monoid s => Seer s s
send :: Monoid s => s -> Seer s ()

and a run operation:

runSeer :: Monoid s => Seer s a -> a

The way this monad works is that see allows a seer to see everything, and send allows a seer to "send" information to all other seers for them to see. Whenever any seer performs the see operation, they are able to see all of the information that has been sent, and all of the information that will be sent. In other words, within a given run, see will always produce the same result no matter where or when you call it. Another way of saying it is that see is how you get a working reference to the "tied" knot.

This is actually very similar to just using fix, except that all of the sub-parts are added incrementally and implicitly, rather than explicitly. Obviously, seers will not work correctly in the presence of a paradox, and sufficient laziness is required. For example, see >>= send may cause an explosion of information, trapping you in a time loop.

A dumb example:

import Control.Seer
import qualified Data.Map as M
import Data.Map (Map, (!))

bar :: Seer (Map Int Char) String
bar = do
  m <- see
  send (M.singleton 1 $ succ (m ! 2))
  send (M.singleton 2 'c')
  return [m ! 1, m ! 2]

As I said, I've just been toying around, so I have no idea if this is any better than what you've got, or if it's any good at all! But it's nifty, and relevant, and if your "knot" state is a Monoid, then it just might be useful to you. Fair warning: I built Seer by using a Tardis.

https://github.com/DanBurton/tardis/blob/master/Control/Seer.hs

Ivanivana answered 19/6, 2012 at 1:39 Comment(6)

Interesting indeed, nifty, and +1 for using a tardis! I was toying with the idea of letting the state be a monoid, which indeed simplifies the interface (that's a good thing!) But I decided that was too restrictive: one can't observe aspects of the past state, only append to the future. I figured that won't fly for me, because I need to keep track of my file offset as I parse... But now it occurs to me that I can layer a StateT for my transient state on top of something akin to Seer. Seems like that might be a better design! – Sleeping 19/6, 2012 at 2:4

Holy crap. I think RWST might actually be exactly what I need. The reader component would be the "future" (per @RomanCheplyaka's comment), the writer would be the "present" (per your seer) and the state would be anything transient (like current file offset)... – Sleeping 19/6, 2012 at 2:44

@Sleeping interesting. Be sure to blog about it once you've got something figured out! I'm sure it would receive a warm welcome at reddit.com/r/haskell . – Ivanivana 19/6, 2012 at 7:2

I updated my code to use RWST. It's now called OracleT :) – Sleeping 19/6, 2012 at 19:46

@Sleeping It turns out you don't actually need a Tardis; I've reimplemented SeerT s a as a newtype around ReaderT s (WriterT s m) a. see and send are just lifted versions of ask and tell, respectively. Then runSeerT just uses mfix to pipe the writer's final result straight back into the reader. github.com/DanBurton/tardis/blob/master/Control/SeerRW.hs – Ivanivana 20/6, 2012 at 8:47

I finally got around to blogging a simplified version of all this: mergeconflict.com/?p=57. Post it wherever you like! – Sleeping 4/7, 2012 at 2:45

S

1

I had a similar problem recently, but I chose a different approach. A recursive data structure can be represented as a type fixed point on a data type functor. Loading data can be then split into two parts:

Load the data into a structure that references other nodes only by some kind of identifier. In the example it's Loader Int (NodeF Int), which constructs a map of values of type NodeF Int Int.
Tie the knot by creating a recursive data structure by replacing the identifiers with actual data. In the example the resulting data structures have type Fix (NodeF Int), and they are later converted to Node Int for convenience.

It's lacking a proper error handling etc., but the idea should be clear from that.

-- Public Domain

import Control.Monad
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Maybe (fromJust)

-- Fixed point operator on types and catamohism/anamorphism methods
-- for constructing/deconstructing them:

newtype Fix f = Fix { unfix :: f (Fix f) }

catam :: Functor f => (f a -> a) -> (Fix f -> a)
catam f = f . fmap (catam f) . unfix

anam :: Functor f => (a -> f a) -> (a -> Fix f)
anam f = Fix . fmap (anam f) . f

anam' :: Functor f => (a -> f a) -> (f a -> Fix f)
anam' f = Fix . fmap (anam f)

-- The loader itself

-- A representation of a loader. Type parameter 'k' represents the keys by
-- which the nodes are represented. Type parameter 'v' represents a functor
-- data type representing the values.
data Loader k v = Loader (Map k (v k))

-- | Creates an empty loader.
empty :: Loader k v
empty = Loader $ Map.empty

-- | Adds a new node into a loader.
update :: (Ord k) => k -> v k -> Loader k v -> Loader k v
update k v = update' k (const v)

-- | Modifies a node in a loader.
update' :: (Ord k) => k -> (Maybe (v k) -> (v k)) -> Loader k v -> Loader k v
update' k f (Loader m) = Loader $ Map.insertWith (const (f . Just)) k (f Nothing) $ m

-- | Does the actual knot-tying. Creates a new data structure
-- where the references to nodes are replaced by the actual data.
tie :: (Ord k, Functor v) => Loader k v -> Map k (Fix v)
tie (Loader m) = Map.map (anam' $ \k -> fromJust (Map.lookup k m)) m


-- -----------------------------------------------------------------
-- Usage example:

data NodeF n t = NodeF n [t]
instance Functor (NodeF n) where
    fmap f (NodeF n xs) = NodeF n (map f xs)

-- A data structure isomorphic to Fix (NodeF n), but easier to work with.
data Node n = Node n [Node n]
  deriving Show
-- The isomorphism that does the conversion.
nodeunfix :: Fix (NodeF n) -> Node n
nodeunfix = catam (\(NodeF n ts) -> Node n ts)

main :: IO ()
main = do
    -- Each node description consist of an integer ID and a list of other nodes
    -- it references.
    let lss = 
            [ (1, [4])
            , (2, [1])
            , (3, [2, 1])
            , (4, [3, 2, 1])
            , (5, [5])
            ]
    print lss
    -- Fill a new loader with the data:
    let
        loader = foldr f empty lss
        f (label, dependsOn) = update label (NodeF label dependsOn)
    -- Tie the knot:
    let tied' = tie loader
    -- And convert Fix (NodeF n) into Node n:
    let tied = Map.map nodeunfix tied'

    -- For each node print the label of the first node it references
    -- and the count of all referenced nodes.
    print $ Map.map (\(Node n ls@((Node n1 _) : _)) -> (n1, length ls)) tied

Schnabel answered 26/7, 2012 at 10:1 Comment(0)

B

0

I'm kind of overwhelmed by the amount of Monad usage. I might not understand the past/future things, but I guess you are just trying to express the lazy+fixpoint binding. (Correct me if I'm wrong.) The RWS Monad usage with R=W is kind of funny, but you do not need the State and the loop, when you can do the same with fmap. There is no point in using Monads if they do not make things easier. (Only very few Monads represent chronological order, anyway.)

My general solution to tying the knot:

I parse everything to a List of nodes,
convert that list to a Data.Vector for O(1) access to boxed (=lazy) values,
bind that result to a name using let or the fix or mfix function,
and access that named Vector inside the parser. (see 1.)

That example solution in your blog, where you write sth. like this:

data Node = Node {
  value :: Int,
  next  :: Node
} deriving Show
…
tie = …
parse = …
data ParserState = …
…
example :: Node
example =
  let (_, _, m) = tie parse $ ParserState 0 [(0, 1), (1, 2), (2, 0)]
  in (m Map.! 0)

I would have written this way:

{-# LANGUAGE ViewPatterns, NamedFieldPuns #-}
import Data.Vector as Vector

example :: Node
example =
   let node :: Int -> Node
       node = (Vector.!) $ Vector.fromList $
                   [ Node{value,next}
                   | (value,node->next) <- [(0, 1), (1, 2), (2, 0)]
                   ]
   in (node 0)

or shorter:

{-# LANGUAGE ViewPatterns, NamedFieldPuns #-}
import Data.Vector as Vector

example :: Node
example = (\node->(Vector.fromList[ Node{value,next}
                                  | (value,node->next) <- [(0, 1), (1, 2), (2, 0)]
                                  ] Vector.!)) `fix` 0

Bismuthous answered 15/7, 2012 at 23:34 Comment(0)

Recommended topics

Hot tags