IO monad prevents short circuiting of embedded mapM?
Asked Answered
B

2

11

Somewhat mystified by the following code. In non-toy version of the problem I'm trying to do a monadic computation in a monad Result, the values of which can only be constructed from within IO. Seems like the magic behind IO makes such computations strict, but I can't figure out how exactly that happens.

The code:

data Result a = Result a | Failure deriving (Show)

instance Functor Result where
  fmap f (Result a) = Result (f a)
  fmap f Failure = Failure

instance Applicative Result where
  pure = return
  (<*>) = ap

instance Monad Result where
  return = Result
  Result a >>= f = f a
  Failure >>= _ = Failure

compute :: Int -> Result Int
compute 3 = Failure
compute x = traceShow x $ Result x

compute2 :: Monad m => Int -> m (Result Int)
compute2 3 = return Failure
compute2 x = traceShow x $ return $ Result x

compute3 :: Monad m => Int -> m (Result Int)
compute3 = return . compute

main :: IO ()
main = do
  let results = mapM compute [1..5]
  print $ results
  results2 <- mapM compute2 [1..5]
  print $ sequence results2
  results3 <- mapM compute3 [1..5]
  print $ sequence results3
  let results2' = runIdentity $ mapM compute2 [1..5]
  print $ sequence results2'

The output:

1
2
Failure
1
2
4
5
Failure
1
2
Failure
1
2
Failure
Berryman answered 31/5, 2016 at 6:40 Comment(0)
S
10

Nice test cases. Here's what's happening:

  • in mapM compute we see laziness at work, as usual. No surprise here.

  • in mapM compute2 we work inside the IO monad, whose mapM definition will demand the whole list: unlike Result which skips the tail of the list as soon as Failure is found, IO will always scan the whole list. Note the code:

    compute2 x = traceShow x $ return $ Result x
    

    So, the above wil print the debug message as soon as each element of the list of IO actions is accessed. All are, so we print everything.

  • in mapM compute3 we now use, roughly:

    compute3 x = return $ traceShow x $ Result x
    

    Now, since return in IO is lazy, it will not trigger the traceShow when returning the IO action. So, when mapM compute3 is run, no message is seen. Instead, we see messages only when sequence results3 is run, which forces the Result -- not all of them, but only as much as needed.

  • the final Identity example is also quite tricky. Note this:

    > newtype Id1 a = Id1 a
    > data Id2 a = Id2 a
    > Id1 (trace "hey!" True) `seq` 42
    hey!
    42
    > Id2 (trace "hey!" True) `seq` 42
    42
    

    when using a newtype, at runtime there is no boxing/unboxing (AKA lifting) involved, so forcing a Id1 x value causes x to be forced. With data types this does not happen: the value is wrapped in a box (e.g. Id2 undefined is not equivalent to undefined).

    In your example, you add an Identity constructor, but that is from the newtype Identity!! So, when calling

    return $ traceShow x $ Result x
    

    the return here does not wrap anything, and the traceShow is immediately triggered as soon as mapM is run.

Short answered 31/5, 2016 at 7:7 Comment(3)
Thank you very much for your answer, chi. May I ask how you know that the IO definition of mapM is strict and that return is lazy?Berryman
@Berryman Failure >>= f = Failure discards the f: there's no need to proceed further on the monadic chain. IO has a lower lever definition which is not easily written, but -- barring exceptions -- action >>= f will always call f since one expects that e.g. action >> print 4 will eventually print 4 no matter what action does (barring IO exceptions and non-termination).Short
Right. Thanks again!Berryman
R
1

Your Result type appears to be virtually identical to Maybe, with

Result <-> Just
Failure <-> Nothing

For the sake of my poor brain, I'll stick to Maybe terminology in the rest of this answer.

chi explained why IO (Maybe a) does not short-circuit the way you expected. But there is a type you can use for this sort of thing! It's essentially the same type, in fact, but with a different Monad instance. You can find it in Control.Monad.Trans.Maybe. It looks something like this:

newtype MaybeT m a = MaybeT
  { runMaybeT :: m (Maybe a) }

As you can see, this is just a newtype wrapper around m (Maybe a). But its Monad instance is very different:

instance Monad m => Monad (MaybeT m) where
  return a = MaybeT $ return (Just a)
  m >>= f = MaybeT $ do
    mres <- runMaybeT m
    case mres of
      Nothing -> return Nothing
      Just a -> runMaybeT (f a)

That is, m >>= f runs the m computation in the underlying monad, getting Maybe something or other. If it gets Nothing, it just stops, returning Nothing. If it gets something, it passes that to f and runs the result. You can also turn any m action into a "successful" MaybeT m action using lift from Control.Monad.Trans.Class:

class MonadTrans t where
  lift :: Monad m => m a -> t m a

instance MonadTrans MaybeT where
  lift m = MaybeT $ Just <$> m

You can also use this class, defined somewhere like Control.Monad.IO.Class, which is often clearer and can be much more convenient:

class MonadIO m where
  liftIO :: IO a -> m a

instance MonadIO IO where
  liftIO m = m

instance MonadIO m => MonadIO (MaybeT m) where
  liftIO m = lift (liftIO m)
Raffinate answered 31/5, 2016 at 19:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.