Haskell: performance of IORefs

Asked 5/6, 2014 at 19:13 Answered 5/6, 2014 at 23:20

I have been trying to encode an algorithm in Haskell that requires using lots of mutable references, but it is (perhaps not surprisingly) very slow in comparison to purely lazy code. Consider a very simple example:

module Main where

import Data.IORef
import Control.Monad
import Control.Monad.Identity

list :: [Int]
list = [1..10^6]

main1 = mapM newIORef list >>= mapM readIORef >>= print
main2 = print $ map runIdentity $ map Identity list

Running GHC 7.8.2 on my machine, main1 takes 1.2s and uses 290MB of memory, while main2 takes only 0.4s and uses a mere 1MB. Is there any trick to prevent this growth, especially in space? I often need IORefs for non-primitive types unlike Int, and assumed that an IORef would use an additional pointer much like a regular thunk, but my intuition seems to be wrong.

I have already tried a specialized list type with an unpacked IORef, but with no significant difference.

Ramage answered 5/6, 2014 at 19:13 Comment(1)

Your edit should really be an answer to the question, and perhaps a separate question for the Control.Monad.ST.Lazy bit. – Priority 5/6, 2014 at 21:59

The problem is your use of mapM, which always performs poorly on large lists both in time and space. The correct solution is to fuse away the intermediate lists by using mapM_ and (>=>):

import Data.IORef
import Control.Monad

list :: [Int]
list = [1..10^6]

main = mapM_ (newIORef >=> readIORef >=> print) list

This runs in constant space and gives excellent performance, running in 0.4 seconds on my machine.

Edit: In answer to your question, you can also do this with pipes to avoid having to manually fuse the loop:

import Data.IORef
import Pipes
import qualified Pipes.Prelude as Pipes

list :: [Int]
list = [1..10^6]

main = runEffect $
    each list >-> Pipes.mapM newIORef >-> Pipes.mapM readIORef >-> Pipes.print

This runs in constant space in about 0.7 seconds on my machine.

Pagan answered 5/6, 2014 at 23:20 Comment(4)

By the way, does the pipes library allow to specify this sort of behaviour (without fusing the two mapMs)? – Ramage 6/6, 2014 at 20:31

@Ramage Yes. I updated my answer with the equivalent pipes solution. I actually wrote the solution in pipes first and then used equational reasoning to transform it into the equivalent hand-written loop. – Pagan 6/6, 2014 at 22:6

Hmm, but this still kinds of maps over each element on the list, as if it was a single map. For instance, how would print the resulting list instead of each element on its own? – Ramage 6/6, 2014 at 22:49

@Ramage If you want to print the resulting list, then use Pipes.toListM instead of Pipes.print. That will fold the pipeline's output into a list, but more efficiently than mapM. – Pagan 7/6, 2014 at 1:58

This is very likely not about IORef, but about strictness. Actions in the IO monad are serial -- all previous actions must complete before the next one can be started. So

mapM newIORef list

generates a million IORefs before anything is read.

However,

map runIdentity . map Identity
= map (runIdentity . Identity)
= map id

which streams very nicely, so we print one element of the list, then generate the next one, etc.

If you want a fairer comparison, use a strict map:

map' :: (a -> b) -> [a] -> [b]
map' f [] = []
map' f (x:xs) = (f x:) $! map' f xs

Foulup answered 5/6, 2014 at 19:22 Comment(1)

Thanks. Now that I look into it, I have probably simplified the problem too much, and it is not very surprising if GHC is actually simplifying the second map. But anyway, a function like main = mapM (\r -> newIORef r >>= readIORef) list >>= print still uses considerable time and space. If this is about strictness, then I wouldn't a lazy ST monad improve significantly? Something like main = print $ runST (mapM newSTRef list >>= mapM readSTRef). For the record, it is even slower that using IORefs. – Ramage 5/6, 2014 at 19:58

I have found that the hack towards a solution is to use a lazy mapM instead, defined as

lazyMapM :: (a -> IO b) -> [a] -> IO [b]
lazyMapM f [] = return []
lazyMapM f (x:xs) = do
  y <-  f x
  ys <- unsafeInterleaveIO $ lazyMapM f xs
  return (y:ys)

This allows the monadic version to run within the same 1MB and similar time. I would expect that a lazy ST monad could solve this problem more elegantly without using unsafeInterleaveIO, as a function:

main = print $ runST (mapM (newSTRef) list >>= mapM (readSTRef))

but that does not work (you also need to use unsafeInterleaveST), what leaves me thinking about how lazy the Control.Monad.ST.Lazy really is. Does someone know? :)

Ramage answered 5/6, 2014 at 23:4 Comment(6)

ST.Lazy is not as lazy as it seems it could be. All operations on the state are still sequential -- i.e. the moment you newSTRef, readSTRef, etc. the entire state history is forced. It's lazy in the sense that not depending on the state can operate lazily; e.g. foo = fmap (1:) foo will give an infinite list, not bottom. It's a bit disappointing, I know... – Foulup 6/6, 2014 at 1:24

(Of course, it can't actually be any lazier than that; consider if one of the intermediate operations changed the variable you are reading? We have to force them all to make sure.) – Foulup 6/6, 2014 at 1:26

Sorry, I have already created a new question on this but only noticed your reply now. Can this answer be posted there instead? – Ramage 6/6, 2014 at 1:51

But you could at least only force the whole history at the time you read. I do not see a reason for mapM (newSTRef) list >>= mapM (readSTRef) not being more lazy than it is... Ignoring the history once you force the result is what does not seem possible. – Ramage 6/6, 2014 at 2:28

newSTRef and readSTRef theoretically get along with laziness just fine. writeSTRef doesn't. We have to force all the newSTRefs the first time we readSTRef, because we don't know in advance that one of those wasn't a writeSTRef of a previously created STRef -- the semantics can't see that the first mapM only has newSTRefs in it. Does that answer your question? – Foulup 6/6, 2014 at 2:33

It does. Summing it up, we could safely group/delay sequences of newSTRefs and readSTRefs, but as soon as we can have writeSTRefs in play we have to replay the whole history. – Ramage 6/6, 2014 at 3:3

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags