Proc syntax in Haskell Arrows leads to severe performance penalty
Asked Answered
R

2

15

Using proc notation for Arrow seems to kill performance in my project. Here is a toy example of the problem:

We define Coroutine newtype (mostly copying from Generalizing Streams into Coroutines) to represent Mealy machines (i.e. functions that carry some state) with instances of Category and Arrow, write scan wrapper function and evalList runner function for lists.

Then we have sumArr and sumArr' functions where the latter is the former called within proc block.

Compiling with stack ghc -- --make test.hs -O2 using ghc-8.0.2 on OS X I get runtime of 0.087 secs for sumArr and 3.263 secs for sumArr' (with a heavy memory footprint).

I would like to know if this in fact caused by using proc and if I can do something to have normal runtime behaviour when using proc notation (writing arrow code without it is painful). Thank you.

{-# LANGUAGE Arrows #-}
{-# LANGUAGE BangPatterns #-}

import Prelude hiding (id, (.))
import Control.Arrow
import Control.Category
import qualified Data.List as L

newtype Coroutine i o = Coroutine { runC :: i -> (o, Coroutine i o) }

instance Category Coroutine where
    id = Coroutine $ \i -> (i, id)

    cof . cog = Coroutine $ \i ->
        let (x, cog') = runC cog i
            (y, cof') = runC cof x
        in (y, cof' . cog')

instance Arrow Coroutine where
    arr f = Coroutine $ \i -> (f i, arr f)

    first co = Coroutine $ \(a,b) ->
        let (c, co') = runC co a in ((c,b), first co')

scan :: (o -> t -> o) -> o -> Coroutine t o
scan f = go where
    go i = Coroutine $ step i where
            step a b = let !a' = f a b in (a', go a')

evalList :: Coroutine a b -> [a] -> [b]
evalList a = L.map fst . L.drop 1 . L.scanl' (\(_, acc) v -> let !x = runC acc v in x) (undefined, a)

sumArr, sumArr' :: Coroutine Int Int
sumArr = scan (\acc x -> let !newAcc = acc + x in newAcc) 0
sumArr' = proc v -> do sumArr -< v

testData :: [Int]
testData = [1..1000000]

main = print $ L.last $ evalList sumArr' testData
Requiescat answered 22/7, 2017 at 23:22 Comment(1)
Have you looked at the core to see how the two functions differ?Moretta
W
9

Yeah, this is probably caused by proc notation. The desugaring is very low-level, introducing a lot of (needless) arrs and not taking advantage of &&& or *** at all.

For example, last I checked, this:

mulA f g = proc x -> do
  a <- f -< x
  b <- g -< x
  returnA -< a * b

Is desugared to something like this:

mulA f g = arr dup
  >>> first f
  >>> arr swap
  >>> first g
  >>> arr mul
  where
    dup x = (x, x)
    swap (x, y) = (y, x)
    mul = uncurry (*)

When it could be just this:

mulA f g = f &&& g >>> arr mul

And this:

proc x -> do
  a <- f -< x
  b <- g -< a
  returnA -< b

Becomes something like this:

arr id
  >>> f
  >>> arr id
  >>> g
  >>> arr id
  >>> returnA

Instead of this:

f >>> g

Moreover I don’t think there are any GHC rewrite rules that take advantage of the arrow laws to help account for this.

Wylie answered 23/7, 2017 at 1:27 Comment(3)
So, what to do? Could we write our own rewrite rules?Favianus
@AJFarmar: That seems like the logical next step. I suspect that rewrite rules won’t be able to address all of the issues, though, so I’d also suggest filing a GHC ticket, or posting a message to the Glasgow Haskell Users or Haskell Cafe lists, to gauge interest in a proposal for an improved desugaring. I guess the main reason it’s unoptimised is that there are relatively few people using arrow notation.Wylie
Thank you for your answer and suggestions, I will ask this on the mailing list as well and post back here.Requiescat
R
5

I have found arrowp-qq which wraps proc blocks inside quasiquotes and seems to produce better output than native desugarer. Performance is restored in the following version of our example:

{-# LANGUAGE QuasiQuotes #-}
...
import Control.Arrow.QuasiQuoter
...
sumArrQQ = [proc| x -> do sumArr -< x |]

One issue that I bumped into is that these quasiquotes don't play nicely with raw numbers inside the quotation.

sumArrQQ' = [proc| x -> do sumArr -< x + 2 |] -- gives an error

sumArrQQ'' = [proc| x -> do sumArr -< plus2 x |] -- compiles fine
    where plus2 = (+) 2
Requiescat answered 2/8, 2017 at 17:6 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.