What are the key differences between the Repa 2 and 3 APIs?

Asked 25/5, 2012 at 0:34 Answered 27/5, 2012 at 11:22

Solved haskell image-processing parallel-processing repa data-parallel-haskell

To be more specific, I have the following innocuous-looking little Repa 3 program:

{-# LANGUAGE QuasiQuotes #-}

import Prelude hiding (map, zipWith)
import System.Environment (getArgs)
import Data.Word (Word8)
import Data.Array.Repa
import Data.Array.Repa.IO.DevIL
import Data.Array.Repa.Stencil
import Data.Array.Repa.Stencil.Dim2

main = do
  [s] <- getArgs
  img <- runIL $ readImage s

  let out = output x where RGB x = img
  runIL . writeImage "out.bmp" . Grey =<< computeP out

output img = map cast . blur . blur $ blur grey
  where
    grey              = traverse img to2D luminance
    cast n            = floor n :: Word8
    to2D (Z:.i:.j:._) = Z:.i:.j

---------------------------------------------------------------

luminance f (Z:.i:.j)   = 0.21*r + 0.71*g + 0.07*b :: Float
  where
    (r,g,b) = rgb (fromIntegral . f) i j

blur = map (/ 9) . convolve kernel
  where
    kernel = [stencil2| 1 1 1
                        1 1 1
                        1 1 1 |]

convolve = mapStencil2 BoundClamp

rgb f i j = (r,g,b)
  where
    r = f $ Z:.i:.j:.0
    g = f $ Z:.i:.j:.1
    b = f $ Z:.i:.j:.2

Which takes this much time to process a 640x420 image on my 2Ghz core 2 duo laptop:

real    2m32.572s
user    4m57.324s
sys     0m1.870s

I know something must be quite wrong, because I have gotten much better performance on much more complex algorithms using Repa 2. Under that API, the big improvement I found came from adding a call to 'force' before every array transform (which I understand to mean every call to map, convolve, traverse etc). I cannot quite make out the analogous thing to do in Repa 3 - in fact I thought the new manifestation type parameters are supposed to ensure there is no ambiguity about when an array needs to be forced? And how does the new monadic interface fit into this scheme? I have read the nice tutorial by Don S, but there are some key gaps between the Repa 2 and 3 APIs that are little discussed online AFAIK.

More simply, is there a minimally impactful way to fix the above program's efficiency?

Pavla answered 25/5, 2012 at 0:34 Comment(0)

The new representation type parameters don't automagically force when needed (it's probably a hard problem to do that well) - you still have to force manually. In Repa 3 this is done with the computeP function:

computeP
  :: (Monad m, Repr r2 e, Fill r1 r2 sh e)
  => Array r1 sh e -> m (Array r2 sh e)

I personally really don't understand why it's monadic, because you can just as well use Monad Identity:

import Control.Monad.Identity (runIdentity)
force
  :: (Repr r2 e, Fill r1 r2 sh e)
  => Array r1 sh e -> Array r2 sh e
force = runIdentity . computeP

So, now your output function can be rewritten with appropriate forcing:

output img = map cast . f . blur . f . blur . f . blur . f $ grey
  where ...

with an abbreviation f using a helper function u to aid type inference:

u :: Array U sh e -> Array U sh e
u = id
f = u . force

With these changes, the speedup is quite dramatic - which is to be expected, as without intermediate forcing each output pixel ends up evaluating much more than is necessary (the intermediate values aren't shared).

Your original code:

real    0m25.339s
user    1m35.354s
sys     0m1.760s

With forcing:

real    0m0.130s
user    0m0.320s
sys     0m0.028s

Tested with a 600x400 png, the output files were identical.

Alcohol answered 27/5, 2012 at 11:22 Comment(2)

This is a great answer! I had understood that computeP is the replacement for 'force', but hadn't thought to use it with the identity monad. I appreciate your help. – Pavla 27/5, 2012 at 17:5

I believe the reason for using monadic return types is because the idea of forcing something at is pretty tightly tied to the force-ings happening sequentially. There's a better explanation in cse.unsw.edu.au/~chak/papers/LCKP12.html – Capella 5/7, 2012 at 15:57

computeP is the new force.

In Repa 3 you need to use computeP everywhere you would have used force in Repa 2.

The Laplace example from repa-examples is similar to what you're doing. You should also use cmap instead of plain map in your blur function. There will be a paper explaining why on my homepage early next week.

Schreibman answered 27/5, 2012 at 11:15 Comment(1)

The great thing about the Haskell community - feedback from the library developers themselves :) I eagerly await your paper. – Pavla 27/5, 2012 at 17:10

Recommended topics

Hot tags