I am attempting to implement a cumulative sum function using Repa in order to calculate integral images. My current implementation looks like the following:
cumsum :: (Elt a, Num a) => Array DIM2 a -> Array DIM2 a
cumsum array = traverse array id cumsum'
where
elementSlice inner outer = slice array (inner :. (0 :: Int))
cumsum' f (inner :. outer) = Repa.sumAll $ elementSlice inner outer
The problem is in the elementSlice function. In matlab or say numpy this could be specified as array[inner,0:outer]. So what I am looking for is something along the lines of:
slice array (inner :. (Range 0 outer))
However, it seems that slices are not allowed to be specified over ranges currently in Repa. I considered using partitions as discussed in Efficient Parallel Stencil Convolution in Haskell, but this seems like a rather heavyweight approach if they would be changed every iteration. I also considered masking the slice (multiplying by a binary vector) - but this again seemed like it would perform very poorly on large matrices since I would be allocating a mask vector for every point in the matrix...
My question - does anyone know if there are plans to add support for slicing over a range to Repa? Or is there a performant way I can already attack this problem, maybe with a different approach?