How do you compute a[i] = f(a[i-1]) in Repa?
Asked Answered
S

2

6

Is it possible to compute an array which depends on the past value(s) (i.e., lesser indexes), in Repa? Initial part(s) of the array (e.g., a[0]) is given. (Note that I am using C-like notation to indicate an element of array; please don't confuse.)

I read the tutorial and quickly check the hackage but I could not find a function to do it.

(I guess doing this kind of computation in 1D array does not make sence in Repa because you can't parallelize it. But I think you can parallelize it in 2 or more dimensional case.)

EDIT: Probably I should be more specific about what kind of f I want to use. As there is no way to parallelize in the case a[i] is a scalar, let's focus on the case a[i] is a N dim vector. I don't need a[i] to be higher dimensional (such as matrix) because you can "unroll" it to a vector. So, f is a function which maps R^N to R^N.

Most of the case, it's like this:

b = M a[i-1]
a[i][j] = g(b)[j]

where b is a N dim vector, M is a N by N matrix (no assumption for sparseness), and g is some nonlinear function. And I want to compute it for i=1,..N-1 given a[0], g and M. My hope is that there are some generic way to (1) parallelize this type of calculation and (2) make allocation of intermediate variables such as b efficient (in C-like language, you can just reuse it, it would be nice if Repa or similar library can do it like a magic without breaking purity).

Snowy answered 27/6, 2012 at 14:36 Comment(7)
For associative f, it can be parallelized and it is called a "scan". en.wikipedia.org/wiki/Prefix_sum I couldn't find scan in the Repa documentation, though.Catholicity
You may be able to do it with a repa stencil, hackage.haskell.org/packages/archive/repa/2.0.2.1/doc/html/… . But see also #6170508Obelia
@Catholicity Wouldn't a scan require a series as input? To me, this looks more like an unfold.Deedradeeds
Oh, it's not a scan. I misread the equation as a[i] = f(a[i], a[i-1]). Actually, it's more like take n $ iterate f z.Catholicity
@Catholicity yes, that is exactly what I meant. Do you think is it possible to do that computation in Repa, especially when f is multi-dimensional? Also, I am afraid that using infinite list and take generate some kind of overhead comparing C-like language where you can allocate memory before the computation. I am hoping that Repa reduce such kind of overhead comparing to bare Haskell list.Snowy
@phg What I had in mind when I wrote the equation above is dynamical systems. Reading the link you gave me now I know that this type of computation is called anamorphism in category theory. It's nice to know similar concept exists in different field. Thanks.Snowy
I just found Data.Array.Repa.Algorithms.Iterate. It does not record intermediate results, but it's close. Probably I can tweak this function to get what I want.Snowy
K
3

I cannot see a Repa way of doing this. But there is for Vector: Data.Vector.iterateN builds the vector you want. Then Data.Array.Repa.fromUnboxed to convert it from Vector to Repa.

iterateN :: Int -> (a -> a) -> a -> Vector aSource

O(n) Apply function n times to value. Zeroth element is original value.

Komatik answered 28/6, 2012 at 10:14 Comment(9)
Thanks. This should be enough when doing 1D calculation. Probably I don't need to care about Repa. I just want something comparable to (non-optimized) C in terms of speed in Haskell. Can this be multidimensional? (Can a be another vector?) And if so, can it be parallelize?Snowy
Well, it depends what sort of multidimensional generalization you're asking about. In the 1D case, it's obviously impossible to parallelize, since the function itself is pretty inherently sequential. Can you be more specific about what the multidimensional generalization you're asking about would look like?Ilium
Sure. Let's say a[i] is a vector (so a is a 2D array) and I want compute a[i] = M a[i-1] given an initial value a[0] and a matrix M, up to i=N-1. Using the notation I used for the title, its f(x) = M x where f is vectorized now. You can think of any kind of f, but basically it uses only the value of one step before.Snowy
Well, if you're in the lucky case of f being linear, there's several options for optimizing matrix multiplication, which I'd try first. But that depends on the form of M. Since I don't assume that your M is sparse (or is it?), you could look for something like this algorithm suitable for your kind of matrix. In general, though, every a[i][j] can depend on whole a[i-1], so I guess there's little you can parallelize for arbitrary matrices.Deedradeeds
You are right, I was not assuming sparse M. I should have clarified. I thought Repa is able to parallelize any kind of vector to vector computation, not only the linear one. Am I wrong? I know you cannot parallelize computation across steps, but I thought it might be possible to parallelize the computation within the step (e.g., vector to vector computation).Snowy
@Snowy I just found out that Repa has a parallel matrix multiplication function. I think with that you could do something like mi = unfoldr (\mi -> Just (mi, mmultP mi m)) m (which is lazy) and then get the a[i] by as = zipWith mult mi (repeat a0). Is that what you want? The whole thing is created once, and the intermediate steps are calculated on request using a parallelized operation.Deedradeeds
Oh, and mmult is monadic, but certainly there's a workaround for that. Also, I assumed the vector/matrix multiplication mult to be provided somewhere. Probably that could be parallelized, too.Deedradeeds
@phg Doesn't calculating series of matrices make this slower? Because it's O(N^3 L) but you can do it with O(N^2 L) if you do vector to matrix multiplication (N: size of vector; L: length of sequence). Also, I want a function which can apply general f. Probably the linear example was not good. I wanted say that a[i][j] depends on whole (or subset of) a[i-1], not just a[i-1][j], for example. But mmultP is worth knowing, because f is typically multiplication with matrix and some complicated nonlinear function on the each element of the resulting vector. Thanks for that.Snowy
@phg (and others) I updated the question because I notice I need to be more specific about the calculation I want.Snowy
K
1

Edit: Actually, I think I misinterpreted the question. I'll leave my answer here, in case it's useful for someone else...

You can use traverse http://hackage.haskell.org/packages/archive/repa/3.2.1.1/doc/html/Data-Array-Repa.html#v:traverse:

Prelude Data.Array.Repa R> let x = fromListUnboxed (Z :. 10 :: DIM1) [1..10]
Prelude Data.Array.Repa R> R.computeUnboxedS $ R.traverse x (\ (Z :. i) -> (Z :. (i - 1))) (\f  (Z :. i) -> f (Z :. (i + 1)) - f (Z :. i))
AUnboxed (Z :. 9) (fromList [1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0])

Dissecting it:

    R.computeUnboxedS $                            -- force the vector to be "real"
    R.traverse x                                   -- traverse the vector
    (\ (Z :. i) -> (Z :. (i - 1)))                 -- function to get the shape of the result
    (\f (Z :. i) -> f (Z :. (i + 1)) - f (Z :. i)) -- actual "stencil"

Extending it to multi-dimensional array should be trivial.

Kathrynkathryne answered 27/6, 2012 at 21:5 Comment(3)
You are calculating the result array based only on the source array. What I want to calculate is an array whose element depends on its one step earlier one. You might think the i in the title as the size of array. I meant the index.Snowy
What about using a combination of Data.Vector.Unboxed.scanl1' and Repa.toUnboxed? For example, cumsum would be: R.fromUnboxed (R.extent y) $ U.scanl1' (+) $ (R.toUnboxed y). [Ugly as hell, and probably not extendible to higher dimensions.]Kathrynkathryne
cumsum is still calculation from source array to target array. It depends on the earlier part of the source array, not the target. As Heatsink and phg mentioned in the above comments, unfold is what I need. I'd like to know how to do unfold in Repa (or any other array-oriented library in Haskell).Snowy

© 2022 - 2024 — McMap. All rights reserved.