How is Loop skewing making loop parallelizable?
Asked Answered
P

1

10

I am reading about loop tranformation techniques and i have a very hard time understanding how does loop skewing makes a loop parallelizable Here are are two loops, the initial one and the seconde one, what is the difference betwwen the two ? The two of them depends on previous iteration on both i and j, what make the second loop parallisable ? Or why can we make the interchange on the second one and not the first one ? Both of them have dependencies on i and j

for(int i =2; i < 5; i++){
            for(int j =2; j < 5; j++){
                A[i][j] = A[i-1][j] + A[i][j-1];
            }
        }
for(int i =2; i < 5; i++){
            for(int j =2+i; j < 5+i; j++){
                A[i][j-i] = A[i-1][j-i] + A[i][j-1-i];
            }
        }
Plantaineater answered 16/4, 2014 at 0:1 Comment(0)
T
6

I have no credit for this, I just formatted it for you and copied it from another source, and I hope it help you

[source: ECE 1754, Survey of Loop Transformation Techniques, Eric LaForest, March 19, 2010]

It is all about the distance between a two executive iterations, in the first the distance is 1 between one outer loop and inner loop, so there is a dependency between them.

Loop skewing does exactly what it says: it skews the execution of an inner loop relative to an outer one. This is useful if the inner loop has a dependence on the outer loop which prevents it from running in parallel. For example, the following code has a dependency vector of {(1, 0),(0, 1)} .Neither loop can be parallelized since they each carry a dependency. Simply interchanging the loops would merely interchange the indices holding the dependencies, accomplishing nothing.

do i = 2, n-1
do j = 2, m-1
a[i,j] =
      (a[i-1,j] + a[i,j-1] + a[i+1,j] + a[i,j+1]) / 4
end do
end do

Loop skewing is implemented by adding the index of the outer loop, times some skewing factor f, to the bounds of the inner loop and subtracting the same value from all the uses of the inner loop index. The subtraction keeps the indices within the new loop bounds, preserving the correctness of the program. The effect on the inner loop iterations is to shift their position in the array forwards by f relative to the current outer loop, increasing the dependency distance to the outer loop in the same manner. In other words, given a dependency vector (a, b), skewing transforms it to (a, f a + b). Since this transformation preserves the lexicographic order of the dependencies, it is always legal. Applying a skew factor of one to the above inner loop yields the following code:

do i = 2, n-1
do j = 2+i, m-1+i
a[i,j-i] =
(a[i-1,j-i] + a[i,j-1-i] + a[i+1,j-i] + a[i,j+1-i]) / 4
end do
end do

This new code executes in the same manner, but with dependencies of {(1, 1),(0, 1)}. Both loops still carry a dependency. However, interchanging the loops at this point yields a dependence vector {(1, 0),(1, 1)}, as shown in the following code:

do j = 4, m+n-2
do i = max(2, j-m+1), min(n-1, j-2)
a[i,j-i] =
(a[i-1,j-i] + a[i,j-1-i] + a[i+1,j-i] + a[i,j+1-i]) / 4
end do
end do

The inner loop can now be parallelized since it has now no loop-carried dependency on j, and the dependency to i is carried by the outer loop.Note that interchanging skewed loop bounds is no longer straightforward: each loop must take into account the upper and lower bounds of the other loop.

Tanaka answered 22/9, 2014 at 21:52 Comment(1)
Thank you for your comment, I updated the answer with the link of the original source. I didn't want the asking person to be confused and extracted the just needed info for him.Tanaka

© 2022 - 2024 — McMap. All rights reserved.