Parallelize multiple nested loops with TBB

What is the best way to parallelize three nested, independent loops with TBB?

for(int i=0; i<100; i++){
    for(int j=0; j<100; j++){
        for(int k=0; k<100; k++){
            printf("Hello World \n");
        }
    }
}
Tical answered 18/4, 2015 at 16:18 Comment(2)
They're not independent if they're nested. - Karwan
Sorry, I mean they don't have any dependencies between iterations, so they can be parallelized. - Tical

There are basically two ways to handle nested loops in TBB.

  1. Since TBB is designed to support nested parallelism, just write nested parallel_for calls:

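    // Needs <tbb/parallel_for.h> and <cstdio>.
    // The inner lambdas capture by value ([=]) so that i and j are visible inside them.
    tbb::parallel_for(0, 100, [](int i){
        tbb::parallel_for(0, 100, [=](int j){
            tbb::parallel_for(0, 100, [=](int k){
                printf("Hello World %d/%d/%d\n", i, j, k);
            });
        });
    });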
    

    This variant works well when the loops belong to different modules and/or libraries.

  2. Otherwise, collapse two or three nested loops into a single parallel_for using blocked_range2d or blocked_range3d. When the loops access arrays, this can also improve cache locality and thus performance, even on a single thread:

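    // Needs <tbb/parallel_for.h>, <tbb/blocked_range3d.h> and <cstdio>.
    // matrix3d is assumed to be an int array of at least 100x100x100 defined elsewhere.
    tbb::parallel_for( tbb::blocked_range3d<int>(0, 100, 0, 100, 0, 100),
        [&]( const tbb::blocked_range3d<int> &r ) {
            // Each call receives one 3D block; iterate over it serially.
            for(int i=r.pages().begin(), i_end=r.pages().end(); i<i_end; i++){
                for(int j=r.rows().begin(), j_end=r.rows().end(); j<j_end; j++){
                    for(int k=r.cols().begin(), k_end=r.cols().end(); k<k_end; k++){
                        printf("Hello World %d\n", matrix3d[i][j][k]);
                    }
                }
            }
        });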
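
To make the idea of "collapsing" more concrete, here is a minimal sketch of the index arithmetic involved, using only the standard library. It flattens the three loops into one index range, cuts that range into contiguous chunks, and recovers (i, j, k) from each flat index. The helper names `unflatten`, `chunk_bounds` and `visit_all` are made up for this illustration, and the even chunking is an assumption: it is not how TBB partitions a blocked_range3d, which splits the range recursively along its dimensions. The chunks are processed sequentially here, but each one is independent and could be handed to a separate worker.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: map a flat index in [0, nx*ny*nz) back to (i, j, k),
// with k varying fastest (row-major order).
std::array<std::size_t, 3> unflatten(std::size_t idx, std::size_t ny, std::size_t nz) {
    return { idx / (ny * nz), (idx / nz) % ny, idx % nz };
}

// Hypothetical helper: half-open bounds [begin, end) of chunk c when [0, total)
// is cut into `chunks` contiguous pieces of nearly equal size.
std::pair<std::size_t, std::size_t> chunk_bounds(std::size_t total, std::size_t chunks,
                                                 std::size_t c) {
    assert(chunks > 0 && c < chunks);
    return { total * c / chunks, total * (c + 1) / chunks };
}

// Visit every (i, j, k) chunk by chunk and count how often each cell is touched.
// The chunks run one after another here; they share no state except the
// disjoint cells they write, so each could be run by a different worker.
std::vector<int> visit_all(std::size_t nx, std::size_t ny, std::size_t nz,
                           std::size_t chunks) {
    std::vector<int> visits(nx * ny * nz, 0);
    for (std::size_t c = 0; c < chunks; ++c) {
        const std::pair<std::size_t, std::size_t> b = chunk_bounds(visits.size(), chunks, c);
        for (std::size_t idx = b.first; idx < b.second; ++idx) {
            const std::array<std::size_t, 3> ijk = unflatten(idx, ny, nz);
            ++visits[(ijk[0] * ny + ijk[1]) * nz + ijk[2]];
        }
    }
    return visits;
}
```

Calling `visit_all(100, 100, 100, 8)` should return a vector in which every cell has been counted exactly once, which is the property any collapsed loop has to preserve. Contiguous flat chunks are the simplest choice; the 3D blocks produced by blocked_range3d are what give the cache-locality benefit mentioned above.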
    
Lermontov answered 19/4, 2015 at 8:36 Comment(2)
Thank you! I ran both of them and it's true that the second one has better performance. - Tical
Compilation fix: use r.pages(), r.rows(), r.cols(). - Insane