OpenMP min reduction and std::min
Asked Answered
M

3

7

I am testing OpenMP min reduction. If I write my code like the following, it will return the correct result: res = 3.

#include <omp.h>
#include <iostream>
#include <algorithm>

int main(){

    omp_set_num_threads(5);

    float res=10;

#pragma omp parallel for default(shared) reduction(min:res)
    for(int i1 = 0; i1 <= 10; i1++)
        for(int i0 = 0; i0 <= 10; i0++)
            if(res > 3.0+i1+20*i0)
                res = 3.0+i1+20*i0;

    std::cout << "res = " << res << std::endl;

    return 0;
}

But If I write in an alternative way by replacing "if" statement with "std::min", then the result is wrong: res = 10.

#include <omp.h>
#include <iostream>
#include <algorithm>

int main(){

    omp_set_num_threads(5);

    float res=10;

#pragma omp parallel for default(shared) reduction(min:res)
    for(int i1 = 0; i1 <= 10; i1++)
        for(int i0 = 0; i0 <= 10; i0++)
            res = std::min(res,static_cast<float>(3.0+i1+20*i0));

    std::cout << "res = " << res << std::endl;

    return 0;
}

Is OpenMP min reduction interfering with std::min?

Malonylurea answered 8/9, 2016 at 14:30 Comment(7)
Interesting question. I can't test it right now. reduction(min: should work since OpenMP 3.1. Another interesting question is if it works with fmin. I think I saw an example of this in the manual. fmin and std::min are not the same thing despite what some people think.Transmitter
After reading a bit more the standard, it is explicitly mentioned that min should be implemented like this out=in<out?in:out; Maybe on your system std:min() doesn't abide to this...Centroid
The OpenMP 3.1 manual gives an example on page 266 using fmaxf so I think it works for fmin and fmax as well. I though std::min expanded to a ternary operator so I am not sure why it would fail. You would have to look at the macro.Transmitter
here I explained why std::min and fmin are not the same.Transmitter
std::min can be implemented in many different ways en.cppreference.com/w/cpp/algorithm/min. What you can do is create a custom Reduction operator with OpenMP4.0 which will work with std::min.Transmitter
I just tested your code and I don't see a problem. What result do you get with std::min? What compiler are you using? What are your compiler options?Transmitter
I am using icpc(16.0.0) on OSX. No other compiler options except '-openmp'. OpenMP version is 201307(4.0). If res is initialized to 100, res = 100. But if res is initialized to be greater than 205, res = 205. fmin outputs the correct result on my mac.Malonylurea
M
2

First of all, your code is conforming: it shouldn't matter which kind of code you have inside the parallel for.

What the reduction clause implies is that each thread will have its own private copy initialized to the neutral element of the min operator (i.e. the largest representable number in the reduction item type) and they will work with it until the end of the construct. At that point, these private copies will be reduced to the original list item using the reduction-identifier, which in your case is the min operator. So there is no race-condition here.

I have executed your code using the same version as you did and it worked fine: icpc (ICC) 16.0.0 and OpenMP version 201307. Could this issue be related to the C++ standard headers that you are using?

Maidy answered 14/9, 2016 at 15:32 Comment(0)
K
-1

I would swap the loops before drawing a conclusion. In similar context, I have found separate inner and outer loop reduction variables necessary. -std::min works well with icpc but not g++ or msvc

Kagoshima answered 8/9, 2016 at 15:3 Comment(0)
F
-1

The problem is you have a data race.

In your first example, the minimum value among all the threads is being calculated by the OpenMP runtime: Each thread gets its own res and the runtime determines the minimum value. The runtime makes sure res is read and written correctly by all threads.

In your second example, each thread is calling std::min to determine the minimum value between res and the other value. res is shared because of your default(shared) clause, so all the threads will be trying to use and update res concurrently, which is a data race.

You should stay with your first example. If you want to use std::min, you must prevent the data race using a lock or something similar.

Felon answered 8/9, 2016 at 18:46 Comment(3)
Thanks! But why res within the "if" statement is private while res used in std::min is shared? Here (computing.llnl.gov/tutorials/openMP/#REDUCTION) it indicates as long as res is a reduction variable, each thread will create its own private copy.Malonylurea
Oh yes, I didn't notice you were still using the reduction clause, redundantly, with the std::min inside the loop. In that case, res is private to each variable and their copies are never updated/written to the res declared before the parallel region, which is why res after the parallel region is 10, which is the value assigned before.Felon
But why res is updated in the first case? In the second case if I initialize res to 10000, the result would be 205, which corresponds to i0=1,i1=2...Malonylurea

© 2022 - 2024 — McMap. All rights reserved.