OpenMP reduction with Eigen::VectorXd

I am attempting to parallelize the loop below with an OpenMP reduction:

#define EIGEN_DONT_PARALLELIZE
#include <iostream>
#include <cmath>
#include <string>
#include <eigen3/Eigen/Dense>
#include <eigen3/Eigen/Eigenvalues>
#include <omp.h>

using namespace Eigen;
using namespace std;

VectorXd integrand(double E)
{
    VectorXd answer(500000);
    double f = 5.*E + 32.*E*E*E*E;
    for (int j = 0; j != 50; j++)
        answer[j] = j*f;
    return answer;
}

int main()
{
    omp_set_num_threads(4);
    double start = 0.;
    double end = 1.;
    int n = 100;
    double h = (end - start)/(2.*n);

    VectorXd result(500000);
    result.fill(0.);
    double E = start;
    result = integrand(E);
    #pragma omp parallel
    { 
    #pragma omp for nowait
    for (int j = 1; j <= n; j++){
        E = start + (2*j - 1.)*h;
        result = result + 4.*integrand(E);
        if (j != n){
            E = start + 2*j*h;
            result = result + 2.*integrand(E);
        }
    }
    }
    for (int i = 0; i < 50; ++i)
        cout<< i+1 << " , "<< result[i] << endl;

    return 0;
}

This is definitely faster in parallel than in serial, but with all 4 threads the results are hugely variable. When the number of threads is set to 1, the output is correct. I would be most grateful if someone could assist me with this...

I am using the Clang compiler with these compile flags:

clang++-3.8 energy_integration.cpp -fopenmp=libiomp5

If this is a bust, then I'll have to learn to use Boost::thread or std::thread...

Indictment asked 8/11/2016 at 19:48. Comments (6):
Deliberate: Add firstprivate(params) reduction(+:result_int) to your parallel directive, remove the critical and try again...
Indictment: @Deliberate thank you for your response. I have edited my code so that the first #pragma statement reads #pragma omp parallel firstprivate(params) reduction(+:result_int), the second #pragma statement remains as is, and all subsequent #pragma statements are removed. The program then yields the runtime error: ....const Eigen::Matrix<double, -1, 1, 0, -1, 1> >]: Assertion `aLhs.rows() == aRhs.rows() && aLhs.cols() == aRhs.cols()' failed. Aborted - I can assure that both kspace and result_int have the same number of elements and dimensionality.
Ship: Can you round out your example to a full minimal reproducible example? Also, does the serial version work as expected?
Indictment: @AviGinsburg thank you, please see above, now edited. Yes, the serial version works as expected.
Ship: This is not Complete. There is no definition of integrand(...), nor is your code compilable. You also have not answered whether the serial version returns the correct result.
Indictment: @AviGinsburg this is now amended. The code above is a very stripped-down version of what I am trying to accomplish, but I believe the answer will help my purposes.

Your code does not define a custom reduction for OpenMP to reduce the Eigen objects. I'm not sure whether Clang supports user-defined reductions (see the OpenMP 4 spec, page 180). If it does, you can declare a reduction and add reduction(+:result) to the #pragma omp for line. If not, you can do it yourself by changing your code as follows:

VectorXd result(500000); // This is the final result, not used by the threads
result.fill(0.);
double E = start;
result = integrand(E);
#pragma omp parallel
{
    // This is a private copy per thread. This resolves race conditions between threads
    VectorXd resultPrivate(500000);
    resultPrivate.fill(0.);
#pragma omp for nowait // reduction(+:result) // Assuming user-defined reductions aren't allowed
    for (int j = 1; j <= n; j++) {
        double E = start + (2 * j - 1.)*h; // E is local, so each thread works on its own copy
        resultPrivate = resultPrivate + 4.*integrand(E);
        if (j != n) {
            E = start + 2 * j*h;
            resultPrivate = resultPrivate + 2.*integrand(E);
        }
    }
#pragma omp critical
    {
        // Here we sum the results of each thread one at a time
        result += resultPrivate;
    }
}

The error you're getting (in your comment) seems to be due to a size mismatch. While there isn't a trivial one in your code itself, don't forget that when OpenMP starts each thread, it has to initialize a private VectorXd for that thread. If no initializer is supplied, the default is VectorXd() (with a size of zero). When this object is then used, the size mismatch occurs. A "correct" usage of omp declare reduction would include the initializer part:

#pragma omp declare reduction (+: VectorXd: omp_out=omp_out+omp_in)\
     initializer(omp_priv=VectorXd::Zero(omp_orig.size()))

omp_priv is the name of the private variable. It gets initialized by VectorXd::Zero(...). The size is specified using omp_orig. The standard (page 182, lines 25-27) defines this as:

The special identifier omp_orig can also appear in the initializer-clause and it will refer to the storage of the original variable to be reduced.

In our case (see full example below), this is result. So result.size() is 500000 and the private variable is initialized to the correct size.

#include <iostream>
#include <string>
#include <Eigen/Core>
#include <omp.h>

using namespace Eigen;
using namespace std;

VectorXd integrand(double E)
{
    VectorXd answer(500000);
    double f = 5.*E + 32.*E*E*E*E;
    for (int j = 0; j != 50; j++)   answer[j] = j*f;
    return answer;
}

#pragma omp declare reduction (+: Eigen::VectorXd: omp_out=omp_out+omp_in)\
     initializer(omp_priv=VectorXd::Zero(omp_orig.size()))

int main()
{
    omp_set_num_threads(4);
    double start = 0.;
    double end = 1.;
    int n = 100;
    double h = (end - start) / (2.*n);

    VectorXd result(500000);
    result.fill(0.);
    double E = start;
    result = integrand(E);

#pragma omp parallel for reduction(+:result)
    for (int j = 1; j <= n; j++) {
        double E = start + (2 * j - 1.)*h; // local to the loop body, so private to each thread
        result += (4.*integrand(E)).eval();
        if (j != n) {
            E = start + 2 * j*h;
            result += (2.*integrand(E)).eval();
        }
    }
    for (int i = 0; i < 50; ++i)
        cout << i + 1 << " , " << result[i] << endl;

    return 0;
}
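
As a side note, omp declare reduction requires a compiler with OpenMP 4.0 support. If you're unsure which version your toolchain enables, a minimal sketch that prints the predefined _OPENMP macro (it expands to the release date of the supported specification: 201107 for OpenMP 3.1, 201307 for OpenMP 4.0) can confirm this:

#include <iostream>

int main()
{
#ifdef _OPENMP
    // _OPENMP is the yyyymm date of the OpenMP specification the compiler supports,
    // e.g. 201107 for OpenMP 3.1 and 201307 for OpenMP 4.0.
    std::cout << "_OPENMP = " << _OPENMP << std::endl;
#else
    std::cout << "OpenMP is not enabled" << std::endl;
#endif
    return 0;
}

Built with OpenMP enabled (e.g. -fopenmp), a value of 201307 or higher indicates that OpenMP 4.0 features such as user-defined reductions should be available.
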
Ship answered 14/11/2016 at 20:40. Comments (8):
Indictment: Excellent, thanks. This gets me a 2.17x speed increase with 4 threads going. The user-defined reduction didn't work for me, but the run-time error makes me wonder whether it is to do with Eigen rather than Clang. EDIT: just tried this with g++, which doesn't even compile: ‘result’ has invalid type for ‘reduction’.
Ship: What runtime error occurred? With what code? It may also be that Eigen types don't play nicely with omp, I haven't tried.
Indictment: I think you're right about Eigen and omp. I have tested this on the code I posted with your user-defined reduction. The runtime error is the same even if only one thread is set; excerpt shown as the output is too long: [BinaryOp = Eigen::internal::scalar_sum_op<double>, Lhs = const Eigen::Matrix<double, -1, 1, 0, -1, 1>, Rhs = const Eigen::CwiseUnaryOp<Eigen::internal::scalar_multiple_op<double>, const Eigen::Matrix<double, -1, 1, 0, -1, 1> >]: Assertion `aLhs.rows() == aRhs.rows() && aLhs.cols() == aRhs.cols()' failed. Aborted. I hasten to add, the alternative method you outline works a treat.
Indictment: Thanks for the addendum, I'll do some further reading on OpenMP from the manual you link. I have tried the code you posted with omp declare reduction but I get a compile error. Clang yields error: expected an OpenMP directive #pragma omp declare reduction(+: VectorXd: omp_out=omp_out+omp_in)\ and g++ yields In function ‘int main()’: energy_integral_test.cpp:36:45: error: ‘result’ has invalid type for ‘reduction’ #pragma omp parallel for reduction(+:result). You have found a solution that works for me though.
Ship: Did you add the initializer part? The '\' was just to keep the line from being too long.
Indictment: #pragma omp declare reduction (+: Eigen::VectorXd: omp_out=omp_out+omp_in) initializer(omp_priv=VectorXd::Zero(omp_orig.size())) yields the same error (this is all as one line).
Ship: That looks like it's from an old version of gcc (< 4.9), before OpenMP 4.0 was supported. It works for me with gcc 5.x both on Ubuntu and Windows (MinGW).
Indictment: Yes, I'm using gcc 4.8. Clang only supports OpenMP 3.1 at this stage, but I believe OpenMP 4.0 and OpenMP 4.5 are being implemented in future builds.
