I am attempting to parallelize the loop below with an OpenMP reduction:
#define EIGEN_DONT_PARALLELIZE
#include <iostream>
#include <cmath>
#include <string>
#include <eigen3/Eigen/Dense>
#include <eigen3/Eigen/Eigenvalues>
#include <omp.h>
using namespace Eigen;
using namespace std;

VectorXd integrand(double E)
{
    VectorXd answer(500000);
    double f = 5.*E + 32.*E*E*E*E;
    for (int j = 0; j != 50; j++)
        answer[j] = j*f;
    return answer;
}

int main()
{
    omp_set_num_threads(4);
    double start = 0.;
    double end = 1.;
    int n = 100;
    double h = (end - start)/(2.*n);
    VectorXd result(500000);
    result.fill(0.);
    double E = start;
    result = integrand(E);
    #pragma omp parallel
    {
        #pragma omp for nowait
        for (int j = 1; j <= n; j++){
            E = start + (2*j - 1.)*h;
            result = result + 4.*integrand(E);
            if (j != n){
                E = start + 2*j*h;
                result = result + 2.*integrand(E);
            }
        }
    }
    for (int i = 0; i < 50; ++i)
        cout << i+1 << " , " << result[i] << endl;
    return 0;
}
The parallel version is definitely faster than the serial one, but with all 4 threads the results vary hugely from run to run. When the number of threads is set to 1, the output is correct. I would be most grateful if someone could help me with this.
I am using the clang compiler with the following compile flags:
clang++-3.8 energy_integration.cpp -fopenmp=libiomp5
If this is a bust, then I'll have to learn to use Boost::thread or std::thread...
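A note on why the output changes with the thread count: in the code above, both result and E are declared outside the parallel region, so all four threads read and write the same two objects without any synchronization. The following is only a minimal sketch of one way around that, not the definitive fix. It assumes std::vector<double> as a stand-in for Eigen::VectorXd so that it builds with the standard library alone, and the names integrand_sketch and simpson_sketch are hypothetical. It also adds the end-point term and the h/3 factor, which the posted code does not show, so that the returned values are the actual Simpson estimate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for integrand(): entry j is j * f(E),
// with f(E) = 5E + 32E^4 as in the question.
std::vector<double> integrand_sketch(double E, std::size_t size)
{
    std::vector<double> answer(size, 0.0);
    const double f = 5.0 * E + 32.0 * E * E * E * E;
    for (std::size_t j = 0; j < size; ++j)
        answer[j] = static_cast<double>(j) * f;
    return answer;
}

// Composite Simpson's rule on [start, end] with 2n subintervals.
std::vector<double> simpson_sketch(double start, double end, int n, std::size_t size)
{
    assert(n > 0);
    const double h = (end - start) / (2.0 * n);

    // End-point terms f(start) + f(end), added once, outside the parallel loop.
    std::vector<double> result = integrand_sketch(start, size);
    const std::vector<double> last = integrand_sketch(end, size);
    for (std::size_t i = 0; i < size; ++i)
        result[i] += last[i];

    #pragma omp parallel
    {
        // Declared inside the parallel region, so each thread has its own copy.
        std::vector<double> local(size, 0.0);

        #pragma omp for nowait
        for (int j = 1; j <= n; ++j) {
            // E is declared inside the loop body, so it is private as well.
            const double E_odd = start + (2 * j - 1.0) * h;
            const std::vector<double> odd = integrand_sketch(E_odd, size);
            for (std::size_t i = 0; i < size; ++i)
                local[i] += 4.0 * odd[i];

            if (j != n) {
                const double E_even = start + 2.0 * j * h;
                const std::vector<double> even = integrand_sketch(E_even, size);
                for (std::size_t i = 0; i < size; ++i)
                    local[i] += 2.0 * even[i];
            }
        }

        // Only one thread at a time merges its partial sum into the shared result.
        #pragma omp critical
        {
            for (std::size_t i = 0; i < size; ++i)
                result[i] += local[i];
        }
    }

    for (double& r : result)
        r *= h / 3.0;
    return result;
}
```

Calling simpson_sketch(0.0, 1.0, 100, 50) would correspond to the first 50 entries printed in the question. Because the order in which threads merge their partial sums is not fixed, the last few digits may still differ between runs, but only at the level of floating-point rounding.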
... firstprivate(params) reduction(+:result_int) to your parallel directive, remove the critical and try again... – Deliberate
... #pragma statement reads #pragma omp parallel firstprivate(params) reduction(+:result_int), the second #pragma statement remains as is, and all subsequent #pragma statements are removed. The program then yields the runtime error: ....const Eigen::Matrix<double, -1, 1, 0, -1, 1> >]: Assertion `aLhs.rows() == aRhs.rows() && aLhs.cols() == aRhs.cols()' failed. Aborted - I can assure you that both kspace and result_int have the same number of elements and dimensionality – Indictment
... integrand(...) nor is your code compilable. You also have not answered if the serial version returns the correct result. – Ship
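On the size-mismatch assertion mentioned in the comments: one possible explanation, which cannot be verified from the excerpts shown here, is that the thread-private copies created by the reduction clause did not have the same length as the original vector. OpenMP lets you declare a reduction for a class type and give it an initializer that sizes each private copy from the original. The sketch below illustrates that mechanism with std::vector<double> standing in for the Eigen vector; add_into, vec_plus and sum_terms are hypothetical names, and the loop body is a placeholder rather than the integrand from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise in-place sum. The assert fails on a size mismatch, which is
// the same kind of check as the Eigen assertion quoted in the comments.
inline void add_into(std::vector<double>& out, const std::vector<double>& in)
{
    assert(out.size() == in.size());
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] += in[i];
}

// User-declared reduction (the name vec_plus is hypothetical). The initializer
// gives every private copy the length of the original vector, filled with
// zeros, so the combiner never sees operands of different sizes.
#pragma omp declare reduction(vec_plus : std::vector<double> : add_into(omp_out, omp_in)) \
    initializer(omp_priv = std::vector<double>(omp_orig.size(), 0.0))

// Placeholder workload: adds a vector filled with the value j for j = 1..n.
std::vector<double> sum_terms(int n, std::size_t size)
{
    std::vector<double> total(size, 0.0);

    #pragma omp parallel for reduction(vec_plus : total)
    for (int j = 1; j <= n; ++j) {
        const std::vector<double> term(size, static_cast<double>(j));
        add_into(total, term);  // inside the loop, total is the private copy
    }
    return total;
}
```

With n = 100 every entry of the returned vector should be 5050, regardless of the number of threads. The same pattern should carry over to an Eigen vector by swapping the type and using a zero-filled vector of matching size in the initializer, but that variant is untested here.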