My question concerns the use of OpenMP in C++ functions stored in dynamic libraries. Let's consider the following code (in shared.cpp):
#include "omp.h"
#include <iostream>
extern "C" {
int test() {
int N = omp_get_max_threads();
#pragma omp parallel num_threads(N)
{
std::cout << omp_get_thread_num() << std::endl;
}
return 0;
}
};
I compile this code using g++: g++ -fopenmp -shared -fPIC -o shared.so shared.cpp. Then, to use the test function, I have the following program (main.cpp):
#include <iostream>
#include <dlfcn.h>

int main() {
    void* handle = dlopen("./shared.so", RTLD_NOW);
    if (!handle) {
        std::cerr << "can not open shared.so" << std::endl;
        return 1;
    }
    int (*f)() = (int (*)()) dlsym(handle, "test");
    if (!f) {
        std::cerr << "can not find 'test' symbol in shared.so" << std::endl;
        return 1;
    }
    (*f)();
    if (dlclose(handle)) {
        std::cerr << "can not close shared.so" << std::endl;
        return 1;
    }
    return 0;
}
It is compiled with the command g++ -o main main.cpp -ldl. The problem is that a segmentation fault occurs at the very end of the program's execution. According to valgrind, some threads are still active at that point, which seems consistent with OpenMP's behavior.
One solution (for C code) from this post is to compile the main program with gcc's -fopenmp flag, but g++ seems smart enough to detect that OpenMP is never used in that program, so the OpenMP environment is never loaded (the assembly code of both versions is identical). The only workaround I've found is to make a useless call to OpenMP in the main program (sketched below), which forces g++ to load the OpenMP environment; execution is then correct. But this workaround feels quite ugly to me. I've tried g++-4.8.2, g++-4.8.1, g++-4.7.3 and g++-4.6.4. (With icc-14, adding the -openmp option to the main program actually fixes the problem.)
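For completeness, the workaround looks like the following: main.cpp gets an otherwise useless OpenMP call (for illustration I use omp_get_max_threads() here, but presumably any call that pulls in the OpenMP runtime would do) and is compiled with g++ -fopenmp -o main main.cpp -ldl. With this, the program runs and exits cleanly:

#include <iostream>
#include <dlfcn.h>
#include <omp.h>  // only needed for the dummy call below

int main() {
    // Otherwise useless call: together with -fopenmp it forces g++ to link
    // against the OpenMP runtime and initialize it in the main executable.
    volatile int dummy = omp_get_max_threads();
    (void) dummy;

    void* handle = dlopen("./shared.so", RTLD_NOW);
    if (!handle) {
        std::cerr << "can not open shared.so" << std::endl;
        return 1;
    }
    int (*f)() = (int (*)()) dlsym(handle, "test");
    if (!f) {
        std::cerr << "can not find 'test' symbol in shared.so" << std::endl;
        return 1;
    }
    (*f)();
    if (dlclose(handle)) {
        std::cerr << "can not close shared.so" << std::endl;
        return 1;
    }
    return 0;
}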
Has anyone ever faced this problem? Is there a cleaner workaround? Thanks, Thomas
Edit: tried with g++-4.9.2, still failing.