Helgrind (Valgrind) and OpenMP (C): avoiding false positives?

The documentation for Helgrind, the Valgrind thread error detection tool, warns that if you compile your OpenMP code with GCC, GCC's OpenMP runtime library (libgomp.so) will cause a flood of false positive data race reports, because it uses atomic machine instructions and Linux futex system calls instead of POSIX pthreads primitives. According to the documentation, you can solve this problem by rebuilding GCC with the --disable-linux-futex configuration option.

So I tried this. I built GCC 4.7.0 (the latest release as of this writing) with the --disable-linux-futex configuration option and installed it to a local directory (~/GCC_Valgrind/gcc_install). I then wrote a small OpenMP test program (test1.c) that has no visible data races:

/* test1.c */

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 2

int a[NUM_THREADS];

int main(void) {
        int i;
#pragma omp parallel num_threads(NUM_THREADS)
        {
                int tid = omp_get_thread_num();
                a[tid] = tid + 1;
        }
        for (i = 0; i < NUM_THREADS; i++)
                printf("%d ", a[i]);
        printf("\n");
        return EXIT_SUCCESS;
}

I compiled this program as follows:

~/GCC_Valgrind/gcc_install/bin/gcc -Wall -fopenmp  -static -L~/GCC_Valgrind/gcc_install/lib64 -L~/GCC_Valgrind/gcc_install/lib -o test1 test1.c

However, when I ran Helgrind on the resulting binary, I got 30 false positive data race reports, all occurring in libgomp code. I then compiled test1.c without the -static flag and ran Helgrind on it again. This time I got only 9 false positive data race reports, but that is still too many, and without the -static flag I cannot trace the supposed races in the libgomp code.

Has anybody found a way to reduce, if not eliminate, the false positive data race reports that Helgrind produces for an OpenMP program compiled with GCC? Thanks!

Pesthole answered 17/5, 2012 at 19:5 Comment(7)
Just a wild guess - could it be that your recompiled gcc links against the recompiled version of libgomp but the dynamic linker still loads the system supplied libgomp at runtime? Try to recompile with -Wl,-rpath,/path/to/recompiled/lib. - Barbey
Just a side comment - give a try to the Thread Analyzer tool from Oracle Solaris Studio for Linux while the toolset is still free :) - Barbey
Have you looked at adding error suppressions? valgrind.org/docs/manual/manual-core.html#manual-core.suppress - Slapdash
Just to make sure, could you mark tid as private? - Dogvane
@Dogvane How do you do that? - Tombac
@undefinedbehaviour I now realize that tid is supposed to be private, since it is declared inside the parallel section. Anyway, the syntax would have been #pragma omp parallel private(tid). - Dogvane
From the website you mentioned: "Fortunately, this can be solved using a configuration-time option (for GCC). Rebuild GCC from source, and configure using --disable-linux-futex. This makes libgomp.so use the standard POSIX threading primitives instead. Note that this was tested using GCC 4.2.3 and has not been re-tested using more recent GCC versions. We would appreciate hearing about any successes or failures with more recent versions." Did you report your problem there? - Cleveland
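
The suppression approach suggested in the comments could be sketched roughly as follows. This is only an illustration, not a tested fix: the file name libgomp.supp and the suppression name libgomp_race_sketch are made up for this example, and the obj: pattern assumes the dynamically linked case, where the racing frames belong to a libgomp.so shared object (with -static the frames are part of the executable itself, so this pattern would not match). A suppression this broad would also hide any genuine race whose innermost frame is in libgomp.

```shell
#!/bin/sh
# Write a hypothetical suppression file that hides Helgrind race reports
# whose innermost stack frame lies in a libgomp shared object.
cat > libgomp.supp <<'EOF'
{
   libgomp_race_sketch
   Helgrind:Race
   obj:*/libgomp.so*
}
EOF

# Run Helgrind with the suppression file, but only if both valgrind
# and the test binary from the question are actually present.
if command -v valgrind >/dev/null 2>&1 && [ -x ./test1 ]; then
    valgrind --tool=helgrind --suppressions=libgomp.supp ./test1
else
    echo "valgrind or ./test1 not found; wrote libgomp.supp only"
fi
```

Running valgrind with --gen-suppressions=all prints ready-made suppression entries for each report, which can be a more precise starting point than a blanket obj: pattern.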

Sorry to put this in as an answer since it's more of a comment, but it's too long to fit in as a comment, so here goes:

From the site you referenced:

Runtime support library for GNU OpenMP (part of GCC), at least for GCC versions 4.2 and 4.3. The GNU OpenMP runtime library (libgomp.so) constructs its own synchronisation primitives using combinations of atomic memory instructions and the futex syscall, which causes total chaos in Helgrind since it cannot "see" those.

Fortunately, this can be solved using a configuration-time option (for GCC). Rebuild GCC from source, and configure using --disable-linux-futex. This makes libgomp.so use the standard POSIX threading primitives instead. Note that this was tested using GCC 4.2.3 and has not been re-tested using more recent GCC versions. We would appreciate hearing about any successes or failures with more recent versions.

As you mentioned in your post, this has to do with libgomp.so, but that is a shared object, so I don't see how you can pass the -static flag and still use that library. Am I just misinformed?

Cleveland answered 31/5, 2013 at 18:10 Comment(0)

Steps that will make it work:

  1. Recompile GCC (including libgomp) with --disable-linux-futex.
  2. Make sure you use the futex-free GCC when compiling your program.
  3. Make sure the system loads the futex-free libgomp when executing your program (the library is usually in GCC-OBJ-DIR/PLATFORM/libgomp/.libs), for example by setting the LD_LIBRARY_PATH environment variable:

export LD_LIBRARY_PATH=~/gcc-4.8.1-nofutex/x86_64-unknown-linux-gnu/libgomp/.libs:$LD_LIBRARY_PATH
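
One way to check step 3 is to ask the dynamic linker which libgomp it would resolve for the program. The sketch below assumes the same build directory as the example above and a dynamically linked binary named ./test1 (the name used in the question); both are placeholders to adjust for your own setup.

```shell
#!/bin/sh
# Directory holding the futex-free libgomp; taken from the example above,
# adjust it to your own GCC build tree.
NOFUTEX_LIBDIR="$HOME/gcc-4.8.1-nofutex/x86_64-unknown-linux-gnu/libgomp/.libs"

# Prepend the directory while keeping any existing search path.
LD_LIBRARY_PATH="$NOFUTEX_LIBDIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH

# Print the libgomp that would be loaded for ./test1, if the binary exists.
# The path shown should point into NOFUTEX_LIBDIR, not a system directory.
if [ -x ./test1 ]; then
    ldd ./test1 | grep libgomp || echo "libgomp is not dynamically linked"
else
    echo "./test1 not found; LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
fi
```

If the output still shows a system path such as one under /usr/lib, Helgrind is most likely still analysing the futex-based library.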

Oven answered 14/10, 2013 at 11:27 Comment(0)

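Please also note that if omp_set_lock is used in the code, the program must be compiled against the omp.h that belongs to the futex-free build rather than the system omp.h, because the size of the lock struct differs between the two. See https://xrunhprof.wordpress.com/2018/08/27/tsan-with-openmp/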
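
A quick way to see whether two compilers disagree about the lock size is to print sizeof(omp_lock_t) with each of them. The sketch below is only an illustration; the file names locksize.c and locksize are made up, and CC defaults to whatever gcc is first on the PATH.

```shell
#!/bin/sh
# Write a tiny C program that reports the size of omp_lock_t as seen
# through whichever omp.h the chosen compiler picks up.
cat > locksize.c <<'EOF'
#include <omp.h>
#include <stdio.h>

int main(void) {
    printf("sizeof(omp_lock_t) = %zu\n", sizeof(omp_lock_t));
    return 0;
}
EOF

# Compiler to test; override with e.g. CC=/path/to/futex-free/bin/gcc
CC="${CC:-gcc}"

if command -v "$CC" >/dev/null 2>&1; then
    "$CC" -fopenmp -o locksize locksize.c && ./locksize \
        || echo "compiling or running locksize failed with $CC"
else
    echo "$CC not found; wrote locksize.c only"
fi
```

Running the script once with the system compiler and once with CC pointing at the rebuilt GCC and comparing the two numbers shows whether the headers differ.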

Nafis answered 1/6, 2020 at 13:40 Comment(0)