Why does ThreadSanitizer report a race with this lock-free example?
Asked Answered
G

2

27

I've boiled this down to a simple self-contained example. The main thread enqueues 1000 items, and a worker thread tries to dequeue concurrently. ThreadSanitizer complains that there's a race between the read and the write of one of the elements, even though there is an acquire-release memory barrier sequence protecting them.

#include <atomic>
#include <thread>
#include <cassert>

struct FakeQueue
{
    int items[1000];
    std::atomic<int> m_enqueueIndex;
    int m_dequeueIndex;

    FakeQueue() : m_enqueueIndex(0), m_dequeueIndex(0) { }

    void enqueue(int x)
    {
        auto tail = m_enqueueIndex.load(std::memory_order_relaxed);
        items[tail] = x;              // <- element written
        m_enqueueIndex.store(tail + 1, std::memory_order_release);
    }

    bool try_dequeue(int& x)
    {
        auto tail = m_enqueueIndex.load(std::memory_order_acquire);
        assert(tail >= m_dequeueIndex);
        if (tail == m_dequeueIndex)
            return false;
        x = items[m_dequeueIndex];    // <- element read -- tsan says race!
        ++m_dequeueIndex;
        return true;
    }
};


FakeQueue q;

int main()
{
    std::thread th([&]() {
        int x;
        for (int i = 0; i != 1000; ++i)
            q.try_dequeue(x);
    });

    for (int i = 0; i != 1000; ++i)
        q.enqueue(i);

    th.join();
}

ThreadSanitizer output:

==================
WARNING: ThreadSanitizer: data race (pid=17220)
  Read of size 4 at 0x0000006051c0 by thread T1:
    #0 FakeQueue::try_dequeue(int&) /home/cameron/projects/concurrentqueue/tests/tsan/issue49.cpp:26 (issue49+0x000000402bcd)
    #1 main::{lambda()#1}::operator()() const <null> (issue49+0x000000401132)
    #2 _M_invoke<> /usr/include/c++/5.3.1/functional:1531 (issue49+0x0000004025e3)
    #3 operator() /usr/include/c++/5.3.1/functional:1520 (issue49+0x0000004024ed)
    #4 _M_run /usr/include/c++/5.3.1/thread:115 (issue49+0x00000040244d)
    #5 <null> <null> (libstdc++.so.6+0x0000000b8f2f)

  Previous write of size 4 at 0x0000006051c0 by main thread:
    #0 FakeQueue::enqueue(int) /home/cameron/projects/concurrentqueue/tests/tsan/issue49.cpp:16 (issue49+0x000000402a90)
    #1 main /home/cameron/projects/concurrentqueue/tests/tsan/issue49.cpp:44 (issue49+0x000000401187)

  Location is global 'q' of size 4008 at 0x0000006051c0 (issue49+0x0000006051c0)

  Thread T1 (tid=17222, running) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x000000027a67)
    #1 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>, void (*)()) <null> (libstdc++.so.6+0x0000000b9072)
    #2 main /home/cameron/projects/concurrentqueue/tests/tsan/issue49.cpp:41 (issue49+0x000000401168)

SUMMARY: ThreadSanitizer: data race /home/cameron/projects/concurrentqueue/tests/tsan/issue49.cpp:26 FakeQueue::try_dequeue(int&)
==================
ThreadSanitizer: reported 1 warnings

Command line:

g++ -std=c++11 -O0 -g -fsanitize=thread issue49.cpp -o issue49 -pthread

g++ version: 5.3.1

Can anybody shed some light onto why tsan thinks this is a data race?


UPDATE

It seems like this is a false positive. To appease ThreadSanitizer, I've added annotations (see here for the supported ones and here for an example). Note that detecting whether tsan is enabled in GCC via a macro has only recently been added, so I had to manually pass -D__SANITIZE_THREAD__ to g++ for now.

#if defined(__SANITIZE_THREAD__)
#define TSAN_ENABLED
#elif defined(__has_feature)
#if __has_feature(thread_sanitizer)
#define TSAN_ENABLED
#endif
#endif

#ifdef TSAN_ENABLED
#define TSAN_ANNOTATE_HAPPENS_BEFORE(addr) \
    AnnotateHappensBefore(__FILE__, __LINE__, (void*)(addr))
#define TSAN_ANNOTATE_HAPPENS_AFTER(addr) \
    AnnotateHappensAfter(__FILE__, __LINE__, (void*)(addr))
extern "C" void AnnotateHappensBefore(const char* f, int l, void* addr);
extern "C" void AnnotateHappensAfter(const char* f, int l, void* addr);
#else
#define TSAN_ANNOTATE_HAPPENS_BEFORE(addr)
#define TSAN_ANNOTATE_HAPPENS_AFTER(addr)
#endif

struct FakeQueue
{
    int items[1000];
    std::atomic<int> m_enqueueIndex;
    int m_dequeueIndex;

    FakeQueue() : m_enqueueIndex(0), m_dequeueIndex(0) { }

    void enqueue(int x)
    {
        auto tail = m_enqueueIndex.load(std::memory_order_relaxed);
        items[tail] = x;
        TSAN_ANNOTATE_HAPPENS_BEFORE(&items[tail]);
        m_enqueueIndex.store(tail + 1, std::memory_order_release);
    }

    bool try_dequeue(int& x)
    {
        auto tail = m_enqueueIndex.load(std::memory_order_acquire);
        assert(tail >= m_dequeueIndex);
        if (tail == m_dequeueIndex)
            return false;
        TSAN_ANNOTATE_HAPPENS_AFTER(&items[m_dequeueIndex]);
        x = items[m_dequeueIndex];
        ++m_dequeueIndex;
        return true;
    }
};

// main() is as before

Now ThreadSanitizer is happy at runtime.

Grumous answered 31/5, 2016 at 18:14 Comment(4)
Does it make a difference, if you are using sequential consistency for the atomic accesses?Kirksey
No, tsan still reports a race.Grumous
I think your "UPDATE" is actually an answer - and a good one! Consider moving it out of the question and into an answer.Derick
@Toby: Thanks, but really it's just an elaborate example of how to silence a false positive in my specific example code. user1887915 has the real answer (that tsan doesn't currently support this type of code in the first place). Tomato/tomahto ;-)Grumous
G
2

The ThreadSanitizer is not good at counting, it cannot understand that writes to the items always happen before the reads.

The ThreadSanitizer can find that the stores of m_enqueueIndex happen before the loads, but it does not understand that the store to items[m_dequeueIndex] must happen before the load when tail > m_dequeueIndex.

Grantor answered 1/6, 2016 at 8:18 Comment(4)
Is this a design limitation of ThreadSanitizer, or should this behavior be reported as a bug/defect?Tycoon
@VittorioRomeo It is a defect, but is by design. It will keep the current behavior until somebody finds a new algorithm that can handle this case efficiently.Grantor
Ah, I didn't realize ThreadSanitizer could produce false positives. That's not at all clear in the documentation I found :-) The paper you linked describes the original ThreadSanitizer; as I understand it, it's been rewritten as a compiler/runtime-integrated tool instead of one based on valgrind. I'm not sure which parts are still applicable. I'll see if I can annotate my code to make tsan happy.Grumous
ThreadSanitizer can indeed produce false positives, but this is not the case here (see my comment below). Unfortunately the paper about the Valgrind-based ThreadSanitizer is no longer applicable, and there's no "new" paper covering the actual state of the things. You can refer to various talks on YouTube (search for "ThreadSanitizer") for a brief description of how the tool works now.Bernadettebernadina
B
15

This looks like https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78158. Disassembling the binary produced by GCC shows that it doesn't instrument the atomic operations on O0. As a workaround, you can either build your code with GCC with -O1/-O2, or get yourself a fresh Clang build and use it to run ThreadSanitizer (this is the recommended way, as TSan is being developed as part of Clang and only backported to GCC).

The comments above are invalid: TSan can easily comprehend the happens-before relation between the atomics in your code (one can check that by running the above reproducer under TSan in Clang).

I also wouldn't recommend using the AnnotateHappensBefore()/AnnotateHappensAfter() for two reasons:

  • you shouldn't need them in most cases; they denote that the code is doing something really complex (in which case you may want to double-check you're doing it right);

  • if you make an error in your lock-free code, spraying it with annotations may mask that error, so that TSan won't notice it.

Bernadettebernadina answered 20/3, 2017 at 13:37 Comment(0)
G
2

The ThreadSanitizer is not good at counting, it cannot understand that writes to the items always happen before the reads.

The ThreadSanitizer can find that the stores of m_enqueueIndex happen before the loads, but it does not understand that the store to items[m_dequeueIndex] must happen before the load when tail > m_dequeueIndex.

Grantor answered 1/6, 2016 at 8:18 Comment(4)
Is this a design limitation of ThreadSanitizer, or should this behavior be reported as a bug/defect?Tycoon
@VittorioRomeo It is a defect, but is by design. It will keep the current behavior until somebody finds a new algorithm that can handle this case efficiently.Grantor
Ah, I didn't realize ThreadSanitizer could produce false positives. That's not at all clear in the documentation I found :-) The paper you linked describes the original ThreadSanitizer; as I understand it, it's been rewritten as a compiler/runtime-integrated tool instead of one based on valgrind. I'm not sure which parts are still applicable. I'll see if I can annotate my code to make tsan happy.Grumous
ThreadSanitizer can indeed produce false positives, but this is not the case here (see my comment below). Unfortunately the paper about the Valgrind-based ThreadSanitizer is no longer applicable, and there's no "new" paper covering the actual state of the things. You can refer to various talks on YouTube (search for "ThreadSanitizer") for a brief description of how the tool works now.Bernadettebernadina

© 2022 - 2024 — McMap. All rights reserved.