Atomic swap in GNU C++
Asked Answered
T

2

19

I want to verify that my understanding is correct. This kind of thing is tricky so I'm almost sure I am missing something. I have a program consisting of a real-time thread and a non-real-time thread. I want the non-RT thread to be able to swap a pointer to memory that is used by the RT thread.

From the docs, my understanding is that this can be accomplished in g++ with:

// global
Data *rt_data;

Data *swap_data(Data *new_data)
{
#ifdef __GNUC__
    // Atomic pointer swap.
    Data *old_d = __sync_lock_test_and_set(&rt_data, new_data);
#else
    // Non-atomic, cross your fingers.                                          
    Data *old_d = rt_data;
    rt_data = new_data;
#endif
    return old_d;
}

This is the only place in the program (other than initial setup) where rt_data is modified. When rt_data is used in the real-time context, it is copied to a local pointer. For old_d, later on when it is sure that the old memory is not used, it will be freed in the non-RT thread. Is this correct? Do I need volatile anywhere? Are there other synchronization primitives I should be calling?

By the way I am doing this in C++, although I'm interested in whether the answer differs for C.

Thanks ahead of time.

Thermel answered 19/3, 2010 at 15:15 Comment(0)
E
26

Generally don't use volatile when writing concurrent code in C/C++. The semantics of volatile are so close to what you want that it is tempting but in the end volatile is not enough. Unfortunately Java/C# volatile != C/C++ volatile. Herb Sutter has a great article explaining the confusing mess.

What you really want is a memory fence. __sync_lock_test_and_set provides the fencing for you.

You will also need a memory fence when you copy (load) the rt_data pointer to your local copy.

Lock free programming is tricky. If you're willing to use Gcc's c++0x extensions, it's a bit easier:

#include <cstdatomic>

std::atomic<Data*> rt_data;

Data* swap_data( Data* new_data )
{
   Data* old_data = rt_data.exchange(new_data);
   assert( old_data != new_data );
   return old_data;
}

void use_data( )
{
   Data* local = rt_data.load();
   /* ... */
}
Entertain answered 19/3, 2010 at 16:3 Comment(1)
Thank you! I may follow your suggestion of using std::atomic, that's excellent. (I'm not yet well-acquainted with the newest C++0x stuff.) Just out of curiosity, if I use __sync_lock_test_and_set, what is the correct fence to use while reading? (i.e., to make a local copy)Thermel
N
3

Update: This answer is not correct, as I am missing the fact that volatile guarantees that accesses to volatile variables are not reordered, but provides no such guarantees with respect to other non-volatile accesses and manipulations. A memory fence does provide such guarantees, and is necessary for this application. My original answer is below, but do not act on it. See this answer for a good explanation in the hole in my understanding that led to the following incorrect response.

Original answer:

Yes, you need volatile on your rt_data declaration; any time a variable can be modified outside the flow of control of a thread accessing it, it should be declared volatile. While you may be able to get away without volatile since you're copying to a local pointer, volatile at least helps with documentation and also inhibits some compiler optimizations that can cause problems. Consider the following example, adopted from DDJ:

volatile int a;
int b;
a = 1;
b = a;

If it is possible for a to have its value changed between a=1 and b=a, then a should be declared volatile (unless, of course, assigning an out-of-date value to b is acceptable). Multithreading, particularly with atomic primitives, constitutes such a situation. The situation is also triggered with variables modified by signal handlers and by variables mapped to odd memory locations (e.g. hardware I/O registers). See also this question.

Otherwise, it looks fine to me.

In C, I would probably use the atomic primitives provided by GLib for this. They'll use an atomic operation where available and fall back to a slow-but-correct mutex-based implementation if the atomic operations are not available. Boost may provide something similar for C++.

Newlin answered 19/3, 2010 at 15:33 Comment(6)
Volatile has nothing to do with concurrency, they are completely orthogonal. They both deal with forcing load/store and reordering, but the guarantees provided by volatile don't solve concurrent problems.Entertain
@Caspin Yes, volatile is orthogonal to concurrency issues, and volatile alone is not enough. However, it is my understanding that volatile is useful in concurrent programming to make sure that threads see each other's changes. There isn't much difference between a variable being changed by another thread and it being changed by a hardware interrupt - both violate the assumptions necessary for load/store reordering, and volatile tells the compiler that those assumptions do not necessarily hold.Newlin
Orthogonal means that problems volatile solves are not the kind of problems concurrent programming creates. One in particular, volatile's ordering guarantee only applies to the current thread and only with respect to other volatiles. Please read the links provided in my answer as the whole description of volatile is too complex to fit into a comment. Or better yet ask stackoverflow "Why isn't volatile useful for c/c++ concurrent programming?" and I'll provide an in depth answer.Entertain
@Caspin I have read them. The DDJ one, along with the other question I linked to in my edit, lead me to stand by my understanding of volatile.Newlin
The article you linked to is from 2001. There was a lot of confusion about atomics then. Andrei was still wrong then but less people knew it. Please see drdobbs.com/high-performance-computing/212701484 for a more modern view of c/c++ and volatile. In the SO question you linked the accepted answer says yes to volatile but the most up voted answer says no.Entertain
@Caspin I've now made a question for this at stackoverflow.com/questions/2484980/… to try to clear this up. The DDJ article I linked to is the same one you did, though; I just linked to the second section.Newlin

© 2022 - 2024 — McMap. All rights reserved.