Use of std::memory_order_consume in the Folly's lock free SPSC queue
Asked Answered
L

1

10

In the process of trying to understand how to deal with lock free code, I attempted to write a single consumer/single producer lock free queue. As always, I checked papers, articles, and code, especially considering that this is a somewhat delicate subject.

So, I stumbled upon an implementation of this data structure in the Folly library, which can be found here : https://github.com/facebook/folly/blob/master/folly/ProducerConsumerQueue.h

As every lock-free queue I saw, this one seems to use a circular buffer, so we got two std::atomic<unsigned int> variables : readIndex_ and writeIndex_. The readIndex_ indicate the next index at which we will read, and writeIndex_ the next at which we will write. Seems simple enough.

So, the implementation seems clean and pretty simple at first sight, but I found one thing troublesome. Indeed, some functions like isEmpty(), isFull() or guessSize() are using std::memory_order_consume to retrieve the value of the indices.

And to be fair, I really don't know what purpose they do serve. Don't get me wrong, I'm aware of the use of std::memory_order_consume in the classical case of dependency-carrying through an atomic pointer, but here, we do not seem to carry any dependency ! We just got indices, unsigned integers, we do not create dependencies. To me in this scenario, a std::memory_order_relaxed is equivalent.

However, I do not trust myself to understand memory ordering better than those who engineered this code, hence why I ask this question here. Is there anything I missed or misunderstood ?

I thank you in advance for your answers !

Linalool answered 22/3, 2016 at 15:57 Comment(4)
There very few cases where std::memory_order_relaxed will actually make any difference. You really need a chip capable of torn reads/writes. An example is x86 when dealing with unaligned data, but thou shall not. The reason why I am saying all this is that if you think something needs memory_order_relaxed you probably don't understand the usage. Are you trying to prevent torn read/write?Madrepore
@SideEffects: I agree with you. There are no subsequent memory accesses that depend on the index values in these functions, so I do not see how std::memory_order_consume could possibly contribute anything useful. The author(s) Email addresses are at the top of the file; perhaps try Emailing them? (If you do, please add an update or an answer here. I am curious whether we are missing something.)Purebred
@Madrepore Yes, I may have been a bit unclear here. All I wanted to say was that I believed that consume memory ordering was, at least to me, not adding anything useful. But yhea, I may want to clarify this point, thanks !Linalool
@Purebred I will wait a bit before e-mailing them, in order to try to get as much feedback as possible before proceeding. Thanks for your comment !Linalool
M
5

I thought the same thing a few months ago, so I submitted this pull request back in October, suggesting they change the std::memory_order_consume loads into std::memory_order_relaxed since the consume simply did not make sense, as there were no dependencies that could be carried from one thread to another using these functions. It ended up generating some discussion which revealed that a possible use case for isEmpty(), isFull(), and sizeGuess was the following:

//Consumer    
while( queue.isEmpty() ) {} // spin until producer writes 
use_queue(); // At this point, the writes from producer _should_ be visible

Which is why they explained that std::memory_order_relaxed wouldn't be appropriate and std::memory_order_consume would be. However, this is only true because std::memory_order_consume is promoted to std::memory_order_acquire on all compilers that I know of. So although std::memory_order_consume may appear to be providing the proper synchronization, it's quite misleading to leave that in the code and assume that it will remain correct, especially if std::memory_order_consume were to ever be implemented as intended. The above use-case would fail to work on weaker architectures since the appropriate synchronization would not generated.

What they really need is to make those loads std::memory_order_acquire for this to work as intended, which is why I submitted this other pull request some days back. Alternatively, they could take the acquire-loads out of the loop and use a fence at the end:

//Consumer    
while( queue.isEmpty() ) {} // spin until producer writes using relaxed loads
std::atomic_thread_fence(std::memory_order_acquire);
use_queue(); // At this point, the writes from producer _should_ be visible

Either way, std::memory_order_consume is incorrectly used here.

Mede answered 9/4, 2016 at 19:27 Comment(2)
Thanks for taking time to answer ! Coincidentally, I was just about to send them an e-mail, right before you answered. So, it makes more sens, even if the implementation is not correct as defined by the standard. Still a pretty strange guarantee for the 'isEmpty()' method to me.Linalool
I agree, consume semantics on this sort of function seems out of place. Boost even has their equivalent functions using relaxed. I am glad that someone else found it to be as well, I knew I couldn't be the only one!Mede

© 2022 - 2024 — McMap. All rights reserved.