Is this hazard pointer example flawed because of ABA issue?
Asked Answered
B

1

6

In the book C++ Concurrency in Action, the author gave an example of using hazard pointer to implement a lock-free stack data structure. Part of the code is as follows:

std::shared_ptr<T> pop()
{
    std::atomic<void*>& hp=get_hazard_pointer_for_current_thread();
    node* old_head=head.load();
    node* temp;
    do
    {
        temp=old_head;
        hp.store(old_head);
        old_head=head.load();
    } while(old_head!=temp);
    // ...
}

The description says that

You have to do this in a while loop to ensure that the node hasn’t been deleted between the reading of the old head pointer and the setting of the hazard pointer. During this window no other thread knows you’re accessing this particular node. Fortunately, if the old head node is going to be deleted, head itself must have changed, so you can check this and keep looping until you know that the head pointer still has the same value you set your hazard pointer to.

I think the code is flawed because the head node is subject to the ABA problem. Even if the value of head remains the same, the node it originally points to may have been deleted. A new head node is allocated, which happens to have the same address value as the previous one.

Bedell answered 14/2, 2018 at 9:26 Comment(4)
I agree with you but I couldn't prove you right except with a unit test running for hours until the ABA problem occurs. In order to avoid ABA here, I'd compare node's values, not addresses.Sardanapalus
Unless I'm mistaken, the code you're showing us merely touches the hazard pointer which only contains the address of something that is currently read by the thread and shouldn't be messed with in other threads. I don't really understand how there could be a ABA problem with just that code since you never actually look into what's pointed by head here.Kokanee
In other words, that code never seems to assume that head == old_head implies *head hasn't changed, which unless I'm mistaken is the problematic assumption which causes ABA problems.Kokanee
@Kokanee On a second thought, you're right on spot.Bedell
I
5

The default memory_order for load() operations is std::memory_order_seq_cst, which ensures sequential consistency of all operations (total global ordering):

Each memory_order_seq_cst operation B that loads from atomic variable M, observes one of the following:

  • the result of the last operation A that modified M, which appears before B in the single total order
  • OR, if there was such an A, B may observe the result of some modification on M that is not memory_order_seq_cst and does not happen-before A
  • OR, if there wasn't such an A, B may observe the result of some unrelated modification of M that is not memory_order_seq_cst.

So, if the node is modified (deleted) and this happens before second read in the total global order, you are guaranteed to see that change and thus loop will continue to execute. If this modification is ordered after, there is no harm since hazard pointer has been already set.

You have this guarantee, because store to hazard pointer is also done with std::memory_order_seq_cst. This memory order gives you an acquire operation for loads and release operation for stores, preventing from reordering within the same thread. Thus, "successful" read (old_head==temp) guarantees that proper data was saved.

Treat these two loads as sync points - since they perform an acquire operation, they are synchronized-with respective release operations that modify those values, causing all writes to become visible.


The issue you described does not flaw the example in any way. pop() function is meant to remove top element and it will do it. If, in the meantime, element is added/removed, it will pop it, no matter what its address is (it may even be the same the one previously fetched). This is a totally different problem. Consider:

concurrent_stack<int> p;
if (!p.empty() && (p.top() == 5))
{
  auto t = p.pop();
  assert( t ); // May fail
  assert( *t == 5 ); // May fail
}

Both assertions may fail and in case many threads use the stack very intensively, most likely will fail quite often. But this is not due to incorrect implementation of pop(), but the fact, that you need stronger access restriction to ensure that last checked element is indeed removed from the stack.

Interfluent answered 14/2, 2018 at 9:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.