Lock-free stack pop implementation in C++

Asked 27/5, 2020 at 19:20 Answered 28/5, 2020 at 10:34

Solved c++multithreading data-structures concurrency lock-free

In C++ Concurrency in Action, 2e, The author describes a lock-free thread safe linked list implementation. Right now, he is describing the pop() method and how to safely delete the nodes in a sort of "garbage-collector" like method to make sure that no other threads called pop on the same instance. Here is some of that code for pop:

#include <atomic>
#include <memory>

template<typename T>
class lock_free_stack
{
private:
    std::atomic<unsigned> threads_in_pop;
    void try_reclaim(node* old_head);
public:
    std::shared_ptr<T> pop()
    {
        ++threads_in_pop;
        node* old_head=head.load();
        while(old_head &&
              !head.compare_exchange_weak(old_head,old_head->next));
        std::shared_ptr<T> res;
        if(old_head)
        {
            res.swap(old_head->data);
        }
        try_reclaim(old_head);
        return res;
    }
};

The important thing is that a counter is incremented atomically every time pop() is called. Then, the try_reclaim function will decrement said counter. Here is try_reclaim's implementation:

void try_reclaim(node* old_head)
{
    if(threads_in_pop==1) //#1
    {
        node* nodes_to_delete=to_be_deleted.exchange(nullptr);
        if(!--threads_in_pop) //#2
        {
            delete_nodes(nodes_to_delete);
        }
        else if(nodes_to_delete)
        {
            chain_pending_nodes(nodes_to_delete);//#3
        }
        delete old_head; //#4 THIS IS THE PART I AM CONFUSED ABOUT
    }
    else
    {
        chain_pending_node(old_head);
        --threads_in_pop;
    }
}

The implementations of the other functions called here are irrelevant (they just add nodes to a chain of nodes to be deleted), so I have omitted them. The part I am confused about in the code is #4 (marked). Here, the author calls delete on the old_head that was passed in. However, why doesn't he check to see if threads_in_pop is still zero at this point before deleting old_head? He double checks in lines #2 and #1 to make sure another thread isn't currently in pop(), so why doesn't he check again before proceeding to delete old_head? Couldn't another thread have possibly called pop() right after #3, thus incrementing the counter, and by the time the first thread reaches #4, then threads_in_pop would no longer be zero?

That is to say, couldn't it be possible that threads_in_pop is, for example, 2, by the time the code reaches #4? How could he safely delete old_head in that case? Could someone explain please?

Exasperation answered 27/5, 2020 at 19:20 Comment(1)

Doesn't, apart from the garbage collection, the pop mechanic itself suffer from the very ABA problem that literally every lock free list implementation suffers from at the first attempt? – Herakleion 23/10, 2022 at 9:9

The thread that calls try_reclaim has just removed old_head from the stack.

The class ensures that any other uses of old_head must be inside pop calls from other threads, so if the thread discovers that there are no other concurrent calls, then it knows that it is the exclusive holder of the old_head pointer. Then, as long as it doesn't publish that pointer so that it might be picked up from another thread, it can delete it whenever it gets around to it.

So the implementation is safe. The question you asked: "Why doesn't he check [again]" indicates that you are thinking about it incorrectly. Checking again wouldn't prove anything, because if it were possible for another thread to get into pop and use old_head, then it could always happen after you check!

Cassidycassie answered 27/5, 2020 at 23:3 Comment(0)

You have the following (simplified) sequence, and all atomic operations are sequentially consistent:
++threads_in_pop -> head.cmpxchg -> threads_in_pop.load() -> delete old_head

So we first remove the current head, and later check the number of threads_in_pop. Suppose we have two threads, T1 and T2, that are operating on the stack. If T1 performs threads_in_pop.load() (#1) in try_reclaim and sees 1, this implies that T2 has not yet performed the increment (++threads_in_pop), i.e., T1 is the only thread that can have a reference to old_head at that point. However, T1 has already removed old_head from the list, so any thread that enters pop will already see the updated head, so no other thread can possibly obtain a reference to old_thread anymore. It is therefore safe to delete old_head.

The check #2 is necessary to avoid releasing nodes that have just been added to the to_be_released list after this thread has performed its pop, but to which other threads might still hold a reference. Consider the following situation:

T1 performs a pop and is about to execute nodes_to_delete=to_be_deleted.exchange(nullptr);
T2 starts pop and reads head
T3 starts pop and reads head, seeing the same value as T2
T2 finishes its pop and adds old_head to the list
NOTE: T3 still has a reference to the node that is now part of the to_be_deleted list
T1 now executes nodes_to_delete=to_be_deleted.exchange(nullptr);

T1 now has a list of nodes_to_delete that contains a reference to a node that can still be accessed by T3. That's why check #2 is necessary to prevent T1 from releasing that node.

Inorganic answered 28/5, 2020 at 10:34 Comment(0)

Recommended topics

Hot tags