I am studying Michael & Scott's lock-free queue algorithm and trying to implemented it in C++.
But I produced a race in my code and think there may be a race in the algorithm.
I read the paper here: Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms and the original Dequeue pseudo-code is as following:
dequeue(Q: pointer to queue_t, pvalue: pointer to data type): boolean
D1: loop // Keep trying until Dequeue is done
D2: head = Q->Head // Read Head
D3: tail = Q->Tail // Read Tail
D4: next = head.ptr->next // Read Head.ptr->next
D5: if head == Q->Head // Are head, tail, and next consistent?
D6: if head.ptr == tail.ptr // Is queue empty or Tail falling behind?
D7: if next.ptr == NULL // Is queue empty?
D8: return FALSE // Queue is empty, couldn't dequeue
D9: endif
// Tail is falling behind. Try to advance it
D10: CAS(&Q->Tail, tail, <next.ptr, tail.count+1>)
D11: else // No need to deal with Tail
// Read value before CAS
// Otherwise, another dequeue might free the next node
D12: *pvalue = next.ptr->value
// Try to swing Head to the next node
D13: if CAS(&Q->Head, head, <next.ptr, head.count+1>)
D14: break // Dequeue is done. Exit loop
D15: endif
D16: endif
D17: endif
D18: endloop
D19: free(head.ptr) // It is safe now to free the old node
D20: return TRUE // Queue was not empty, dequeue succeeded
In my view, the race is like this:
- Thread 1 advanced to D3 and then stop.
- Thread 2 advanced to D3, read the same head as Thread 1.
- Thread 2 luckily advanced all the way to D20, at D19 it freed head.ptr
- Thread 1 continues and advanced to D4, trying to read
head.ptr->next
, but ashead.ptr
is already freed by Thread 1, crash happens.
And my C++ code really always crashes at D4 for Thread 1.
Can anyone please point out my mistake and give some explanation ?