I'm wondering if it is possible to create a lock-free, thread-safe shared pointer for any of the "common" architectures, like x64 or ARMv7 / ARMv8.
In a talk about lock-free programming at CppCon 2014, Herb Sutter presented a (partial) implementation of a lock-free singly linked list. The implementation looks quite simple, but it relies either on an atomic `shared_ptr` specialization that doesn't exist in the standard library yet, or on the specialized `std::atomic_...` free functions for `shared_ptr`. This is especially important, as a single push/pop call potentially invokes multiple atomic loads/stores and `compare_exchange` operations.
The problem I see (and I think some of the questions in the talk went in the same direction) is that for this to be an actual lock-free data structure, those atomic operations would have to be lock-free themselves. I don't know of any standard library implementation of the `std::atomic_...` functions for `shared_ptr` that is lock-free, and - at least with a short Google / SO search - I also didn't find a suggestion of how to implement a lock-free specialization of `std::atomic<std::shared_ptr>`.
Now, before I waste my time on this, I wanted to ask:

- Do you know whether it is possible to write a lock-free, atomic shared pointer at all?
- Are there already implementations that I've overlooked and that are - ideally - compatible with what you would expect from a `std::atomic<std::shared_ptr>`? For the mentioned list it would especially require a CAS operation.
- If there is no way to implement this on current architectures, do you see any other benefit in Herb's implementation compared to a "normal" linked list that is protected by a lock?
For reference, here is the code from Herb Sutter (might contain typos from me):
```cpp
template<class T>
class slist {
    struct Node { T t; std::shared_ptr<Node> next; };
    std::atomic<std::shared_ptr<Node>> head;
public:
    class reference {
        std::shared_ptr<Node> p;
    public:
        reference(std::shared_ptr<Node> p_) : p(std::move(p_)) {}
        T& operator*() { return p->t; }
        T* operator->() { return &p->t; }
    };
    auto find(T t) const {
        auto p = head.load();
        while (p && p->t != t) {
            p = p->next;
        }
        return reference(std::move(p));
    }
    void push_front(T t) {
        auto p = std::make_shared<Node>();
        p->t = t;
        p->next = head;
        while (!head.compare_exchange_weak(p->next, p)) {}
    }
    void pop_front() {
        auto p = head.load();
        while (p && !head.compare_exchange_weak(p, p->next)) {}
    }
};
```
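Since the `std::atomic<std::shared_ptr<Node>>` specialization doesn't exist yet, here is a sketch of how `push_front` could be expressed today against the C++11 free functions instead (`slist11` and `front` are my names, not Herb's). Note that whether this is actually lock-free depends entirely on how the library implements those functions - in practice it isn't:

```cpp
#include <atomic>
#include <memory>

// Sketch: the same push_front expressed with the C++11
// [util.smartptr.shared.atomic] free functions. These are
// typically implemented with internal locks, so this is
// thread-safe but NOT lock-free in practice.
template <class T>
class slist11 {
public:
    struct Node { T t; std::shared_ptr<Node> next; };

    void push_front(T t) {
        auto p = std::make_shared<Node>();
        p->t = std::move(t);
        p->next = std::atomic_load(&head);
        // Retry until head is swung from p->next to p; on failure,
        // compare_exchange reloads the current head into p->next.
        while (!std::atomic_compare_exchange_weak(&head, &p->next, p)) {}
    }

    std::shared_ptr<Node> front() const { return std::atomic_load(&head); }

private:
    std::shared_ptr<Node> head;  // accessed only via the atomic_* functions
};
```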
Note that in this implementation, single instances of a `shared_ptr` can be accessed / modified by multiple different threads. They can be read/copied, reset and even deleted (as part of a node). So this is not about whether multiple different `shared_ptr` objects (that manage the same object) can be used by multiple threads without a race condition - that is already true for current implementations and required by the standard - but about concurrent access to a single pointer instance, which is - for standard shared pointers - no more thread-safe than the same operations on raw pointers would be.
To explain my motivation:
This is mainly an academic question. I have no intention of implementing my own lock-free list in production code, but I find the topic interesting, and at first glance Herb's presentation seemed to be a good introduction. However, while thinking about this question and @sehe's comment on my answer, I remembered this talk, had another look at it, and realized that it doesn't make much sense to call Herb's implementation lock-free if its primitive operations require locks (which they currently do). So I was wondering whether this is just a limitation of the current implementations or a fundamental flaw in the design.
`shared_ptr`s via the [util.smartptr.shared.atomic] functions. They hash the address of the `shared_ptr` object and use a fixed-size global hash table of mutexes. – Stuyvesant
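The scheme described in that comment could look roughly like the following (a minimal sketch with names of my choosing - `kTableSize`, `mutex_for` - not the actual library code; real implementations also need this for exchange and compare_exchange):

```cpp
#include <functional>
#include <memory>
#include <mutex>

namespace sketch {

// Fixed-size global table of mutexes, indexed by hashing the
// address of the shared_ptr object. Thread-safe, but every
// "atomic" access takes a lock, so clearly not lock-free.
constexpr std::size_t kTableSize = 16;
std::mutex g_mutexes[kTableSize];

inline std::mutex& mutex_for(const void* addr) {
    return g_mutexes[std::hash<const void*>{}(addr) % kTableSize];
}

template <class T>
std::shared_ptr<T> atomic_load(const std::shared_ptr<T>* p) {
    std::lock_guard<std::mutex> lock(mutex_for(p));
    return *p;  // copy (and refcount bump) happen under the lock
}

template <class T>
void atomic_store(std::shared_ptr<T>* p, std::shared_ptr<T> r) {
    std::lock_guard<std::mutex> lock(mutex_for(p));
    p->swap(r);  // old value is released via r's destructor,
                 // after the lock has been dropped
}

}  // namespace sketch
```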