I have a single writer that has to increment a variable at a fairly high frequency, and one or more readers that access this variable at a lower frequency.
The write is triggered by an external interrupt.
Since I need to write at high speed, I don't want to use mutexes or other expensive locking mechanisms.
The approach I came up with is to copy the value after writing to it. The reader can then compare the original with the copy: if they are equal, the variable's content is valid.
Here is my implementation in C++:
template<typename T>
class SafeValue
{
private:
    volatile T _value;
    volatile T _valueCheck;

public:
    void setValue(T newValue)
    {
        // Writer: store the value first, then mirror it into the check copy.
        _value = newValue;
        _valueCheck = _value;
    }

    T getValue()
    {
        volatile T value;
        volatile T valueCheck;
        do
        {
            // Reader: read in the opposite order to the writer and retry
            // until both copies agree.
            valueCheck = _valueCheck;
            value = _value;
        } while (value != valueCheck);
        return value;
    }
};
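For context, here is roughly how I intend to use it; the function names and the reader loop are just illustrative:

SafeValue<unsigned> counter;        // shared between the ISR and the readers
static unsigned localCount = 0;     // writer-private running count

// Single writer: called from the external interrupt.
void onExternalInterrupt()
{
    counter.setValue(++localCount);
}

// One of possibly several readers, polling at a lower frequency.
void readerLoop()
{
    for (;;)
    {
        unsigned snapshot = counter.getValue();
        // ... work with snapshot ...
    }
}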
The idea behind this is to detect data races while reading and to retry if they happen. However, I don't know whether this will always work. I haven't found anything about this approach online, hence my question:
Is there any problem with my approach when used with a single writer and multiple readers?
I already know that high write frequencies may cause starvation of the readers. Are there more bad effects I have to be cautious of? Could it even be that this isn't thread-safe at all?
Edit 1:
My target system is an ARM Cortex-A15. T should be able to become at least any primitive integral type.
Edit 2:
std::atomic is too slow on both the reader and the writer side. I benchmarked it on my system: writes are roughly 30 times slower and reads roughly 50 times slower than unprotected, primitive operations.
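A simplified sketch of the kind of loop such a benchmark boils down to (the iteration count and the std::chrono timing are illustrative, not the actual harness):

#include <atomic>
#include <chrono>
#include <cstdio>

int main()
{
    constexpr long N = 100000000;       // illustrative iteration count
    std::atomic<int> atomicValue{0};
    volatile int plainValue = 0;

    auto t0 = std::chrono::steady_clock::now();
    for (long i = 0; i < N; ++i)
        atomicValue.store(static_cast<int>(i));  // default: seq_cst ordering
    auto t1 = std::chrono::steady_clock::now();
    for (long i = 0; i < N; ++i)
        plainValue = static_cast<int>(i);        // unprotected store
    auto t2 = std::chrono::steady_clock::now();

    auto ns = [](auto d) {
        return std::chrono::duration_cast<std::chrono::nanoseconds>(d).count();
    };
    std::printf("atomic: %lld ns, plain: %lld ns\n",
                (long long)ns(t1 - t0), (long long)ns(t2 - t1));
}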
Comments:
You write that your problem with std::atomic is that internal system calls are greatly slowing it down. What internal system calls? Atomic operations should boil down to special instructions, provided atomics are supported by hardware for a given data type (is that your case or not?). What do you mean by slowing it down? Slowing the producer? Slowing the consumers? How do you measure such slowing? – Bearable
If you compare your solution with std::atomic<T> (for the int type, ARM, and GCC 8.2), the latter is much simpler and should be more efficient: godbolt.org/z/lpvFQB. BTW, how do you measure the slowdown? – Bearable
I replaced the volatile in your solution by atomics: godbolt.org/z/tl4wdB. The generated assembly is very similar, and I doubt there would be any such significant performance difference (note that there is no reason for the local variables in getValue to be volatile). – Bearable
std::atomic with what ordering constraint? The default sequential consistency? Did you profile with release/acquire or release/consume ordering? Did you look at the generated assembly for those cases? – Darcee
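For reference, a sketch of what the comments suggest: dropping the two-copy check and using a single std::atomic<T> with explicit release/acquire ordering instead of the default sequential consistency (the class name is illustrative and this is untested on the target):

#include <atomic>

template<typename T>
class SafeValueAtomic
{
private:
    std::atomic<T> _value;

public:
    void setValue(T newValue)
    {
        // Release store: readers that observe this value also observe
        // everything the writer did before it.
        _value.store(newValue, std::memory_order_release);
    }

    T getValue()
    {
        // Acquire load, pairing with the release store above.
        return _value.load(std::memory_order_acquire);
    }
};

On ARMv7-A (Cortex-A15), for a lock-free integral T the release store and acquire load each compile to a plain str/ldr plus a dmb barrier; with memory_order_relaxed even the barrier disappears, at the cost of ordering guarantees for surrounding data. The retry loop then becomes unnecessary, since a single aligned word is read and written atomically.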