I think this is worth presenting up front. This spin lock approach I'm about to present is valid and correct, but it is also scheduler-unaware, meaning that it won't cooperate with the scheduler like a standard mutex, but rather it will fight to function during the time slices it is allowed to run in (randomly from its perspective) by the scheduler. So, it has some tradeoffs.
Here are some comments I've written below this answer regarding this:
I read the whole article by Linus that you linked to. He's criticizing a very specific benchmark spin lock code which is attempting to characterize the Linux scheduler by using spin locks on a timer. Linus is right though: the OS scheduler benchmark routine being described is broken because there is an unsynchronized race condition between the scheduler and that spin lock. That is unrelated to what I'm doing in my spin lock above. I understand the scheduler limitations and potential conflicts...
...with my spin lock implementation. Furthermore, I'm not writing the low-level locking primitives from scratch, which is the part people will almost certainly get wrong. Rather, my C++ compiler (g++) is. That critical section of code is written in the <stdatomic.h>
header as atomic_test_and_set
and the type atomic_bool
which I use too. So, as long as the spin lock limitations are understood, what I have done above is a perfectly legitimate way to approach a problem like this. It should be benchmarked and compared to other approaches to know if it has merit over a standard C++ mutex...
...C mutex, or pthread mutex [in a given contect]. For some of the modern C++ mutexes, see my article on my personal website here: https://gabrielstaples.com/cpp-mutexes-and-locks. Additionally, my answer here isn't meant to present the best approach. It's meant to present a valid approach. The user needs to understand the limitations. It may be used in other contexts too, such as with the Linux soft-real-time SCHED_RR
round-robin scheduler, or in a kernel module, or elsewhere. There are certainly tradeoffs, but that doesn't mean alternative approaches like this should not be presented.
Now, if you want to see an interesting spin lock approach, here you go:
Quick summary: spin-lock-based atomic swap answer
This code snippet is what we will be building up to.
For a basic spin lock (high speed, no-context-switch mutex) approach on a multi-core system, which is where spin locks might be most useful (they require multi-core, not just multi-threaded, systems to be functional), you can do an atomic swap of variable values (pointers in this case) like this:
int *a = &some_int1; // ptr a
int *b = &some_int2; // ptr b
spinlock_t spinlock = SPINLOCK_UNLOCKED;
// swap pointers `a` and `b` (note: swapping the pointers themselves, not the
// integer values they point to)
atomic_swap_int_ptr(&spinlock, &a, &b);
Important clarification by @Peter Cordes:
...out of any systems where you might consider using spinlocks, they only make any sense on multi-core systems, and its better with pre-emptive multi-tasking.
Here are the details:
General-purpose spin lock approach in C11 (to wrap any "critical section" that needs to be atomic)
If your architecture supports lock-free atomic swaps using the array "pair" presented in the main answer here, then consider that. It may be lock-free and hence much more efficient than using a mutex or spin lock.
I wanted to delve into this a bit deeper though and try my hand at a spin lock. I have not speed profiled the various techniques (array "pair", mutex, and spin lock), but it would be very interesting and academic for me to do so to see how they compare in speed.
Anyway, as I was saying: you could also use a mutex.
Or, if you don't want your thread to go to sleep if the mutex isn't immediately available, you can improve efficiency by using a spin lock, since the number of times the spin lock has to spin is frequently so small that it is faster to spin than to put a thread to sleep and have a whole context switch.
C11 has a whole suite of atomic types and functions which allow you to easily implement your own spin lock to accomplish this.
Keep in mind, however, that a basic spin lock must not be used on a bare metal (no operating system) single core (processor) system, as it will result in deadlock if the main context takes the lock and then while the lock is still taken, an ISR fires and tries to take the lock. The ISR will block indefinitely.
To avoid this, more-complicated locking mechanisms must be used, such as a spin lock with a timeout, an ISR which auto-reschedules itself for a slightly later time if the lock is taken, automatic yielding, etc.
So, spin locks are best reserved for multi-core systems with an OS and scheduler.
Basic spin lock implementation in C11 using _Atomic
types
#include <stdbool.h>
#include <stdatomic.h>
#define SPINLOCK_UNLOCKED 0
#define SPINLOCK_LOCKED 1
typedef volatile atomic_bool spinlock_t;
// This is how you'd create a new spinlock to use in the functions below.
// spinlock_t spinlock = SPINLOCK_UNLOCKED;
/// \brief Set the spin lock to `SPINLOCK_LOCKED`, and return the previous
/// state of the `spinlock` object.
/// \details If the previous state was `SPINLOCK_LOCKED`, you can know that
/// the lock was already taken, and you do NOT now own the lock. If the
/// previous state was `SPINLOCK_UNLOCKED`, however, then you have now locked
/// the lock and own the lock.
bool atomic_test_and_set(spinlock_t* spinlock)
{
bool spinlock_val_old = atomic_exchange(spinlock, SPINLOCK_LOCKED);
return spinlock_val_old;
}
/// Spin (loop) until you take the spin lock.
void spinlock_lock(spinlock_t* spinlock)
{
// Spin lock: loop forever until we get the lock; we know the lock was
// successfully obtained after exiting this while loop because the
// atomic_test_and_set() function locks the lock and returns the previous
// lock value. If the previous lock value was SPINLOCK_LOCKED then the lock
// was **already** locked by another thread or process. Once the previous
// lock value was SPINLOCK_UNLOCKED, however, then it indicates the lock
// was **not** locked before we locked it, but now it **is** locked because
// we locked it, indicating we own the lock.
while (atomic_test_and_set(spinlock) == SPINLOCK_LOCKED) {};
}
/// Release the spin lock.
void spinlock_unlock(spinlock_t* spinlock)
{
*spinlock = SPINLOCK_UNLOCKED;
}
Now, you can use the spin lock, for your example, in your multi-core system, like this:
/// Atomically swap two pointers to int (`int *`).
/// NB: this does NOT swap their contents, meaning: the integers they point to.
/// Rather, it swaps the pointers (addresses) themselves.
void atomic_swap_int_ptr(spinlock_t * spinlock, int ** ptr1, int ** ptr2)
{
spinlock_lock(spinlock)
// critical, atomic section start
int * temp = *ptr1;
*ptr1 = *ptr2;
*ptr2 = temp;
// critical, atomic section end
spinlock_unlock(spinlock)
}
// Demo to perform an address swap
int *a = &some_int1;
int *b = &some_int2;
spinlock_t spinlock = SPINLOCK_UNLOCKED;
// swap pointers `a` and `b` (note: swapping the pointers themselves, not the
// integer values they point to)
atomic_swap_int_ptr(&spinlock, &a, &b); // <=========== final answer
References
- https://en.wikipedia.org/wiki/Test-and-set#Pseudo-C_implementation_of_a_spin_lock
- ***** C11 Concurrency support library and
atomic_bool
and other types: https://en.cppreference.com/w/c/thread
- C11
atomic_exchange()
function: https://en.cppreference.com/w/c/atomic/atomic_exchange
Other links
- Note that you could use C11's
struct atomic_flag
as well, and its corresponding atomic_flag_test_and_set()
function, instead of creating your own atomic_test_and_set()
function using atomic_exchange()
, like I did above.
_Atomic
types in C11: https://en.cppreference.com/w/c/language/atomic
- C++11 Implementation of Spinlock using header
<atomic>
TODO (newest on top)
- [ ] 20230418: Upgrade this implementation and speed profile it with the following changes. See C concurrency documentation here: https://en.cppreference.com/w/c/thread:
- [ ] Use
atomic_flag
instead of atomic_bool
: https://en.cppreference.com/w/c/atomic/atomic_flag (emphasis added):
atomic_flag
is an atomic boolean type. Unlike other atomic types, it is guaranteed to be lock-free. Unlike atomic_bool
, atomic_flag
does not provide load or store operations.
- Be sure to initialize it with
ATOMIC_FLAG_INIT
.
- Also, use
atomic_flag_test_and_set()
instead of my custom atomic_test_and_set()
above, and use atomic_flag_clear()
to clear or "unset" the flag.
- [ ] 20220922: This implementation would need to be modified to add safety anti-deadlock mechanisms such as automatic deferral, timeout, and retry, to prevent deadlock on single core and bare-metal systems. See my notes here: Add basic mutex (lock) and spin lock implementations in C11, C++11, AVR, and STM32 and here: What are the various ways to disable and re-enable interrupts in STM32 microcontrollers in order to implement atomic access guards?
atomic_int
would have to be non-lock-free on almost all architectures. (Because it can't be synthesized out of other things without locking.) – Orphanage