If you're on Windows, you won't be able to use futexes, but Windows Vista has a similar mechanism called Keyed Events. Unfortunately, this isn't part of the published API (it's an NTDLL native API), but you can use it as long as you accept the caveat that it might change in future versions of Windows (and you don't need to run on pre-Vista kernels). Be sure to read the article I linked above. Here's an untested sketch of how it might work:
/* Interlocked SList queue using keyed event signaling */
struct queue {
SLIST_HEADER slist;
// Note: Multiple queues can (and should) share a keyed event handle
HANDLE keyed_event;
// Initial value: 0
// Prior to blocking, the queue_pop function increments this to 1, then
// rechecks the queue. If it finds an item, it attempts to compxchg back to
// 0; if this fails, then it's racing with a push, and has to block
LONG block_flag;
};
void init_queue(queue *qPtr) {
NtCreateKeyedEvent(&qPtr->keyed_event, -1, NULL, 0);
InitializeSListHead(&qPtr->slist);
qPtr->blocking = 0;
}
void queue_push(queue *qPtr, SLIST_ENTRY *entry) {
InterlockedPushEntrySList(&qPtr->slist, entry);
// Transition block flag 1 -> 0. If this succeeds (block flag was 1), we
// have committed to a keyed-event handshake
LONG oldv = InterlockedCompareExchange(&qPtr->block_flag, 0, 1);
if (oldv) {
NtReleaseKeyedEvent(qPtr->keyed_event, (PVOID)qPtr, FALSE, NULL);
}
}
SLIST_ENTRY *queue_pop(queue *qPtr) {
SLIST_ENTRY *entry = InterlockedPopEntrySList(&qPtr->slist);
if (entry)
return entry; // fast path
// Transition block flag 0 -> 1. We must recheck the queue after this point
// in case we race with queue_push; however since ReleaseKeyedEvent
// blocks until it is matched up with a wait, we must perform the wait if
// queue_push sees us
LONG oldv = InterlockedCompareExchange(&qPtr->block_flag, 1, 0);
assert(oldv == 0);
entry = InterlockedPopEntrySList(&qPtr->slist);
if (entry) {
// Try to abort
oldv = InterlockedCompareExchange(&qPtr->block_flag, 0, 1);
if (oldv == 1)
return entry; // nobody saw us, we can just exit with the value
}
// Either we don't have an entry, or we are forced to wait because
// queue_push saw our block flag. So do the wait
NtWaitForKeyedEvent(qPtr->keyed_event, (PVOID)qPtr, FALSE, NULL);
// block_flag has been reset by queue_push
if (!entry)
entry = InterlockedPopEntrySList(&qPtr->slist);
assert(entry);
return entry;
}
You could also use a similar protocol using Slim Read Write locks and Condition Variables, with a lockless fast path. These are wrappers over keyed events, so they may incur more overhead than using keyed events directly.