How does libcxx std::counting_semaphore implement "Strongly happens before" for release / acquire?
Asked Answered
C

1

14

libc++ std::counting_semaphore uses atomic increment with memory_order_release in release method:

    void release(ptrdiff_t __update = 1)
    {
        if(0 < __a.fetch_add(__update, memory_order_release))
            ;
        else if(__update > 1)
            __a.notify_all();
        else
            __a.notify_one();
    }

And compare exchange with memory_order_acquire successful memory order in acquire method:

    void acquire()
    {
        auto const __test_fn = [=]() -> bool {
            auto __old = __a.load(memory_order_relaxed);
            return (__old != 0) && __a.compare_exchange_strong(__old, __old - 1, memory_order_acquire, memory_order_relaxed);
        };
        __cxx_atomic_wait(&__a.__a_, __test_fn);
    }

Obvious choice to make acquire synchronized with release.


However, C++20 draft says:

void release(ptrdiff_t update = 1);

...

Synchronization: Strongly happens before invocations of try_­acquire that observe the result of the effects.

Strongly happens before is somewhat more than synchronizes with, C++20 draft says:

An evaluation A strongly happens before an evaluation D if, either

  • (12.1) A is sequenced before D, or
  • (12.2) A synchronizes with D, and both A and D are sequentially consistent atomic operations ([atomics.order]), or
  • (12.3) there are evaluations B and C such that A is sequenced before B, B simply happens before C, and C is sequenced before D, or
  • (12.4) there is an evaluation B such that A strongly happens before B, and B strongly happens before D.

I guess 12.2 would have been the best fit here, where A is fetch_add, D is compare_exchange_strong. But in addition to being synchronized with, they should have been seq_cst!

Trying 12.3 doesn't seem to help either. We call fetch_add B, and compare_exchange_strong C. Fine, but where are A and D then?

So how does it work (according to the C++20 Standard draft)?


The same applies to std::latch and std::barrier.

I picked one (std::semaphore) to refer easily specific paragraphs and lines.

Cascade answered 2/8, 2020 at 11:27 Comment(7)
You are looking at an implementation written for a specific compiler, g++. It doesn't have to be portable, it just has to work with g++. g++ may provide stronger guarantees than the standard.Relevant
I am looking at libc++, which has to work with Clang. Anyway, it still doesn't answer the question, unless the stronger guarantee is pointed at. (I think that the implementation is correct and generic, and does not need stronger guarantees; the problem is with my understamding, or (less likely) with Standard wording)Cascade
Oh sorry my bad. Apparently I can't read. Anyway it is implemented in terms of compare_exchange_strong which is in turn implemented with a gcc(-like) builtin. I don't know what guarantees it provides.Relevant
Actually I suspect that all the magic is due to notify_all/wait interaction.Relevant
If count never reaches zero, there's no wait/notify. Ditto for barrier, which might never lock: https://mcmap.net/q/902350/-why-do-separate-arrive-and-wait-exist-in-c-20-barrierCascade
Now I'm not even sure what the standard even means by "invocations of try_­acquire that observe the result of the effects". The counter was 5, release incremented it, try_acquire decremented it. It makes little sense to say that release happened or not happened before acquire, since the difference is unobservable. I give up and admit my cluelessness,Relevant
open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0668r5.html explains the reason behind the strong/simply change - it's to resolve an issue when you mix seq_cst and weaker ordering on the same location. Since this atomic is only accessed by the implementation, it is free to use the weaker ordering only under as-if.Stoplight
A
3

As referenced in P0668R5:

[...] Normally the strongly-happens-before guarantee will be used to order the users code before and after the calls, meaning that the requirement is implicitly satisfied by the addition of user code. [...]

This appears to be aimed at (12.3):

  • there are evaluations B and C such that A is sequenced before B, B simply happens before C, and C is sequenced before D, or

In other words, A and D represent user code, B corresponds to release(), and C signifies acquire(). Given that B simply happens before C (due to the release-acquire synchronization mentioned in your question), we establish that A strongly happens before D, which achieves the desired ordering. However, in line with this rationale, it appears that we should incorporate the simply-happens-before guarantee in the synchronization phrasing of release().

Alternatively, we could integrate acquire/release operations into the strongly-happens-before relation:

  • A synchronizes with D, A is a release atomic operation, and D is an acquire atomic operation ([atomics.order]), or

This approach seems more logical because the original purpose of strongly-happens-before is to resolve the ordering issue with mixed seq_cst and acquire/release operations. However, acquire/release operations alone are unaffected by this issue, as articulated in the paper:

[...] Code that consistently accesses each location only with memory_order_seq_cst, or only with weaker ordering, is not affected and functions correctly.

Arce answered 23/4 at 11:54 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.