Why is std::mutex twice as slow as CRITICAL_SECTION?
std::mutex is implemented with critical sections, which is why it's much faster than an OS mutex (on Windows). However, it's not as fast as a Windows CRITICAL_SECTION.

Timings for lock/unlock in a tight loop on a single thread:

423.76ns ATL CMutex
 41.74ns std::mutex
 16.61ns win32 Critical Section
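For anyone who wants to reproduce numbers like these, here is a minimal sketch of an uncontended lock/unlock timing loop. It is not the code used for the figures above; the helper names `ns_per_lock_unlock`, `time_std_mutex` and `time_critical_section` are made up for illustration, and the CRITICAL_SECTION part only compiles on Windows.

```cpp
#include <chrono>
#include <cstdint>
#include <mutex>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: average nanoseconds per lock/unlock pair, measured on
// one thread with no contention. `iterations` must be greater than zero.
template <typename LockFn, typename UnlockFn>
double ns_per_lock_unlock(LockFn lock, UnlockFn unlock, std::uint64_t iterations) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        lock();
        unlock();
    }
    const std::chrono::duration<double, std::nano> elapsed = clock::now() - start;
    return elapsed.count() / static_cast<double>(iterations);
}

// Times std::mutex::lock / std::mutex::unlock.
double time_std_mutex(std::uint64_t iterations) {
    std::mutex m;
    return ns_per_lock_unlock([&] { m.lock(); }, [&] { m.unlock(); }, iterations);
}

#ifdef _WIN32
// Times EnterCriticalSection / LeaveCriticalSection (Windows only).
double time_critical_section(std::uint64_t iterations) {
    CRITICAL_SECTION cs;
    InitializeCriticalSection(&cs);
    const double ns = ns_per_lock_unlock(
        [&] { EnterCriticalSection(&cs); },
        [&] { LeaveCriticalSection(&cs); }, iterations);
    DeleteCriticalSection(&cs);
    return ns;
}
#endif
```

Results depend heavily on the compiler, the standard library version, and whether the build is debug or release, so compare numbers only within one configuration.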

My question is: what else is std::mutex doing? I looked at the source but couldn't follow it. However, there were extra steps before it defers to the critical section. Are these extra steps accomplishing anything useful? That is, what are the extra steps for, and what would I miss out on by using CRITICAL_SECTION?

Also, why did they call it a mutex if it's not implemented with a Mutex?

Electroencephalograph answered 17/2, 2015 at 16:2 Comment(7)
What are you timing? Creating a mutex, locking it, unlocking it, ...? As for differences from the Win32 mutex: a Win32 mutex is a cross-process mutex. The standard only calls for a cross-thread mutex, which can be a lighter-weight construct (and which Windows implements with a critical section).Finale
CMutex is implemented with a mutex, it isn't cheap. std::mutex is built on top of the Concurrency Runtime, it is quite a chunk of code with functionality that extends significantly beyond the threading and scheduling primitives provided by the OS. Layering is heavy, that doesn't come for free. If a critsect serves your purpose and the overhead actually matters then just punt the problem and use it.Fond
I was timing just lock/unlock. I'm just curious what std::mutex does beyond crit sec. If it's useful then shouldn't I want it? If it's not useful why does it do it? I think I'll use std::mutex but I'm just wondering what it does extra.Electroencephalograph
@Philip: Really the only advantages are: portability, and that it appears in the signature of other C++ threading functions.Cicatrix
Then I think "because it is cross-platform" might be the high-level answer. Although I'm still curious what exactly the overhead is doing. I would guess "bookkeeping"?Electroencephalograph
What compiler and library version do you use? std::mutex is a cross-platform synchronization primitive and its performance depends on the quality of the implementation.Hyetal
It appears to be integrated into a broader set of functionality, e.g., the code path goes past options for timeouts, it seems to be implementing its own scheduler, has a lock queue, etc. Whether all of this provides any benefit if the simple mutex is all you're using is unclear to me.Closure
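To illustrate the comment above about std::mutex appearing in the signatures of other C++ threading facilities: std::condition_variable::wait takes a std::unique_lock<std::mutex>, so a raw CRITICAL_SECTION cannot be passed to it directly. A minimal sketch follows; the function name `wait_for_worker` is made up for illustration.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a worker thread sets a flag under the mutex and
// notifies; the calling thread waits on the condition variable for it.
bool wait_for_worker() {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;

    std::thread worker([&] {
        {
            std::lock_guard<std::mutex> guard(m);
            ready = true;
        }
        cv.notify_one();
    });

    {
        // wait() requires std::unique_lock<std::mutex> specifically.
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return ready; });
    }

    worker.join();
    return ready;
}
```

With a different lock type you would need std::condition_variable_any or the native Windows condition variable API instead.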

A std::mutex provides non-recursive ownership semantics. A CRITICAL_SECTION provides recursive semantics. So I assume the extra layer in the std::mutex implementation is (at least in part) to resolve this difference.
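As a rough illustration of the recursive versus non-recursive distinction: in standard C++ the recursive behaviour that CRITICAL_SECTION offers corresponds to std::recursive_mutex, whereas relocking a std::mutex from the thread that already owns it is undefined behaviour. The helper name `nested_lock_depth` below is made up for this sketch.

```cpp
#include <mutex>

// Hypothetical helper: acquire a std::recursive_mutex `depth` times on the
// calling thread and return how many acquisitions were made.
int nested_lock_depth(int depth) {
    std::recursive_mutex m;
    int acquired = 0;
    for (int i = 0; i < depth; ++i) {
        m.lock();    // the owning thread may lock again; with std::mutex this would be UB
        ++acquired;
    }
    for (int i = 0; i < acquired; ++i) {
        m.unlock();  // every lock() needs a matching unlock()
    }
    return acquired;
}
```

Calling `nested_lock_depth(3)` takes the lock three times on one thread without blocking, which is the behaviour EnterCriticalSection also allows for the owning thread.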

Update: Stepping through the code, it looks like std::mutex is implemented in terms of a queue and InterlockedX instructions rather than a classical Win32 CRITICAL_SECTION. Even though std::mutex is non-recursive, the underlying code in the RTL can optionally handle recursive and even timed locks.

Cellarer answered 17/2, 2015 at 19:2 Comment(4)
This sounds plausible except that std::recursive_mutex is not faster than std::mutex in my tests. Which suggests they are both doing something to slow them down which is orthogonal to whether they are recursive or not, right?Electroencephalograph
Since it's UB to try to recursively lock a std::mutex, I don't see why code would be needed to "resolve the difference".Alainealair
Perhaps it's debug code to detect the UB? Was the timing done using release or debug?Cellarer
This is not a valid reason because a recursive mutex satisfies all the semantics of a regular mutex, so there is no extra code needed to resolve the difference.Peatroy
