Why is my thread blocked by a critical section not being held by anything?

I am having an issue with a critical section in C++. I'm getting a hung window and when I dump the process I can see the thread waiting on a critical section:

  16  Id: b10.b88 Suspend: 1 Teb: 7ffae000 Unfrozen
ChildEBP RetAddr  
0470f158 7c90df3c ntdll!KiFastSystemCallRet
0470f15c 7c91b22b ntdll!NtWaitForSingleObject+0xc
0470f1e4 7c901046 ntdll!RtlpWaitForCriticalSection+0x132
0470f1ec 0415647e ntdll!RtlEnterCriticalSection+0x46

The line information all points to an attempt to enter one specific critical section. The only problem is that no other thread appears to be holding it. WinDbg's !locks command reports nothing, and dumping the critical section shows that it is not locked, as can be seen from the null owner and the LockCount of -1 in the structure below.

0:016> dt _RTL_CRITICAL_SECTION 42c2318
_RTL_CRITICAL_SECTION
   +0x000 DebugInfo        : 0x02c8b318 _RTL_CRITICAL_SECTION_DEBUG
   +0x004 LockCount        : -1
   +0x008 RecursionCount   : -1
   +0x00c OwningThread     : (null) 
   +0x010 LockSemaphore    : 0x00000340 
   +0x014 SpinCount        : 0

0:016> dt _RTL_CRITICAL_SECTION_DEBUG 2c8b318
_RTL_CRITICAL_SECTION_DEBUG
   +0x000 Type             : 0
   +0x002 CreatorBackTraceIndex : 0x2911
   +0x004 CriticalSection  : 0x042c2318 _RTL_CRITICAL_SECTION
   +0x008 ProcessLocksList : _LIST_ENTRY [ 0x2c8b358 - 0x2c8b2e8 ]
   +0x010 EntryCount       : 1
   +0x014 ContentionCount  : 1
   +0x018 Flags            : 0xbaadf00d
   +0x01c CreatorBackTraceIndexHigh : 0xf00d
   +0x01e SpareWORD        : 0xbaad

How is this possible? Even in a deadlock where another thread has not called LeaveCriticalSection I would expect to see the critical section itself marked as locked. Does anyone have any debugging suggestions or possible fixes?

Corrie answered 12/1, 2012 at 4:9 Comment(4)
One thing I would check is whether I have done a single EnterCriticalSection followed by two LeaveCriticalSection calls. – Blackfoot
Check that the critical section has not been deleted. From the DeleteCriticalSection documentation: "If a critical section is deleted while it is still owned, the state of the threads waiting for ownership of the deleted critical section is undefined." – Prolegomenon
@Prolegomenon is probably right; 0xbaadf00d means that deallocation was performed. – Builtin
That doesn't appear to be the cause, unfortunately. The only place DeleteCriticalSection is called is in the destructor of the object that has the critical section as a member variable. I added logging to the destructor just to make sure, and it confirmed the destructor wasn't being called unexpectedly before the hang. – Corrie

It turned out to be a bug where LeaveCriticalSection was being called without a corresponding EnterCriticalSection. This caused the critical section to decrement LockCount and RecursionCount into the following state (the default for LockCount is -1 and RecursionCount is 0):

0:016> dt _RTL_CRITICAL_SECTION 1092318
_RTL_CRITICAL_SECTION
    +0x000 DebugInfo        : 0x....... _RTL_CRITICAL_SECTION_DEBUG
    +0x004 LockCount        : -2
    +0x008 RecursionCount   : -1
    +0x00c OwningThread     : (null)
    +0x010 LockSemaphore    : 0x....... 
    +0x014 SpinCount        : 0 

When the subsequent EnterCriticalSection was performed, it hung because RecursionCount was non-zero; a thread can only take ownership of the critical section if RecursionCount is 0. However, it did increment LockCount (taking it back to the -1 seen in my original question), just to confuse matters.

In summary, if you see a thread stuck waiting on a critical section that has no owner and has both LockCount and RecursionCount at -1, it means LeaveCriticalSection was called more times than EnterCriticalSection.
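One way to surface this kind of unbalanced unlock earlier is to route every enter and leave through a small checked wrapper in debug builds. The following is only a sketch: CheckedLock is a hypothetical name, and std::recursive_mutex stands in for CRITICAL_SECTION (both allow recursive acquisition) so that the example compiles without <windows.h>.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical debug wrapper. std::recursive_mutex is an assumed stand-in
// for CRITICAL_SECTION; real code would call EnterCriticalSection and
// LeaveCriticalSection where the mutex is locked and unlocked.
class CheckedLock {
public:
    void enter() {
        mutex_.lock();
        owner_.store(std::this_thread::get_id());
        ++depth_;  // only modified while the mutex is held
    }

    // Returns false, and changes nothing, if the calling thread does not
    // currently hold the lock (i.e. a leave without a matching enter).
    bool leave() {
        if (owner_.load() != std::this_thread::get_id()) {
            return false;  // stray leave: report it instead of corrupting state
        }
        if (--depth_ == 0) {
            owner_.store(std::thread::id());  // lock is about to become free
        }
        mutex_.unlock();
        return true;
    }

private:
    std::recursive_mutex mutex_;
    std::atomic<std::thread::id> owner_{std::thread::id()};
    int depth_ = 0;
};
```

A caller can assert on the return value of leave() in debug builds, so the stray call is reported at the point where it happens rather than at the next enter.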

As to the code causing it:

if (SysStringLen(bstrState) > 0)
    CHECKHR_CS( m_pStateManager->SetState(bstrState), &m_csStateManagerLock );

And the definition of the error-checking macro:

#define CHECKHR_CS(x, cs)                       \
    EnterCriticalSection(cs);                       \
    if( FAILED(hr = (x)) ) {                        \
        LeaveCriticalSection(cs);                   \
        return hr;                          \
    }                           \
    LeaveCriticalSection(cs);

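The macro expands to several statements with no enclosing braces, so when it is used as the body of an unbraced if, only the first statement, EnterCriticalSection, is governed by the condition. When the condition is false, EnterCriticalSection is skipped, but the rest of the expansion, including the final LeaveCriticalSection, still runs.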
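A common way to make a multi-statement macro behave as a single statement is to wrap its body in do { ... } while (0). The sketch below applies that to CHECKHR_CS. To keep it compilable without <windows.h>, HRESULT, FAILED, EnterCriticalSection and LeaveCriticalSection are replaced by simple stand-ins that only count calls; FakeCs and SetStateIfAny are hypothetical names, not part of the original code.

```cpp
#include <cassert>

// Stand-ins (assumptions) so the sketch builds without <windows.h>.
typedef long HRESULT;
#define FAILED(hr) (((HRESULT)(hr)) < 0)
struct FakeCs { int enters = 0; int leaves = 0; };
inline void EnterCriticalSection(FakeCs* cs) { ++cs->enters; }
inline void LeaveCriticalSection(FakeCs* cs) { ++cs->leaves; }

// Same logic as the original macro, but the whole body is one statement,
// so an unbraced `if` in front of it governs all of it.
#define CHECKHR_CS(x, cs)                 \
    do {                                  \
        EnterCriticalSection(cs);         \
        if (FAILED(hr = (x))) {           \
            LeaveCriticalSection(cs);     \
            return hr;                    \
        }                                 \
        LeaveCriticalSection(cs);         \
    } while (0)

// Mirrors the original call site: the macro is the body of an unbraced `if`.
// `result` stands in for the value returned by SetState(...).
HRESULT SetStateIfAny(bool hasState, HRESULT result, FakeCs* cs) {
    HRESULT hr = 0;  // S_OK
    if (hasState)
        CHECKHR_CS(result, cs);
    return hr;
}
```

With this form, a false condition skips both the enter and the leave, and the success and failure paths each perform exactly one of each.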

Corrie answered 13/1, 2012 at 4:55 Comment(2)
Consider using the RAII idiom and wrap EnterCriticalSection() in a constructor and LeaveCriticalSection() in a destructor. That way you won't forget to unlock the mutex (or unlock it twice). This is how Boost's lock_guard works. – Goltz
Good point. It would've thrown a compiler error for an undeclared variable if that was used here. – Corrie
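The RAII suggestion above could look roughly like the following. This is a minimal sketch, not the code that was eventually used: ScopedLock, FakeCs, Enter, Leave and UpdateState are hypothetical names, and FakeCs is a counting stand-in for CRITICAL_SECTION so the example compiles without <windows.h>. In real code the constructor would call EnterCriticalSection and the destructor LeaveCriticalSection.

```cpp
#include <cassert>

// Counting stand-in (assumption) for CRITICAL_SECTION.
struct FakeCs { int held = 0; };
inline void Enter(FakeCs& cs) { ++cs.held; }  // would be EnterCriticalSection
inline void Leave(FakeCs& cs) { --cs.held; }  // would be LeaveCriticalSection

// Acquires in the constructor and releases in the destructor, so every
// enter is paired with exactly one leave on every exit path.
class ScopedLock {
public:
    explicit ScopedLock(FakeCs& cs) : cs_(cs) { Enter(cs_); }
    ~ScopedLock() { Leave(cs_); }

    // Copying would release the lock twice, so it is disabled.
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    FakeCs& cs_;
};

// Returns the hold count seen inside the guarded scope, or -1 on the
// simulated failure path; the lock is released in both cases.
int UpdateState(FakeCs& cs, bool fail) {
    ScopedLock guard(cs);
    if (fail) {
        return -1;  // early return: the destructor still releases
    }
    return cs.held;
}
```

Because the guard is a named local variable, it cannot end up half inside an unbraced if the way the macro expansion did.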
