I'm having trouble finding the cause for a hang in a Win32 application. The software renders some data to an OpenGL visual in a tight loop:
std::vector<uint8_t> indices;
glPolygonMode(GL_FRONT_AND_BACK, GL_FILL);
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(2, GL_DOUBLE, 0, vertexDataBuffer);
while (...) {
    // get index type (1, 2, 4) and index count
    indices.resize(indexType * count);
    // unpack the indices into the "indices" buffer
    getIndices(indices.data(), indices.size()); //< seems to hang here!
    // draw (I'm using the correct parameters)
    glDrawElements(GL_TRIANGLES_*, count, GL_UNSIGNED_*, indices.data());
}
glDisableClientState(GL_VERTEX_ARRAY);
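To make the "index type (1, 2, 4)" comment in the loop concrete, here is a minimal sketch of how the index size could be mapped to the GL_UNSIGNED_* type token. The helper name glIndexTypeFor is hypothetical and not part of my code base; the constants repeat the values from the OpenGL headers only so that the snippet compiles on its own.

```cpp
#include <cstddef>
#include <stdexcept>

// Values as defined in the OpenGL headers, repeated here so the sketch
// compiles without including <GL/gl.h>.
const unsigned int kGlUnsignedByte  = 0x1401;  // GL_UNSIGNED_BYTE
const unsigned int kGlUnsignedShort = 0x1403;  // GL_UNSIGNED_SHORT
const unsigned int kGlUnsignedInt   = 0x1405;  // GL_UNSIGNED_INT

// Hypothetical helper: maps the size of one index in bytes (1, 2 or 4)
// to the matching type token for glDrawElements.
unsigned int glIndexTypeFor(std::size_t indexSize) {
    switch (indexSize) {
        case 1: return kGlUnsignedByte;
        case 2: return kGlUnsignedShort;
        case 4: return kGlUnsignedInt;
        default: throw std::invalid_argument("unsupported index size");
    }
}
```

With such a helper, the draw call in the loop would read glDrawElements(mode, count, glIndexTypeFor(indexType), indices.data()), where mode stands for whichever GL_TRIANGLES_* primitive applies.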
The code is compiled using VC11 Update 1 (CTP 3). When running the optimized binary, it hangs inside the call to getIndices() (more about this below) after a few iterations of that loop. So far I have:
- triple-checked all buffers, and even appended CRCs to make sure there are no buffer overruns
- added a call to HeapValidate() inside the loop to ensure the heap is not corrupt
- used Application Verifier
- enabled heap allocation monitoring using GFlags and PageHeap
- broken into WinDbg when the application locks up
I did not find any problems with the code accessing the allocated buffer, nor any heap corruption. However, if I disable the low-fragmentation heap, the issue vanishes. It also vanishes if I use a separate (low-fragmentation) heap for the indices buffer.
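For reference, the kind of overrun check mentioned in the list above can be sketched as follows. This is a simplified illustration rather than my actual code: it uses a fixed guard pattern instead of a CRC, and GuardedBuffer, kGuardSize and kGuardByte are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

const std::size_t  kGuardSize = 16;    // number of guard bytes after the payload
const std::uint8_t kGuardByte = 0xA5;  // pattern written into the guard area

// Hypothetical helper: a byte buffer followed by guard bytes that must
// stay untouched as long as all writes remain inside the payload.
class GuardedBuffer {
public:
    explicit GuardedBuffer(std::size_t payloadSize)
        : payloadSize_(payloadSize), storage_(payloadSize + kGuardSize) {
        std::memset(storage_.data() + payloadSize_, kGuardByte, kGuardSize);
    }

    std::uint8_t* data() { return storage_.data(); }
    std::size_t size() const { return payloadSize_; }

    // Returns true while no write has run past the payload into the guard.
    bool guardIntact() const {
        for (std::size_t i = 0; i < kGuardSize; ++i) {
            if (storage_[payloadSize_ + i] != kGuardByte) return false;
        }
        return true;
    }

private:
    std::size_t payloadSize_;
    std::vector<std::uint8_t> storage_;
};
```

The idea is to call guardIntact() after every getIndices() call. A check like this only catches writes that land directly behind the buffer, so it cannot rule out stray writes elsewhere.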
Anyway, here is the stack trace leading to the deadlock:
0:000> kb
ChildEBP RetAddr Args to Child
0034e328 77b039c3 00000000 0034e350 00000000 ntdll!ZwWaitForKeyedEvent+0x15
0034e394 77b062bc 77b94724 080d36a8 0034e464 ntdll!RtlAcquireSRWLockExclusive+0x12e
0034e3c0 77aeb652 0034e464 0034e4b4 00000000 ntdll!RtlpCallVectoredHandlers+0x58
0034e3d4 77aeb314 0034e464 0034e4b4 77b94724 ntdll!RtlCallVectoredExceptionHandlers+0x12
0034e44c 77aa0133 0034e464 0034e4b4 0034e464 ntdll!RtlDispatchException+0x19
0034e44c 77b062c5 0034e464 0034e4b4 0034e464 ntdll!KiUserExceptionDispatcher+0xf
0034e7bc 77aeb652 0034e860 0034e8b0 00000000 ntdll!RtlpCallVectoredHandlers+0x61
0034e7d0 77aeb314 0034e860 0034e8b0 0034ec28 ntdll!RtlCallVectoredExceptionHandlers+0x12
0034e848 77aa0133 0034e860 0034e8b0 0034e860 ntdll!RtlDispatchException+0x19
0034e848 1c43c666 0034e860 0034e8b0 0034e860 ntdll!KiUserExceptionDispatcher+0xf
0034ebe8 1c43c4e5 0034ec28 080d35d0 080d35d6 lcdb4!lc::db::PackedIndices::unpackIndices<unsigned char>+0x86
0034ec14 1c45922d 0034ec28 080d35d0 00000006 lcdb4!lc::db::PackedIndices::unpack+0xb5
...
xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx getIndices
For completeness, I posted the code of lc::db::PackedIndices::unpackIndices(), including all code added for debugging, to http://ideone.com/sVVXX7.
The code triggering the call to KiUserExceptionDispatcher is (*p++) = static_cast<T>(index); (mov dword ptr [esp+10h],eax).
I just can't figure out what's going on. An exception seems to have been raised, but none of my exception handlers are called; the application just hangs. I checked for deadlocked critical sections (!locks) but found none. Furthermore, I don't see why an exception should be raised at all, as the memory locations are all valid. Could anyone give me some hints?
Update
I tried to find the type of exception being thrown:
0:000> s -d esp L1000 1003f
0028ebdc 0001003f 00000000 00000000 00000000 ?...............
0028efd8 0001003f 00000000 00000000 00000000 ?...............
0:000> .cxr 0028ebdc
eax=77b94724 ebx=0804be30 ecx=00000002 edx=00000004 esi=77b94724 edi=0804be28
eip=77b062c5 esp=0028eec4 ebp=0028eee4 iopl=0 nv up ei ng nz na pe cy
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010287
ntdll!RtlpCallVectoredHandlers+0x61:
77b062c5 ff03 inc dword ptr [ebx] ds:002b:0804be30=00000001
0:000> .cxr 0028efd8
eax=0000003b ebx=00000001 ecx=0804bd98 edx=0028f340 esi=0028f340 edi=04b77580
eip=1c43c296 esp=0028f2c0 ebp=0028f2fc iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010202
lcdb4!lc::db::PackedIndices::unpackIndices<unsigned char>+0x36:
1c43c296 8801 mov byte ptr [ecx],al ds:002b:0804bd98=3e
[...] getIndices? – Whim
getIndices() is just a tiny wrapper, eventually calling lc::db::PackedIndices::unpack. – Garin
[...] count being zero? Your code would break if that were true. – Whim
[...] unpackIndices() in the failure case. Also, does p point to the right place at the time of failure? – Tradescantia
cnt is never 0 (the code path would not be entered). – Garin
.load sosex.dll, !rwlock => Unable to initialize .NET data interface. The CLR has not yet been loaded in the process... But this is a native app. – Garin
[...] PackedIndices::unpack (and maybe several layers above it -- as many as needed to see where the indices parameter comes from). – Tradescantia
indices is the first parameter passed to getIndices() in the code I showed in my question above. There is no code in between that would alter that pointer. – Garin
[...] glDrawElements to still access the memory while I'm modifying it, but I'm not sure. – Garin
[...] p was pointing at the moment of exception. The easiest way is probably to copy it to a global variable before each (*p++) = ... (which shouldn't distort timings). – Tradescantia
.exr -1 yields: ExceptionAddress: 77aa000c (ntdll!DbgBreakPoint), ExceptionCode: 80000003 (Break instruction exception). So the exception is just me breaking into the deadlocked application? – Garin
Attempt to write to address 06a28fd0, ExceptionAddress: 1c43c296 (lcdb4!lc::db::PackedIndices::unpackIndices<unsigned char>+0x00000036), ExceptionCode: c0000005 (Access violation). – Garin
[...] SetUnhandledExceptionFilter(uef);, where uef contains {DebugBreak(); return EXCEPTION_CONTINUE_SEARCH;}. – Garin
[...] 07751e90. So I search for !heap -p -a 07751e90, and get: address 07751e90 found in _HEAP @ 7a0000, HEAP_ENTRY Size Prev Flags UserPtr UserSize - state, 07751e88 0003 0000 [00] 07751e90 0000c - (busy). So that pointer is definitely valid. – Garin
[...] Protect: 00000002 PAGE_READONLY. But it's in the heap, and no, I never modified the page attributes. – Garin
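The suggestion in the comments to copy p to a global variable before each (*p++) = ... could look roughly like the sketch below. It is an illustrative stand-in for the unpack loop, not the real PackedIndices code; unpackIndicesTraced, g_lastWriteAddress and g_lastWriteIndex are hypothetical names, and the source is assumed to be a plain array of 32-bit values.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical debug globals. They are volatile so that the optimizer
// keeps every store, even in the optimized build.
volatile std::uintptr_t g_lastWriteAddress = 0;
volatile std::size_t g_lastWriteIndex = 0;

// Illustrative stand-in for the unpack loop: records the destination
// address and loop index immediately before each store.
template <typename T>
void unpackIndicesTraced(const std::uint32_t* src, std::size_t count, T* dst) {
    T* p = dst;
    for (std::size_t i = 0; i < count; ++i) {
        g_lastWriteAddress = reinterpret_cast<std::uintptr_t>(p);
        g_lastWriteIndex = i;
        (*p++) = static_cast<T>(src[i]);
    }
}
```

After the hang, the two globals can be read in the debugger to see which address the faulting store targeted and how far the loop had progressed.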