Determining the COM object blocking finalizer
Asked Answered
A

0

6

We have a .NET background processing application (console app) that executes some arbitrary workloads (basically internal users supply .NET DLLs implementing "Execute" interface). Background processing application then loads the dll via reflection and executes it.

One of the supplied DLLs apparently has a COM object in it which is not properly disposed (likely).

As the processing time is quite long, AND we have COM objects created on the main thread and not disposed properly, this causes a Finalizer Thread to be blocked, which subsequently causes the process to accumulate a significant amount of unreleased handles (it can go up to hundreds of thousands)

I have tried debugging the dump of that process using WinDbg, and I clearly see that the finalizer thread is blocked on the main thread in COM call.

First we look at unmanaged stack in the finalizer thread (I assume in attempt to communicate to the other thread to finalize the COM object):

0:002> kb
 # ChildEBP RetAddr  Args to Child              
00 0372eec0 74a92cc7 00000968 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
01 0372ef34 74a92c02 00000968 ffffffff 00000000 KERNELBASE!WaitForSingleObjectEx+0x99
02 0372ef48 74dd9839 00000968 ffffffff 010846e4 KERNELBASE!WaitForSingleObject+0x12
03 (Inline) -------- -------- -------- -------- combase!MTAThreadWaitForCall+0x43 [d:\blue\com\combase\dcomrem\channelb.cxx @ 5657]
04 0372ef78 74e7d524 01083ef0 0111a198 0372f1e4 combase!MTAThreadDispatchCrossApartmentCall+0x1ed [d:\blue\com\combase\dcomrem\chancont.cxx @ 193]
05 (Inline) -------- -------- -------- -------- combase!CRpcChannelBuffer::SwitchAptAndDispatchCall+0x33a2 [d:\blue\com\combase\dcomrem\channelb.cxx @ 5052]
06 0372f0cc 74d5caea 010846e4 0372f1e4 0372f1ac combase!CRpcChannelBuffer::SendReceive2+0x62d [d:\blue\com\combase\dcomrem\channelb.cxx @ 4796]
07 (Inline) -------- -------- -------- -------- combase!ClientCallRetryContext::SendReceiveWithRetry+0x2e [d:\blue\com\combase\dcomrem\callctrl.cxx @ 1090]
08 (Inline) -------- -------- -------- -------- combase!CAptRpcChnl::SendReceiveInRetryContext+0x81 [d:\blue\com\combase\dcomrem\callctrl.cxx @ 715]
09 0372f12c 74da1789 010846e4 0372f1e4 0372f1ac combase!DefaultSendReceive+0x9e [d:\blue\com\combase\dcomrem\callctrl.cxx @ 671]
0a (Inline) -------- -------- -------- -------- combase!CAptRpcChnl::SendReceive+0x38 [d:\blue\com\combase\dcomrem\callctrl.cxx @ 752]
0b 0372f190 74e7a010 010846e4 0372f1e4 0372f1ac combase!CCtxComChnl::SendReceive+0x248 [d:\blue\com\combase\dcomrem\ctxchnl.cxx @ 735]
0c 0372f1b4 749f5769 0732ade4 0372f210 749f5740 combase!NdrExtpProxySendReceive+0x5c [d:\blue\com\combase\ndr\ndrole\proxy.cxx @ 2017]
0d 0372f1cc 74a76c1b 3037cf71 0732ade4 0372f630 rpcrt4!NdrpProxySendReceive+0x29
0e 0372f5f8 74e79e1e 74d4d8f8 74d52ff0 0372f630 rpcrt4!NdrClientCall2+0x22b
0f 0372f618 74d5c46f 0372f630 00000008 0372f6b8 combase!ObjectStublessClient+0x6c [d:\blue\com\combase\ndr\ndrole\i386\stblsclt.cxx @ 215]
10 0372f628 74da1c07 0732ade4 0372f668 01075c10 combase!ObjectStubless+0xf [d:\blue\com\combase\ndr\ndrole\i386\stubless.asm @ 171]
11 0372f6b8 74da3024 01075c00 74296f1e 0372f77c combase!CObjectContext::InternalContextCallback+0x1e4 [d:\blue\com\combase\dcomrem\context.cxx @ 4428]
12 0372f70c 74296db9 01075c10 74296f1e 0372f77c combase!CObjectContext::ContextCallback+0xbc [d:\blue\com\combase\dcomrem\context.cxx @ 4332]
13 0372f804 74296e8a 74296771 0372f894 30ec4c53 clr!CtxEntry::EnterContext+0x243
14 0372f83c 74296ecb 74296771 0372f894 00000000 clr!RCW::EnterContext+0x3a
15 0372f858 74296ca8 01075cb8 30ec4c07 747f62e8 clr!RCWCleanupList::ReleaseRCWListInCorrectCtx+0xc0
16 0372f8b4 74296453 30ec4d6b 747f5660 00000000 clr!RCWCleanupList::CleanupAllWrappers+0x14d
17 0372f904 742966d8 0103d6d8 00000001 30ec4cb7 clr!SyncBlockCache::CleanupSyncBlocks+0xd0
18 0372f91c 74294a12 30ec4d47 0372fa38 0372fab0 clr!Thread::DoExtraWorkForFinalizer+0x7f
19 0372f964 741ad6b7 0372fab0 0104dbd0 0372fab0 clr!WKS::GCHeap::FinalizerThreadWorker+0x232
1a 0372f978 741ad721 30ec4e73 0372fab0 00000000 clr!Thread::DoExtraWorkForFinalizer+0x1bd
1b 0372fa1c 741ad7ee 30ec4e17 00000000 00000040 clr!Thread::DoExtraWorkForFinalizer+0x240
1c 0372fa78 742bac66 30ec4e9b 742bab00 00000000 clr!Thread::DoExtraWorkForFinalizer+0x5f0
1d 0372faf4 742a5ffd 00000000 00000000 00000000 clr!WKS::GCHeap::FinalizerThreadStart+0x1a0
1e 0372fb94 75127c04 01036258 75127be0 31b17300 clr!Thread::intermediateThreadProc+0x4d
1f 0372fba8 7723ab8f 01036258 33bfcebd 00000000 kernel32!BaseThreadInitThunk+0x24
20 0372fbf0 7723ab5a ffffffff 7721ffcf 00000000 ntdll!__RtlUserThreadStart+0x2f
21 0372fc00 00000000 742a5fb4 01036258 00000000 ntdll!_RtlUserThreadStart+0x1b

We now try to trace the call (01083ef0):

0:002> dc 010846e4
010846e4  74d4d400 74d61e2c 74d5a65c adb1682c  ...t,..t\..t,h..
010846f4  429b3908 2b59b182 e6a936e1 00000003  .9.B..Y+.6......
01084704  0000002a 04931d40 00000000 01083ef0  *...@........>..
01084714  010849d8 0732cbc0 010910c0 74d4b2e0  .I....2........t
01084724  00070005 00000003 0009b798 00000000  ................
01084734  00000000 0108478c f1eef1ee 00650054  .....G......T.e.
01084744  00680063 00470020 006f006c 00610062  c.h. .G.l.o.b.a.
01084754  002f006c 00610043 00670072 0057006f  l./.C.a.r.g.o.W.

Now we can determine the process id (same as we are within own process) and thread id of 0009b798:

0:002> dc 01083ef0
01083ef0  01083f88 74e8ddd0 00008b8c 0009b798  .?.....t........
01083f00  c502aa57 72322f52 5c326394 bf12fc0f  W...R/2r.c2\....
01083f10  5c326394 bf12fc0f 00007400 ffff8b8c  .c2\.....t......
01083f20  7579b889 1e59d93e 04000203 01f31fa0  ..yu>.Y.........
01083f30  010794d0 00000000 00000000 00000000  ................
01083f40  00000001 ffffffff 0102c3b0 0732ade4  ..............2.
01083f50  00000004 00000000 00000000 00000000  ................
01083f60  0067006e 00070005 00000000 00000000  n.g.............

Looking at threads, voila, finalizer is calling into the main thread (0009b798)

0:002> !threads
Index   TID     TEB     StackBase   StackLimit  DeAlloc StackSize   ThreadProc
0   0009b798    0xff03e000  0x00cf0000  0x00ce5000  0x00bf0000  0x0000b000  0x00000000: 
1   000c3e34    0xff03b000  0x015f0000  0x015ef000  0x014f0000  0x00001000  0x00000000: 
2   0005c870    0xff035000  0x03730000  0x0372e000  0x03630000  0x00002000  0x00000000: 
3   0006518c    0xfef06000  0x04a30000  0x04a2e000  0x049f0000  0x00002000  0x00000000: 
4   0006befc    0xfef03000  0x04bf0000  0x04bee000  0x04bb0000  0x00002000  0x00000000: 
5   00080128    0xfef00000  0x04d30000  0x04d2d000  0x04c30000  0x00003000  0x00000000: 
6   000b3ce0    0xfeef4000  0x06310000  0x0630e000  0x06210000  0x00002000  0x7036df4d: System_Data!SNIAsyncWait
7   00056c18    0xfee8f000  0x075a0000  0x0759f000  0x074a0000  0x00001000  0x00000000: 
8   00054374    0xfeef7000  0x0f690000  0x0f68f000  0x0f590000  0x00001000  0x00000000: 
9   0002c1f0    0xfee3e000  0x0c9f0000  0x0c9ed000  0x0c8f0000  0x00003000  0x00000000: 
10  0009ddb8    0xfee80000  0x07ac0000  0x07abe000  0x079c0000  0x00002000  0x00000000: 

In my WinDbg session I can switch between stack frames on the Finalizer callstack, here's an example observing SendReceive2 (although I can do that I don't know where or what to look for):

0:002> dx Debugger.Sessions[0].Processes[35724].Threads[378992].Stack.Frames[6].SwitchTo();dv /t /v
Debugger.Sessions[0].Processes[35724].Threads[378992].Stack.Frames[6].SwitchTo()
@edi              class CRpcChannelBuffer * this = 0x010846e4
0372effc          struct tagRPCOLEMESSAGE * pMessage = 0x0372f1e4
0372f004          unsigned long * pstatus = 0x0372f1ac
0372efb4          struct tagRPC_ERROR_ENUM_HANDLE EnumHandle = struct tagRPC_ERROR_ENUM_HANDLE
0372f008          int fLogEventIsActive = 0n0
<unavailable>     bool bSuccessBeforeClientNotify = <value unavailable>
0372f0a4          struct _GUID iid = {00000134-0000-0000-C000-000000000046}
<unavailable>     HRESULT result = <value unavailable>
@ebx              class CMessageCall * pCall = 0x0111a198
0372f0b4          unsigned long [5] RpcValues = unsigned long [5]
<unavailable>     bool fInstrumentPrematureStubRundown = <value unavailable>
0372f014          struct tagRPC_EXTENDED_ERROR_INFO ErrorInfo = struct tagRPC_EXTENDED_ERROR_INFO
<unavailable>     unsigned long cMax = <value unavailable>
<unavailable>     union WireThat * pOutb = <value unavailable>
0372efcc          int bEnumRpcErrorInfo = 0n0
0372efdc          HRESULT resultSave = 0x00000000
<unavailable>     struct __MIDL_XmitDefs_0003 * pLocalThat = <value unavailable>
<unavailable>     char * stub_data = <value unavailable>
0372efd4          int bSaveResult = 0n0
<unavailable>     long status = <value unavailable>
<unavailable>     struct ISystemDebugEventFire * g_pEventFire = <value unavailable>
<unavailable>     union WPP_PROJECT_CONTROL_BLOCK * WPP_GLOBAL_Control = <value unavailable>

I have two questions:

First one:

I would like to somehow determine which COM object is causing this problem, so that I can advise our internal users to fix the problem in their code (if it's not in utility DLL to which we don;t have the source). I tried walking around some "combase" methods present in the finalizer stack, however I am not really familiar with it so I don't know which method might contain the CLSID of the target COM object. Any advise from community is appreciated.

Second one:

Given that our background processing application may suffer from such malpractices, is there a way I can change it so that even if the main thread executes a long-running operation (during which and on which the dreaded COM objects are created), we can still do the finalization of those COM objects (and other objects as well as necessary).

P.S. I have an open WinDbg on this one so can execute any command necessary.

Update1:

Based on suggestions I have tracked down RCW, however it is unclear for me how to proceed further:

0:002> !dumpheap -type System.__ComObject
 Address       MT     Size
01a7f348 73386b9c       16     

Statistics:
      MT    Count    TotalSize Class Name
73386b9c        1           16 System.__ComObject
Total 1 objects
0:002> !dumprcw 01a7f348
Managed object:             00000000
Creating thread:            01a7f4b4
IUnknown pointer:           01a7f480
COM Context:                00000000
Managed ref count:          0
IUnknown V-table pointer :  01a7f464 (captured at RCW creation time)
Flags:                      IsDisconnected 
COM interface pointers:
      IP  Context       MT Type
72f6347c 00000000 08000011 System.__ComObject
7326fa50 00000000 73311c2c System.__ComObject
7337a44c 00000000 00010000 System.__ComObject

Also:

0:002> $$>a<C:\IgorM\HandleLeak\COM_FROM_RCW_PUBLIC.txt 01a7f348

 Make sure you are using Symbols and PSSCOR or SOS is loaded…

Closing open log file C:\IgorM\HandleLeak\CommandLog_2_continue.log
Opened log file 'TEMP.LOG'
Index         SyncBlock MonitorHeld Recursion Owning Thread Info          SyncBlock Owner
   16 0110e240            0         0 00000000     none    01a7f348 System.__ComObject
-----------------------------
Total           743
CCW             0
RCW             1
ComClassFactory 0
Free            0
Closing open log file TEMP.LOG
SyncBlock address = 0110e240

This is the COM object referenced by the RCW object from address 0110e240:

741f6114  74252315 clr!AssemblySecurityDescriptor::`vector deleting destructor'
741f6118  741bf3f9 clr!SecurityDescriptorBase<IApplicationSecurityDescriptor>::IsFullyTrusted
741f611c  741d3b9f clr!SecurityDescriptorBase<IApplicationSecurityDescriptor>::CanCallUnmanagedCode
741f6120  742e258e clr!ZapVirtualSection::GetSize
741f6124  741f5b78 clr!SecurityDescriptorBase<IApplicationSecurityDescriptor>::GetDomain
741f6128  741d71b0 clr!AssemblySecurityDescriptor::Resolve
741f612c  741d71ac clr!SecurityDescriptorBase<IApplicationSecurityDescriptor>::IsResolved
741f6130  742f922a clr!AssemblySecurityDescriptor::GetEvidence
741f6134  742d6458 clr!fusion::logging::BindingScope::GetAssemblyName
741f6138  74317958 clr!SecurityDescriptorBase<IApplicationSecurityDescriptor>::SetEvidence
741f613c  741bfefc clr!SecurityDescriptorBase<IApplicationSecurityDescriptor>::GetGrantedPermissionSet
741f6140  741d71d4 clr!AssemblySecurityDescriptor::GetSharedSecDesc
741f6144  742059db clr!AssemblySecurityDescriptor::CanAssert
741f6148  74588b41 clr!AssemblySecurityDescriptor::HasUnrestrictedUIPermission
741f614c  74588b4a clr!AssemblySecurityDescriptor::IsAllCritical
741f6150  74588b6c clr!AssemblySecurityDescriptor::IsAllSafeCritical
Amitie answered 7/12, 2016 at 6:1 Comment(8)
My thoughts: a COM object is not used in .NET directly. It's used by an RCW. An RCW is a managed object, so you need to find that one first (!dso, !dumpheap, !clrstack may help). Once you have the RCW, use !dumpRCW to find the COM object. There's also SOSEX !finq and !rcwStaggers
I'll try that, thanks! :)Amitie
COM objects are backed by System.__ComObject, so you can dump them with !sos.dumpheap -type System.__ComObject. Then, as Thomas wrote, use !sos.dumprcw on the COM object addresses.Dividend
I've updated the answer with results of these commands.Amitie
"As the processing time is quite long" is enough of an explanation. The main thread of a program that creates COM objects must observe the STA promise it makes. Two simple rules it must follow: pump a message loop (Application.Run() in .NET) and it must never block. When "processing time is quite long" then it is blocking. Tends to have more side-effects, like a user interface is that not responsive anymore. Deadlock is a common hazard as well. You must rewrite that code so that the expensive code runs on a worker thread.Minimal
But won't we have the same situation if the "worker" thread that runs expensive code creates a COM object? The COM object will then won't be finalized as long as this thread is running it's operation, and the finalizer will be stuck for that process as a whole trying to call into that worker thread to finalize that COM object?Amitie
@IgorMalin The private PDB symbols for combase are available from the MS symbol server. So do a kp on the finalizer thread. Dump the tagComCallData parameter (dx -r1 (((combase!tagComCallData *)<tagComCallData address>))). Then dump the third offset pUserDefined (void) (dps <pUserDefined address>). The second offset is the RCW*. Use ? poi(<RCW* address>) to find the RCW. Then !sos.dumprcw <RCW address> for details. Use dps on the "IUnknown V-table pointer" to identify the COM object.Dividend
[Continued] Finally, dump the pOXIDEntry parameter to MTAThreadDispatchCrossApartmentCall. It will show process ID (_dwPid) and thread ID (_dwTid) identifying the target COM thread. This is probably a STA thread, but you can confirm with !sos.threads. A COM object identified above was created there and the finalizer thread is trying to clean it up.Dividend

© 2022 - 2024 — McMap. All rights reserved.