RCW Finalizer Access Violation

I am using COM interop to create a managed plugin for an unmanaged application, using VS2012/.NET 4.5/Win8.1. All the interop code seems to work, but when I close the app I get an MDA exception telling me that access violations (AVs) occurred during finalization, while the RCWs were releasing the COM objects they held.

This is the call stack:

clr.dll!MdaReportAvOnComRelease::ReportHandledException()  + 0x91 bytes 
clr.dll!SafeRelease_OnException()  + 0x55 bytes 
clr.dll!SafeReleasePreemp()  + 0x312d5f bytes   
clr.dll!RCW::ReleaseAllInterfaces()  + 0xf3 bytes   
clr.dll!RCW::ReleaseAllInterfacesCallBack()  + 0x4f bytes   
clr.dll!RCW::Cleanup()  + 0x24 bytes    
clr.dll!RCWCleanupList::ReleaseRCWListRaw()  + 0x16 bytes   
clr.dll!RCWCleanupList::ReleaseRCWListInCorrectCtx()  + 0x9c bytes  
clr.dll!RCWCleanupList::CleanupAllWrappers()  + 0x2cd1b6 bytes  
clr.dll!RCWCache::ReleaseWrappersWorker()  + 0x277 bytes    
clr.dll!AppDomain::ReleaseRCWs()  + 0x120cb2 bytes  
clr.dll!ReleaseRCWsInCaches()  + 0x3f bytes 
clr.dll!InnerCoEEShutDownCOM()  + 0x46 bytes    
clr.dll!WKS::GCHeap::FinalizerThreadStart()  + 0x229 bytes  
clr.dll!Thread::intermediateThreadProc()  + 0x76 bytes  
kernel32.dll!BaseThreadInitThunk()  + 0xd bytes 
ntdll.dll!RtlUserThreadStart()  + 0x1d bytes    

My guess is that the application has already destroyed its COM objects, some of which were passed by reference to the managed plugin, and that the call to IUnknown::Release made by the RCW then hits an invalid pointer.

I can clearly see in the Visual Studio output window that the app has already started unloading some of its DLLs.

'TestHost.exe': Unloaded 'C:\Windows\System32\msls31.dll'
'TestHost.exe': Unloaded 'C:\Windows\System32\usp10.dll'
'TestHost.exe': Unloaded 'C:\Windows\System32\riched20.dll'
'TestHost.exe': Unloaded 'C:\Windows\System32\version.dll'
First-chance exception at 0x00000001400cea84 in VST3PluginTestHost.exe: 0xC0000005: Access violation reading location 0xffffffffffffffff.
First-chance exception at 0x00000001400cea84 in VST3PluginTestHost.exe: 0xC0000005: Access violation reading location 0xffffffffffffffff.
Managed Debugging Assistant 'ReportAvOnComRelease' has detected a problem in 'C:\Program Files\Steinberg\VST3PluginTestHost\VST3PluginTestHost.exe'.
Additional Information: An exception was caught but handled while releasing a COM interface pointer through Marshal.Release or Marshal.ReleaseComObject or implicitly after the corresponding RuntimeCallableWrapper was garbage collected. This is the result of a user refcount error or other problem with a COM object's Release. Make sure refcounts are managed properly.  The COM interface pointer's original vtable pointer was 0x406975a8. While these types of exceptions are caught by the CLR, they can still lead to corruption and data loss so if possible the issue causing the exception should be addressed

So I thought I would manage the lifetime myself and wrote a ComReference class that calls Marshal.ReleaseComObject. That did not work correctly, and after reading up on it I have to agree that calling Marshal.ReleaseComObject in a scenario where references are passed around freely is not a good idea (see "Marshal.ReleaseComObject Considered Dangerous").

So the question is: is there a way to manage this situation so that no AVs occur when the host application exits?

Monkhood answered 4/12, 2013 at 9:37 Comment(4)
Unmanaged code has a knack for corrupting the heap; it is a very common problem, and one that can go undetected for a long time, until there is a reason to revisit old allocations. Program exit is such a reason. Clearly, tinkering with your managed code isn't going to solve this problem. - Mastat
I don't think this is a problem of bad unmanaged code. The problem is that the host app's COM components are already destroyed when the CLR shuts down, tries to free its RCWs, and calls IUnknown::Release on an invalid interface pointer... - Monkhood
Well, that's not entirely impossible, but of course a COM component should never unload until its last reference has been released, which is what the finalizer does. Pay attention to the debugger notifications; they tell you when a DLL gets unloaded. - Mastat
So you're saying that the app should defer shutting down until all references to its COM components are released? But if the app doesn't shut down, the CLR will never unload and there is no telling when that finalizer thread will run... Seems to me you just created another (bigger) problem... - Monkhood

There are only three real solutions to this problem, and I think that interpreting the "Marshal.ReleaseComObject Considered Dangerous" article as "don't use Marshal.ReleaseComObject" can mislead you. Your takeaway could just as easily have been "don't share RCWs freely".

Your three solutions are:

1: Change the host application so that it unloads plugins before it unloads itself. That's easier said than done. If the plugin system of the host process includes a shutdown event, that would be a good place to deal with it. All of your services that are holding on to RCWs need to release them during shutdown.

2: Use Marshal.ReleaseComObject in a Dispose()-like pattern, ensuring that objects are only stored within a local scope, in a manner similar to a using block. This is straightforward to implement, allows you to release the COM references deterministically, and is generally a very good first approach.

3: Use a COM object broker that hands out reference-counted instances of RCWs and releases those objects when no one is using them. Ensure that every consumer of those objects cleans up before the application unloads.

Option #2 works fine as long as you don't store or share references to the managed RCW. I would use #2 until you establish that your COM object has high activation costs and that caching or sharing is worthwhile.

Hartzell answered 13/1, 2014 at 15:53 Comment(1)
There are two occasions where I encountered the problem: at shutdown, and during a frequently invoked call that passes an interface reference. Your 3 points will not help with the shutdown issue, because it is the host-implemented objects that are cleaned up before the CLR shuts down. The second scenario baffles me. I would have thought that the reference that is passed would get a new RCW every time. That reference is only used within the context of that method. And if the same object is used by the host (which I suspect it is), that RCW should keep on living, shouldn't it? - Monkhood

This is a problem with the native COM reference counts. Your object is being Release()d from native code while its refcount is 1, so it is destroyed; then the CLR comes along and tries to Release() it. You need to track down where the reference count is going wrong. The crash surfaces in the CLR because the CLR runs its cleanup after the native code has finished.

The first step is to track down the type of object that isn't being counted properly. I did this by running gflags.exe against my .exe file and turning on "User mode stack traces". "Full page heap" may also help.

Run the application in WinDbg. First fix up the symbol path:

.symfix

Then set a breakpoint that logs the interface pointers, and continue:

bp clr!SafeReleasePreemp "r rcx; gc"; g

When it crashes, the previous log entry should contain the interface pointer that was already destroyed. The following command prints the stack of where it was released:

!heap -p -a [address of COM pointer]

If you're unlucky, it won't crash right away, and the interface pointer that is causing problems won't be the most recent log. If you can run your native COM under the Debug configuration, it may help.

MS made the RCW header available. The members m_pIdentity (offset 0x88 on x64) and m_aInterfaceEntries (offset 0x8 on x64) are of interest. The RCW is in @rdx on entry to SafeReleasePreemp.

Next step is to rerun with breakpoints on Interface::AddRef, Interface::QueryInterface, and Interface::Release to see which one is mismatched. _ATL_DEBUG_INTERFACES may help if you're using ATL.
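
To make the idea of a "mismatched" call concrete, here is a minimal, portable C++ sketch. It is not real COM code: TracedObject, its caller labels, and FindCulprits are hypothetical names invented for illustration, and the object only simulates a reference count instead of implementing IUnknown. It shows why the party that crashes (the RCW) is not necessarily the party that made the counting mistake.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a COM object. It simulates the reference count
// and records a per-caller balance of AddRef minus Release calls.
class TracedObject {
public:
    // The creator implicitly holds the first reference.
    explicit TracedObject(const std::string& creator) {
        balance_[creator] = 1;
    }

    long AddRef(const std::string& caller) {
        ++balance_[caller];
        return ++refs_;
    }

    long Release(const std::string& caller) {
        --balance_[caller];
        if (destroyed_) {
            // In a real process the object's memory is gone at this point;
            // this is the kind of call that would raise the access violation.
            ++releases_after_destroy_;
            return 0;
        }
        if (--refs_ == 0) {
            destroyed_ = true;
        }
        return refs_;
    }

    bool Destroyed() const { return destroyed_; }
    int ReleasesAfterDestroy() const { return releases_after_destroy_; }

    // Callers that released more references than they acquired.
    std::vector<std::string> OverReleasers() const {
        std::vector<std::string> result;
        for (const auto& entry : balance_) {
            if (entry.second < 0) {
                result.push_back(entry.first);
            }
        }
        return result;
    }

private:
    long refs_ = 1;
    bool destroyed_ = false;
    int releases_after_destroy_ = 0;
    std::map<std::string, long> balance_;
};

// Simulated scenario (an assumption, not taken from the question's host):
// the host releases once too often, and the RCW's correct Release arrives late.
std::vector<std::string> FindCulprits() {
    TracedObject obj("host");   // host creates the object, refcount 1
    obj.AddRef("rcw");          // the CLR wraps it in an RCW, refcount 2
    obj.Release("host");        // legitimate release, refcount 1
    obj.Release("host");        // the bug: one Release too many, object destroyed
    obj.Release("rcw");         // finalizer thread releases a dead object
    return obj.OverReleasers();
}
```

In this simulated run, the "rcw" caller's balance ends at zero even though its Release is the call that lands on the destroyed object; only "host" shows a negative balance. Logging real AddRef and Release calls per call site, via the breakpoints above or _ATL_DEBUG_INTERFACES, serves the same purpose.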

Tenno answered 4/12, 2017 at 23:30 Comment(0)
