Do I absolutely need to call ReleaseComObject on every MSHTML object?

I'm using MSHTML with a WebBrowser control because it gives me access to things the WebBrowser control doesn't expose, such as text nodes. I've seen several posts here and on the web where people say you must call ReleaseComObject for every COM object you reference. So, say I do this:

var doc = myBrowser.Document.DomDocument as IHTMLDocument2;

Do I need to release doc? How about body in this code:

var body = (myBrowser.Document.DomDocument as IHTMLDocument2).body;

Aren't these objects wrapped by an RCW that would release them as soon as there are no more references to them? If not, would it be a good idea to create a wrapper for each of them with a finalizer (instead of using Dispose) that would release them as soon as the garbage collector kicks in, so that I don't need to worry about manually disposing them?

The thing is, my application has a memory leak and I believe it is related to this. According to ANTS memory profiler, the following function (among many others that use MSHTML objects) holds a reference to a bunch of Microsoft.CSharp.RuntimeBinder.Semantics.LocalVariableSymbol objects, which are at the top of the list of objects using memory in Generation 2:

internal static string GetAttribute(this IHTMLDOMNode element, string name)
{
    var attribute = element.IsHTMLElement() ? ((IHTMLElement)element).getAttribute(name) : null;
    if (attribute != null) return attribute.ToString();
    return "";
}

Not sure what's wrong here since attribute is just a string.

Here is another function that is shown on the ANTS profiler's Instance Retention Graph (I added a bunch of FinalReleaseComObject calls, but it is still shown):

private void InjectFunction(IHTMLDocument2 document)
{
    if (null == Document) throw new Exception("Cannot access current document's HTML or document is not an HTML.");

    try
    {
        IHTMLDocument3 doc3 = document as IHTMLDocument3;
        IHTMLElementCollection collection = doc3.getElementsByTagName("head");
        IHTMLDOMNode head = collection.item(0);
        IHTMLElement scriptElement = document.createElement("script");
        IHTMLScriptElement script = (IHTMLScriptElement)scriptElement;
        IHTMLDOMNode scriptNode = (IHTMLDOMNode)scriptElement;
        script.text = CurrentFuncs;
        head.AppendChild(scriptNode);
        if (Document.InvokeScript(CurrentTestFuncName) == null) throw new Exception("Cannot inject Javascript code right now.");
        Marshal.FinalReleaseComObject(scriptNode);
        Marshal.FinalReleaseComObject(script);
        Marshal.FinalReleaseComObject(scriptElement);
        Marshal.FinalReleaseComObject(head);
        Marshal.FinalReleaseComObject(collection);
        //Marshal.FinalReleaseComObject(doc3);
    }
    catch (Exception ex)
    {
        throw ex;
    }
}

I added the ReleaseComObject call, but the function still seems to be holding a reference to something. Here is what my function looks like now:

private void InjectFunction(IHTMLDocument2 document)
{
    if (null == Document) throw new Exception("Cannot access current document's HTML or document is not an HTML.");

    try
    {
        IHTMLDocument3 doc3 = document as IHTMLDocument3;
        IHTMLElementCollection collection = doc3.getElementsByTagName("head");
        IHTMLDOMNode head = collection.item(0);
        IHTMLElement scriptElement = document.createElement("script");
        IHTMLScriptElement script = (IHTMLScriptElement)scriptElement;
        IHTMLDOMNode scriptNode = (IHTMLDOMNode)scriptElement;
        script.text = CurrentFuncs;
        head.AppendChild(scriptNode);
        if (Document.InvokeScript(CurrentTestFuncName) == null) throw new Exception("Cannot inject Javascript code right now.");
        Marshal.FinalReleaseComObject(scriptNode);
        Marshal.FinalReleaseComObject(script);
        Marshal.FinalReleaseComObject(scriptElement);
        Marshal.FinalReleaseComObject(head);
        Marshal.FinalReleaseComObject(collection);
        Marshal.ReleaseComObject(doc3);
    }
    catch (Exception ex)
    {
        MessageBox.Show("Couldn't release!");
        throw ex;
    }
}

The MessageBox.Show("Couldn't release!"); line is never hit, so I assume everything has been released properly. Here is what ANTS shows:

[Image: ANTS memory profiler screenshot]

I have no idea what that site container thing is.

Sotelo answered 9/2, 2013 at 15:34 Comment(1)
If you have COM objects in a method, always clean them up before throwing exceptions, since they can't be cleaned up after the exception is thrown... – Craiova

The RCW will release the COM object when the RCW is finalized, so you don't need to create a wrapper that does this. You call ReleaseComObject because you don't want to wait around for the finalization; this is the same rationale as for the Dispose pattern. So creating wrappers that can be disposed isn't a bad idea (and there are examples out there).
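
As a rough illustration of the disposable-wrapper idea, here is a minimal sketch. ComRef<T> is a hypothetical name introduced only for this example; it is not part of MSHTML or the .NET Framework. It relies only on Marshal.IsComObject and Marshal.ReleaseComObject, and it assumes the wrapper is the sole owner of the reference it was handed.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper (an assumption for illustration, not an existing API).
// Wraps one COM reference so it can be released deterministically via "using".
public sealed class ComRef<T> : IDisposable where T : class
{
    private T _value;

    public ComRef(T value)
    {
        _value = value;
    }

    // The wrapped object; throws if the wrapper is empty or already disposed.
    public T Value
    {
        get
        {
            if (_value == null) throw new ObjectDisposedException("ComRef");
            return _value;
        }
    }

    public void Dispose()
    {
        T value = _value;
        _value = null; // makes a second Dispose call a no-op

        // IsComObject guards against a plain managed object being wrapped by mistake.
        if (value != null && Marshal.IsComObject(value))
        {
            Marshal.ReleaseComObject(value);
        }
    }
}
```

Usage would look something like using (var script = new ComRef<IHTMLElement>(document.createElement("script"))) { /* work with script.Value */ }. One caveat worth keeping in mind: managed references to the same underlying COM object generally share a single RCW, so releasing through one variable can also invalidate the others.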

For var doc = myBrowser.Document.DomDocument ...;, you should also capture .Document in a separate variable and ReleaseComObject it as well. Any time you reference a property of a COM object which produces another object, make sure to release it.

In GetAttribute, you're casting the element to another interface. In COM programming, that adds another reference. You'll need to do something like var htmlElement = (IHTMLElement) element; so you can release that as well.

Edit - this is the pattern to use when working with COM objects:

IHTMLElement element = null;
try
{
    element = <some method or property returning a COM object>;
    // do something with element
}
catch (Exception ex) // although the exception type should be as specific as possible
{
    // log, whatever

    throw; // not "throw ex;" - that makes the call stack think the exception originated right here
}
finally
{
    if (element != null)
    {
        Marshal.ReleaseComObject(element);
        element = null;
    }
}

This should really be done for every COM object reference you have.
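
If repeating that finally block everywhere gets tedious, it could be folded into a small helper. The sketch below is only an assumption about how one might do it; ComCleanup and Release are made-up names, and the only framework calls used are Marshal.IsComObject and Marshal.ReleaseComObject.

```csharp
using System.Runtime.InteropServices;

// Hypothetical helper (illustrative only): performs the null check, the
// release, and the reset to null from the finally block in one call.
public static class ComCleanup
{
    public static void Release<T>(ref T comObject) where T : class
    {
        if (comObject == null) return;

        // Only real RCWs are passed to ReleaseComObject; other objects are just cleared.
        if (Marshal.IsComObject(comObject))
        {
            Marshal.ReleaseComObject(comObject);
        }

        comObject = null; // prevents accidental reuse of a released reference
    }
}
```

The finally block then shrinks to ComCleanup.Release(ref element);. Passing the variable by ref is what lets the helper null it out for the caller.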

Craiova answered 9/2, 2013 at 15:51 Comment(7)
Does this mean instead of releasing them I can just do GC.Collect() every once in a while and the RCW wrappers will take care of every COM object that doesn't have a reference anymore (this seems way easier than having to manually release them everywhere I use them, which is like a thousand places)? – Sotelo
I tried what you suggested regarding my GetAttribute method and I got a "COM object that has been separated from its underlying RCW cannot be used." error. It seems that doing that releases the whole node, not just the IHTMLElement interface. – Sotelo
You definitely should not call GC.Collect(), pretty much ever. It won't even release COM objects when you call it, necessarily - only the ones that have already been marked for finalization. Here is a CodeProject article that discusses a more automated way of freeing COM objects when you're done with them. The bottom line, though, is that when you combine two systems with very different methods of memory management, there is bound to be pain. – Craiova
I see you use FinalReleaseComObject quite a bit. Use ReleaseComObject instead. FinalReleaseComObject calls IUnknown::Release() until the reference count is zero - so if you call it, say, on an element that has been cast to IHTMLElement, then, sure, the original object is going to get released. That's what FinalReleaseComObject is for - to say you're really, finally done with the object. – Craiova
myBrowser.Document is a System.Windows.Forms.HtmlDocument, not a COM object, IIRC. So ReleaseComObject cannot be called for it. – Dalrymple
@PaulB. - The documentation says the Document property is a System.Object, must be cast to the expected COM interface, and requires permission to access unmanaged code. You may be right, but these don't seem like typical attributes of .NET objects. – Craiova
@Craiova - Thanks for your input. Seems it's different for WinForms. So let's agree we are both right :-) – Dalrymple

This article may shed some light:

MSDN on how COM refcounting works and some basic rules when to call AddRef and Release

In your case, Release is ReleaseComObject.

Alleluia answered 9/2, 2013 at 20:43 Comment(1)
Absolutely. A "reference" is something very different to a COM object than to a .NET object, and understanding the difference is crucial. Thanks! – Craiova
