Repeated destructor calls and tracking handles in C++/CLI

I'm playing around with C++/CLI, using the MSDN documentation and the ECMA standard, and Visual C++ Express 2010. What struck me was the following departure from C++:

For ref classes, both the finalizer and destructor must be written so they can be executed multiple times and on objects that have not been fully constructed.

I concocted a little example:

#include <iostream>

ref struct Foo
{
    Foo()  { std::wcout << L"Foo()\n"; }
    ~Foo() { std::wcout << L"~Foo()\n"; this->!Foo(); }
    !Foo() { std::wcout << L"!Foo()\n"; }
};

int main()
{
    Foo ^ r;

    {
        Foo x;
        r = %x;
    }              // #1

    delete r;      // #2
}

At the end of the block at #1, the automatic variable x dies, and the destructor is called (which in turn calls the finalizer explicitly, as is the usual idiom). This is all fine and well. But then I delete the object again through the handle r! The output is this:

Foo()
~Foo()
!Foo()
~Foo()
!Foo()

Questions:

  1. Is it undefined behavior, or is it entirely acceptable, to call delete r on line #2?

  2. If we remove line #2, does it matter that r is still a tracking handle for an object that (in the sense of C++) no longer exists? Is it a "dangling handle"? Does its reference counting entail that there will be an attempted double deletion?

    I know that there isn't an actual double deletion, as the output becomes this:

    Foo()
    ~Foo()
    !Foo()
    

    However, I'm not sure whether that's a happy accident or guaranteed to be well-defined behaviour.

  3. Under which other circumstances can the destructor of a managed object be called more than once?

  4. Would it be OK to insert x.~Foo(); immediately before or after r = %x;?

In other words, do managed objects "live forever" and can have both their destructors and their finalizers called over and over again?


In response to @Hans's demand for a non-trivial class, you may also consider this version (with destructor and finalizer made to conform to the multiple-call requirement):

ref struct Foo
{
    Foo()
    : p(new int[10])
    , a(gcnew cli::array<int>(10))
    {
        std::wcout << L"Foo()\n";
    }

    ~Foo()
    {
        delete a;
        a = nullptr;

        std::wcout << L"~Foo()\n";
        this->!Foo();
    }

    !Foo()
    {
        delete [] p;
        p = nullptr;

        std::wcout << L"!Foo()\n";
    }

private:
    int             * p;
    cli::array<int> ^ a;
};
Emmy answered 2/9, 2012 at 22:31 Comment(2)
Printing strings in methods has little to do with what the runtime actually does. The point of writing a destructor and finalizer is to actually do something meaningful. Yes, you're allowed to call this->!Foo, behold the power. That doesn't actually have anything to do with the rulez the GC uses to call the finalizer. Aim gun at foot, pull trigger. Real code dies on a NRE or AV. – Oxfordshire
@HansPassant: So what's the meaning of that standard clause then? – Emmy

I'll just try to address the issues you bring up in order:

For ref classes, both the finalizer and destructor must be written so they can be executed multiple times and on objects that have not been fully constructed.

The destructor ~Foo() simply makes the compiler auto-generate two methods: an implementation of IDisposable::Dispose() and a protected Foo::Dispose(bool) method that implements the disposable pattern. These are plain methods and therefore may be invoked multiple times. C++/CLI permits calling the finalizer directly, as in this->!Foo(), and this is commonly done, just as you did. The garbage collector only ever calls the finalizer once; it keeps track internally of whether that has been done. Given that calling the finalizer directly is permitted and that calling Dispose() multiple times is allowed, it is possible to run the finalizer code more than once. This is specific to C++/CLI; other managed languages don't allow it. You can easily guard against it; a nullptr check usually gets the job done.
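The nullptr guard mentioned above can be sketched in plain C++. This is only a stand-in for a ref class, not C++/CLI itself: NativeBuffer, dispose() and finalize() are hypothetical names that play the roles of Foo, ~Foo() and !Foo().

```cpp
#include <cassert>
#include <cstddef>

// Plain C++ stand-in for a ref class; all names here are hypothetical.
class NativeBuffer {
public:
    explicit NativeBuffer(std::size_t n) : p_(new int[n]), releases_(0) {}

    // Plays the role of !Foo(): safe to run any number of times.
    void finalize() {
        if (p_ != nullptr) {   // guard: release the resource only once
            delete[] p_;
            p_ = nullptr;
            ++releases_;
        }
    }

    // Plays the role of ~Foo() / Dispose(): forwards to the finalizer.
    void dispose() { finalize(); }

    int releases() const { return releases_; }

    ~NativeBuffer() { finalize(); }
    NativeBuffer(const NativeBuffer&) = delete;
    NativeBuffer& operator=(const NativeBuffer&) = delete;

private:
    int* p_;
    int releases_;
};

// Runs the cleanup path three times and reports how often memory was freed.
int release_count_after_repeated_dispose() {
    NativeBuffer b(10);
    b.dispose();
    b.dispose();
    b.finalize();
    int count = b.releases();
    assert(count == 1);        // repeated calls did not release twice
    return count;
}
```

Calling release_count_after_repeated_dispose() from main() returns 1: the second and third cleanup calls see the null pointer and do nothing.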

Is it undefined behavior, or is it entirely acceptable, to call delete r on line #2?

It is not UB and entirely acceptable. The delete operator simply calls the IDisposable::Dispose() method and thus runs your destructor. What you do inside it, very typically calling the destructor of an unmanaged class, may well invoke UB.

If we remove line #2, does it matter that r is still a tracking handle

No. Invoking the destructor is entirely optional, and there is no good way to enforce it. Nothing goes wrong, the finalizer ultimately will always run. In the given example that will happen when the CLR runs the finalizer thread one last time before shutting down. The only side effect is that the program runs "heavy", holding on to resources longer than necessary.

Under which other circumstances can the destructor of a managed object be called more than once?

It's pretty common: an overzealous C# programmer may well call your Dispose() method more than once. Classes that provide both a Close and a Dispose method are pretty common in the framework. There are some patterns where it is nearly unavoidable, such as when another class assumes ownership of an object. The standard example is this bit of C# code:

using (var fs = new FileStream(...))
using (var sw = new StreamWriter(fs)) {
    // Write file...
}

The StreamWriter object will take ownership of its base stream and call its Dispose() method at the last curly brace. The using statement on the FileStream object then calls Dispose() a second time. Writing this code so that this doesn't happen while still providing exception guarantees is too difficult. Specifying that Dispose() may be called more than once solves the problem.
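The double call can be modelled in plain C++ as a rough sketch. Stream, Writer and Using are hypothetical stand-ins for FileStream, StreamWriter and the C# using statement; none of them are real .NET types.

```cpp
#include <cassert>

// Hypothetical stand-in for FileStream: counts Dispose calls and real closes.
struct Stream {
    int dispose_calls = 0;
    int closes = 0;
    bool open = true;

    void dispose() {
        ++dispose_calls;
        if (open) {            // idempotent: only the first call closes
            open = false;
            ++closes;
        }
    }
};

// Hypothetical stand-in for StreamWriter: owns its base stream.
struct Writer {
    explicit Writer(Stream& s) : base(s) {}
    void dispose() { base.dispose(); }
    Stream& base;
};

// Scope guard playing the role of a C# using statement.
template <typename T>
struct Using {
    explicit Using(T& t) : target(t) {}
    ~Using() { target.dispose(); }
    Using(const Using&) = delete;
    Using& operator=(const Using&) = delete;
    T& target;
};

// Mirrors the nested using statements and returns the stream's final state.
Stream run_nested_usings() {
    Stream fs;
    {
        Using<Stream> outer(fs);
        Writer sw(fs);
        Using<Writer> inner(sw);
        // Write file...
    }   // inner disposes the writer (and its stream), then outer disposes the stream again
    assert(!fs.open);
    return fs;
}
```

After run_nested_usings(), dispose_calls is 2 while closes is 1, which is the behaviour the "Dispose() may be called more than once" rule is meant to make safe.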

Would it be OK to insert x.~Foo(); immediately before or after r = %x;?

It's okay. The outcome is unlikely to be pleasant; a NullReferenceException would be the most likely result when the object is used afterwards. This is something that you should test for: raise an ObjectDisposedException to give the programmer a better diagnostic. All standard .NET framework classes do so.
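A minimal plain C++ sketch of that use-after-dispose check follows. Resource is a hypothetical class, and std::logic_error merely stands in for ObjectDisposedException, which a real ref class would throw instead.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical class holding one native resource.
class Resource {
public:
    Resource() : p_(new int(42)) {}
    ~Resource() { dispose(); }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;

    // Safe to call repeatedly: deleting a null pointer is a no-op.
    void dispose() {
        delete p_;
        p_ = nullptr;
    }

    // Checks for a disposed object before touching the resource.
    int value() const {
        if (p_ == nullptr)
            throw std::logic_error("Resource used after dispose");
        return *p_;
    }

private:
    int* p_;
};

// Returns true if using the object after dispose() raises the diagnostic.
bool use_after_dispose_throws() {
    Resource r;
    assert(r.value() == 42);   // usable before disposal
    r.dispose();
    try {
        r.value();
    } catch (const std::logic_error&) {
        return true;           // clear diagnostic instead of a null dereference
    }
    return false;
}
```

The explicit check turns what would otherwise be a null dereference into an error that names the actual mistake.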

In other words, do managed objects "live forever"

No, the garbage collector declares the object dead, and collects it, when it cannot find any references to the object anymore. This is a fail-safe approach to memory management: there is no way to accidentally reference a deleted object, because doing so requires a reference, one that the GC will always see. Common memory management problems like circular references are not an issue either.

Code snippet

Deleting the a object is unnecessary and has no effect. You only delete objects that implement IDisposable; an array does not do so. The common rule is that a .NET class only implements IDisposable when it manages resources other than memory, or when it has a field of a class type that itself implements IDisposable.

It is furthermore questionable whether you should implement a destructor in this case. Your example class is holding on to a rather modest unmanaged resource. By implementing the destructor, you impose the burden on the client code to use it. How easy that is for the client programmer depends strongly on how the class is used. It definitely is not easy if the object is expected to live for a long time, beyond the body of a method, so that the using statement isn't usable. You can let the garbage collector know about memory consumption that it cannot track by calling GC::AddMemoryPressure(). That also takes care of the case where the client programmer simply doesn't use Dispose() because it is too hard.

Oxfordshire answered 9/9, 2012 at 15:56 Comment(10)
Thank you, a very fine answer! – Emmy
A question of style: Would you recommend using automatic objects whenever possible, no matter whether they're managed or unmanaged? It seems to me like having a destructor (like in my code snippet) isn't a problem if we use only automatic variables. And we can even write a unique_cli_ptr template :-) – Emmy
+1 -- but "The finalizer will ultimately always run" is false. See blogs.msdn.com/b/oldnewthing/archive/2010/08/09/10047586.aspx – Relinquish
@Kerrek: No, because if that "automatic object" is called by anyone other than C++/CLI (e.g. C#) the semantics you want are difficult for that code to follow. C++/CLI is not C++. Code in C++/CLI looks far more like C#, because idioms and such are driven by the CLR, not by traditional ideas from C++. – Relinquish
@BillyONeal: Ooh, I see. I was trying to view C++/CLI as a "beefed up" version of C++. Is it actually useful as a language in its own right (e.g. because I just want to use the .NET framework), or is it purely for interop modules? – Emmy
@Kerrek: I would use it for interop modules only. And even then, only if you need to interop with C++ classes. If you only need to interop with C then I'd use P/Invoke or unsafe blocks instead in C#. C++/CLI makes it far too easy to screw up and use a pointer somewhere in CLR land, which instantly makes your whole module incapable of operating in partial trust environments, like Silverlight, Windows Phone 7, etc. It can also be confusing because of its "dual view of the world" that you're seeing here. There are CLR bits and native bits, and the semantics are very different. – Relinquish
@Kerrek: But that's just my opinion. :) YMMV – Relinquish
@BillyONeal: Much appreciated :-) I only recently started playing with CLI after I accidentally downloaded VS-Express 2010. Then I got interested because Herb Sutter's "rationale for C++/CLI" paper made it sound very exciting. – Emmy
The C++/CLI term for "automatic objects" is stack semantics. Write the reference type variable without the hat. Yes, you'd definitely use it since there is no equivalent of the using statement. The smart pointer type already exists, msclr::auto_handle msdn.microsoft.com/en-us/library/ms177065.aspx – Oxfordshire
@Billy: There are times when you have to use p/invoke (e.g. mixed assemblies aren't possible on the Compact Framework), but I would never choose p/invoke for calling C APIs on a platform that supports C++/CLI. RAII is just so much better for keeping track of resources than anything C# offers. Combine that with the ability to use the header file as intended, and C++/CLI is a clear win in terms of ease of coding, maintainability, and correctness. – Ginseng

Guidelines from standard C++ still apply:

  1. Calling delete on an automatic variable, or one that's already been cleaned up, is still a bad idea.

  2. It's a tracking pointer to a disposed object. Dereferencing such is a bad idea. With garbage collection, the memory is kept around as long as any non-weak reference exists, so you can't access the wrong object by accident, but you still can't use this disposed object in any useful way, since its invariants probably no longer hold.

  3. Multiple destruction can only happen on managed objects when your code is written in really bad style that would have been UB in standard C++ (see 1 above and 4 below).

  4. Explicitly calling the destructor on an automatic variable, then not creating a new one in its place for the automatic destruction call to find, is still a bad idea.

In general, you can think of object lifetime as separate from memory allocation (just like standard C++ does). Garbage collection is used to manage deallocation -- so the memory is still there -- but the object is dead. Unlike standard C++, you can't go and reuse that memory for raw byte storage, because parts of the .NET runtime may assume the metadata is still valid.

Neither the garbage collector nor "stack semantics" (automatic variable syntax) use reference counting.

(Ugly details: disposing an object doesn't break the .NET runtime's own invariants concerning that object, so you can probably even still use it as a threading monitor. But that just makes an ugly hard-to-understand design, so please don't.)

Ginseng answered 3/9, 2012 at 21:14 Comment(6)
2: Not true on the CLR. The semantics of a disposed object is an object that still exists in CLR land. 3. Actually, if your code is called from C#, there are a number of common patterns that cause multiple calls to Dispose all the time. – Relinquish
@Billy: The object lifetime has ended in the C++ sense (the destructor has run). There may be an underlying CLR object still existing, but treating it as a C++ object would be bad. – Ginseng
It isn't a C++ object. ref classes are never C++ objects, and do not follow C++ rules. – Relinquish
@Billy: As much as possible, the C++ compiler enforces both the rules of the CLR and of C++ on managed types defined in C++/CLI. Only using objects in between the constructor and destructor call is just much easier to reason about anyway, so worrying about what the object is good for after destruction is a moot point. – Ginseng
Code other than C++/CLI can call ref class types. As such, it needs to follow CLR semantics, not C++ semantics. CLR semantics allow Dispose to be called multiple times; ergo a correctly written C++/CLI ref class must also allow dispose to be called multiple times. Even C++/CLI causes dispose to be called more than once on a regular basis; e.g. if there are a stream and TextReader, where the TextReader owns the stream, where they are both automatic variables, C++/CLI ends up calling dispose on the stream twice. – Relinquish
@Billy: If you declare an automatic variable and then pass it to a function that takes ownership, you violate the C++ lifetime rules, and you can no longer expect the usual guarantees of destructor called exactly once to hold. True in C++, also true in C++/CLI. – Ginseng
