What do Clang and GCC do when `delete`ing base classes with non-virtual destructors?

There is already a question asking about the "real-world" behavior of `delete`ing a pointer to a base class that lacks a virtual destructor, but that question is restricted to a very limited case (the derived class has no members with non-trivial destructors), and the accepted answer just says there's no way to know without checking the behavior of every compiler.

But that isn't actually very helpful; knowing that every compiler might behave differently doesn't tell us anything about the behavior of any particular compiler. So, what do Clang and G++ do in this case? I would assume they simply call the base-class destructor, then deallocate the memory (for the entire derived object). Is this the case?

Or, if it's not possible to determine this for all versions of GCC and Clang, how about GCC 4.9 and 5.1, and Clang 3.5 through 3.7?

Therefore answered 21/8, 2015 at 3:47 Comment(14)
The same principle applies: Holy and Sacred Undefined Behaviour gets summoned :).Covetous
What would be the point of figuring this out? It's undefined behavior, and for all you know, the behavior could change the next time you recompile, or change the order of data members, or add a new one, or just about do anything ...Kruller
@Praetorian: that's true if you simply observe the behaviour, but if you analyse the code carefully you might be able to make a more conclusive statement relevant to a specific version of the compiler. But then the next release could be completely different, and who wants an application that might break with any compiler release or patch, let alone port? Anyway, not sure why Kyle would expect anyone here to do the research for him....Ewe
@KemyLand: why do you want to know this? There are sound reasons - if you've released code with such a bug and want to know whether to rush out a patch or just fix it for the next release, but there are poor and outright bogus reasons too that people here might be able to shed light on.Ewe
@Kruller Suppose I have a program that seems to run reliably, but upon inspecting the code I notice a subtle case of UB. If I knew that a memory leak was the worst thing that could happen in the particular scenario, I'd probably just continue to run the program as needed. But "anything can happen" implies that even running a program you've run before without any noticeable ill effect might destroy your hard drive.Therefore
@TonyD StackOverflow is almost always my first stop for this kind of research; if someone already knows the answer, that's great, and everyone can win fake internet points. If I end up finding it on my own, I'll post it here and get double the fake internet points. Why the snide comment?Therefore
@Kruller Actually, Tony's example of having released code with UB is even better than my personal-use example.Therefore
Agreed, Tony's example is a good one, and there might be value to investigating the effects of a specific case, but your question is asking about how these compilers behave in general in the face of UB. That, IMHO, is much less valuable to quantify. (Kinda addressing your other question too) While nasal demons might be far fetched, attackers frequently exploit UB for privilege escalation attacks and arbitrary code execution, so anything a computer is capable of doing is actually within the realm of possibility when you have UB.Kruller
@KyleStrand: thing is this is just too specific to compiler versions, with tenuous circumstances in which someone might care, for anyone to be likely to know off the top of their head (for both compilers) or to be of general use to the community.... And you've obviously got at least some of the source, if you can fix, recompile and distribute in a reasonable timeframe, problem solved.Ewe
@Kruller I'm asking specifically about the destruction-via-pointer-to-base issue, which seems pretty specific to me. (The other question, as per Merhdad's answer, was just a misunderstanding on my part.)Therefore
The quest here is noble - I've seen many SO questions get answered because a certain cause of UB has a familiar smell that leads good programmers to the source of the problem. But what makes this question not work here, I think, is that for even one version of one compiler, an UB is not typically tested for consistency. Someone would need to prove that your delete scenario is the same for all build options, and CPU types, and optimization levels, and so on.Gooch
@DrewDormann I was thinking (wishfully) that the algorithm for generating code from the delete statement, but I guess you're right--even in very limited areas of the language, I suppose it's impossible or nearly impossible to know exactly what can happen for every compilation scenario.Therefore
I guess it's possible that a particular implementor might extend the language and make certain guarantees. However, this would be clearly documented in the compiler's docs. So, if you can't find something in the compiler doc, then it's just UBCensorious
@KyleStrand "I was thinking (wishfully) that the algorithm for generating code from the delete statement," That algorithm would have to include every transformation from the abstract tree to the final code, esp. including every optimisation step. You are essentially asking for a complete description of the most complex part of the compiler (which is obviously not answerable here). Or maybe you only want to know about the common case, with no code transformation?Handtomouth

First, the standard disclaimer: this is undefined behavior, so even with one specific compiler, changing the compiler flags, the day of the week, or the way you look at the computer could change the behavior.

The following all assumes you have some sort of at least slightly non-trivial destruction happening in your destructors (e.g., the objects delete some memory, or contain other objects that themselves delete some memory).

In the simple case (single inheritance) you typically get something roughly equivalent to static binding--that is, if you destroy a derived object via a pointer to a base object, only the base destructor is invoked, so the object isn't destroyed properly.
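
As a rough analogy for what "static binding" means here, the sketch below uses an ordinary non-virtual member function rather than a destructor, so it stays within defined behavior. The types `Shape` and `Circle` and the helper names are hypothetical and do not appear in the answer. The point is only that, without `virtual`, the function chosen depends on the static type of the pointer and not on the object it points at.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration; they do not appear in the answer.
struct Shape {
    // Non-virtual: which function runs is decided by the pointer's static type.
    std::string name() const { return "Shape"; }
};

struct Circle : Shape {
    // Hides Shape::name() instead of overriding it.
    std::string name() const { return "Circle"; }
};

// Calls name() through a Shape* that really points at a Circle.
std::string name_via_base_pointer() {
    Circle c;
    const Shape* p = &c;
    return p->name();  // statically bound to Shape::name()
}

// Small self-check that can be called from main().
void check_static_binding() {
    assert(name_via_base_pointer() == "Shape");
    assert(Circle{}.name() == "Circle");
}
```

A non-virtual destructor is typically dispatched the same way, which is why only the base destructor tends to run. Unlike this member-function case, though, the `delete` case is undefined behavior, so nothing guarantees that outcome.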

If you use multiple inheritance, and you destroy an object of derived class via the "first" base class, it'll typically be about the same as if you used single inheritance--the base class destructor will be invoked, but the derived class destructor won't be.

If you have multiple inheritance and destroy a derived object via a pointer to the second (or subsequent) base class, your program will typically crash. With multiple inheritance, you have multiple base class objects at multiple offsets in the derived object.

In the typical case, the first base class will be at the beginning of the derived object, so using the address of derived as a pointer to the first base class object works about the same as in the single inheritance case--we get the equivalent of static binding/static dispatch.

If we try this with any of the other base classes, a pointer to the derived doesn't point to an object of that base class. The pointer needs to be adjusted to point to the second (or subsequent) base class before it can be used as a pointer to that type of object at all.
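
The pointer adjustment described above can be observed without invoking undefined behavior. The following is a minimal sketch with hypothetical types `B1`, `B2` and `D` and hypothetical helper names. It assumes only that pointers convert to `std::uintptr_t` in the usual flat-address way. The exact offset is a layout detail that the standard leaves to the implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration.
struct B1 { int i = 1; };
struct B2 { int j = 2; };
struct D : B1, B2 {};

// Numeric address of a pointer, for comparison and printing.
std::uintptr_t addr(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// True if the two base subobjects of one D sit at different addresses.
bool bases_have_distinct_addresses() {
    D d;
    B1* b1 = &d;  // implicit conversion to the first base
    B2* b2 = &d;  // implicit conversion to the second base (may adjust the pointer)
    return addr(b1) != addr(b2);
}

// Byte offset of the B2 subobject inside D (layout-dependent).
std::uintptr_t second_base_offset() {
    D d;
    B2* b2 = &d;
    return addr(b2) - addr(&d);
}

// True if converting back with static_cast undoes the adjustment.
bool round_trip_restores_derived() {
    D d;
    B2* b2 = &d;
    return static_cast<D*>(b2) == &d;
}
```

Because both bases are non-empty, they cannot share an address, so at most one of them can coincide with the start of the `D` object. On common ABIs that is the first base, and `second_base_offset()` is non-zero.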

With a non-virtual destructor, what'll typically happen is that the code will basically take the address of that first base class object, do roughly the equivalent of a reinterpret_cast on it, and try to use that memory as if it were an object of the base class specified by the pointer (e.g., base2).

For example, let's assume base2 has a pointer at offset 14, and base2's destructor attempts to delete the block of memory it points at. With a non-virtual destructor, it'll probably receive a pointer to the base1 subobject--but it'll still look at offset 14 from there, try to treat that as a pointer, and pass it to delete. It could be that base1 contains a pointer at that offset, and it's actually pointing at some dynamically allocated memory, in which case this might actually appear to succeed. Then again, it could also be that it's something entirely different, and the program dies with an error message about (for example) attempting to free an invalid pointer.

It's also possible that base1 is smaller than 14 bytes in size, so this ends up actually manipulating (say) offset 4 in base2.

Bottom line: for a case like this, things get really ugly in a hurry. The very best you can hope for is that the program dies quickly and loudly.

Just for kicks, quick demo code:

#include <iostream>
#include <string>
#include <vector>

class base{ 
    char *data;
    std::string s;
    std::vector<int> v;
public:
    base() { data = new char;  v.push_back(1); s.push_back('a'); }
    ~base() { std::cout << "~base\n"; delete data; }
};

class base2 {
    char *data2;
public:
    base2() : data2(new char) {}
    ~base2() { std::cout << "~base2\n"; delete data2; }
};

class derived : public base, public base2 { 
    char *more_data;

public:
    derived() : more_data(new char) {}
    ~derived() { std::cout << "~derived\n"; delete more_data; }
};

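int main() {
    // Undefined behavior: a derived object is deleted through a pointer to
    // its second base class, whose destructor is not virtual.
    base2 *b = new derived;
    delete b;
}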

g++/Linux: Segmentation fault
clang/Linux: Segmentation fault
VC++/Windows: Popup: "foo.exe has stopped working" "A problem caused the program to stop working correctly. Please close the program."

If we change the pointer to base instead of base2, we get ~base from all the compilers (and if we derive only from one base class, and use a pointer to that base class, we get the same: only that base class' destructor runs).
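
For contrast, here is a minimal sketch of the well-defined variant, in which the base destructors are declared `virtual`. The types `Base`, `Base2` and `Derived`, the log `g_log` and the helper function are hypothetical stand-ins, not the demo classes above. Destructor calls are recorded in a vector instead of printed, so the order can be checked directly.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records destructor calls so the order can be inspected without parsing stdout.
static std::vector<std::string> g_log;

// Hypothetical types for illustration.
struct Base {
    virtual ~Base() { g_log.push_back("~Base"); }
};

struct Base2 {
    virtual ~Base2() { g_log.push_back("~Base2"); }
};

struct Derived : Base, Base2 {
    ~Derived() override { g_log.push_back("~Derived"); }
};

// Deletes a Derived through a pointer to its second base and returns the log.
std::vector<std::string> destroy_via_second_base() {
    g_log.clear();
    Base2* b = new Derived;
    delete b;  // well-defined: Base2's destructor is virtual
    return g_log;
}
```

Here the language guarantees that the most-derived destructor runs first, followed by the base destructors in reverse order of declaration, regardless of which base pointer is used for the `delete`.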

Confiscable answered 10/6, 2016 at 0:17 Comment(4)
I don't see how you'd get reinterpret_cast-like behavior. Implicitly converting a pointer to derived to a pointer to (second+) base will do the appropriate adjustment already. The problem is that said adjustment means that you'll pass the wrong pointer to the deallocation function later.Dumpish
@T.C.: Could be--it's pretty hard to give solid reasoning about what will happen when the relevant documents directly disclaim any ability to reason about what's really going on/going to happen.Confiscable
What do you mean by "single inheritance" exactly?Handtomouth
@curiousguy: single inheritance is when a class has exactly one direct base class. So, if you have class foo : public bar, that's single inheritance (even if bar in turn derives from baz). If you have something like class foo : public bar, public baz, that's multiple inheritance.Confiscable

If you delete an object without a virtual destructor, the compiler will probably assume that the deleted address is the address of the most derived object.

Unless you use a primary base class to delete the object, this won't be the case, so the compiler will call operator delete with an incorrect address.

Of course the compiler will not call the destructor of the derived class, or operator delete of the derived class (if there is one).
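
To make the "address passed to operator delete" point concrete, the sketch below shows the well-defined counterpart: with virtual destructors, the class-specific `operator delete` of the most-derived class receives the address that its `operator new` returned, even when the `delete` goes through a second base. The types `First`, `Second` and `Most`, the globals and the helper function are hypothetical. Addresses are stored as integers so that no freed pointer value is used afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Addresses seen by the class-specific allocation functions, kept as integers.
static std::uintptr_t g_allocated = 0;
static std::uintptr_t g_freed = 0;

// Hypothetical types for illustration.
struct First  { virtual ~First() = default;  int a = 0; };
struct Second { virtual ~Second() = default; int b = 0; };

struct Most : First, Second {
    static void* operator new(std::size_t size) {
        void* p = ::operator new(size);
        g_allocated = reinterpret_cast<std::uintptr_t>(p);
        return p;
    }
    static void operator delete(void* p) {
        g_freed = reinterpret_cast<std::uintptr_t>(p);
        ::operator delete(p);
    }
};

// Deletes a Most through Second* and reports whether the block handed to
// operator delete is the one operator new returned.
bool delete_via_second_base_frees_original_block() {
    g_allocated = 0;
    g_freed = 0;
    Second* s = new Most;
    delete s;  // well-defined: Second's destructor is virtual
    return g_allocated != 0 && g_freed == g_allocated;
}
```

Without the `virtual` destructors the same `delete s` would be undefined behavior. The failure mode described in this answer is that the deallocation function may then be given the adjusted `Second*` address instead of the original one.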

Handtomouth answered 27/8, 2015 at 4:50 Comment(28)
The compiler will assume the address is of the type that was deleted and not the most derived object. You might want to fix that.Kioto
@Kioto If the destructor is not virtual, the compiler assumes that the pointers given to delete is exactly the pointer returned by new, with the same type. (Not a converted pointer.) This means that the pointer points to the most derived object.Handtomouth
What exactly do you mean by "primary" base class? Is that a term from the standard? I'm not familiar with it.Therefore
@KyleStrand Basically, given struct B1 { int i; }; struct B2 { int j; }; struct D : B1, B2 { }; D *d = new D; B2 *b2 = d;, it's likely that static_cast<void*>(d) != static_cast<void*>(b2), in which case delete b2; is likely to fail badly too. This is exactly what happens with both GCC and clang in the general case, depending on the runtime environment. To be honest, I think it's a trivial case where it can't work that isn't all that interesting, but it does cover the question you asked.Envelopment
@Handtomouth .....but how would the compiler know how new was invoked? The pointer could be deleted e.g. in a different translation unit.Therefore
@KyleStrand Obviously the compiler has no way of knowing. As I wrote "the compiler assumes that the pointers given to delete is exactly the pointer returned by new, with the same type. (Not a converted pointer.)" The compiler doesn't try to guess, it just assumes that the numerical value passed to delete is exact value returned by new. It is your responsibility as a programmer to guarantee that.Handtomouth
@Handtomouth Oh--you're just talking about the numeric value of the pointer. But that's not all the information that's encoded in a pointer: the type of a pointer does affect the code generated for delete, because the destructor for the object being deleted must also be invoked. (Note that this information is obviously not encoded in the runtime data of the pointer; it's only available at compile-time.)Therefore
@KyleStrand ""primary" base class" isn't a standard term; it just means the base class subobject with the same numerical address as the derived object.Handtomouth
@Handtomouth No need to get snippy or insulting. You wrote: "...assumes that the numerical value passed to delete is exact value returned by new" (emphasis mine). But the numeric value does not change depending on the type of the pointer! You can test this by using new to create a derived type, assigning the derived pointer to a base-type pointer, casting both pointers to uintptr_t, and asserting that the integral values are equal. Then delete the pointer-to-base type; if the base-type's destructor is not virtual, the derived destructor will not be called.Therefore
@KyleStrand When was I insult? Do you even read my answers? I wrote: """primary" base class" isn't a standard term; it just means the base class subobject with the same numerical address as the derived object" "Unless you use a primary base class to delete the object, this won't be the case" !!!!!!!Handtomouth
@Handtomouth "Do you have trouble reading" is pretty insulting. You're still talking about the value of the pointer addresses. As per the example in my previous comment, that's not what determines which destructor is called.Therefore
Sep 28 '15 at 6:55 "If the destructor is not virtual, the compiler assumes that the pointers given to delete is exactly the pointer returned by new, with the same type" Jun 2 at 1:56 "As I wrote "the compiler assumes that the pointers given to delete is exactly the pointer returned by new, with the same type. (Not a converted pointer.)"" Jun 3 at 5:39 "As I wrote: "As I wrote (snip)""Handtomouth
"You're still talking about the value of the pointer addresses. As per the example in my previous comment, that's not what determines which destructor is called." At no point I have said or implied that the value of the pointer address determine which destructor is called when the destructor is non-virtual. I have said that passing correct numerical pointer value is essential. I have said that when the pointer points to a base class subobject, the numerical pointer value will be different in SOME cases. I have been very clear about that. EVERYTHING has been explained to you.Handtomouth
@Handtomouth Clearly I am not satisfied, so either (1) I am an idiot, (2) I am maliciously refusing to understand you, or (3) you are not being as clear as you think you are. You are free to assume option (1), but I'd prefer you keep that comment to yourself.Therefore
I am still trying to work out the reason why you are not satisfied. My answer may not be the best in term of getting the point across, but nothing I wrote is incorrect. I am not assuming you are an idiot, just like your brain is "stuck" which is different. You know what happens when you reread ten times your essay which contains a grammatical error and you don't see it? Who hasn't experienced that several times? I also don't think you are malicious. I clearly wrote, several times, that you must pass a pointer to delete with the exact same type that was returned by new!Handtomouth
@KyleStrand "But the numeric value does not change depending on the type of the pointer!" Actually it will change, as I said in my answer "Unless you use a primary base class". You don't even need to know what a primary base is to understand that I am saying the address of a base class subobject will sometimes be the same and sometimes different. That was an essential point in my answer which you apparently feel you need to dismiss or ignore or contradict. Please tell me which it is.Handtomouth
@KyleStrand "Clearly I am not satisfied" Clearly you are not satisfied and I am not satisfied knowing you are not satisfied, and maybe: either (1) I am an idiot, (2) I am maliciously refusing to understand you, or (3) you are not being as clear as you think you are.Handtomouth
I think perhaps when you say "the compiler will probably assume that the deleted address is the address of the most derived object", you mean something different than what meneldal and I take that expression to mean. It sounds like you're saying that the compiler knows (based on the address) what the "most derived object" for any given pointer is, even if the pointer type is a base class, and that the compiler therefore invokes the most derived object's destructor.Therefore
Obviously this would mean virtual destructors are unnecessary in a single-inheritance chain as long as the implementation guarantees that the addresses of all derived-class objects are always the address of the top-level base-class, so that all sub-objects share the same root address. (I'm not sure if this is guaranteed by the standard, but as shown by my example above, this is the case for GCC in a simple example with a single base class and a single derived class.)Therefore
But in fact, even when the address of the pointer passed to delete is the address of the most-derived object, but the pointer type is the base class type, the wrong destructor is invoked, again as per my simple example above. So the confusion arises because your comment about addresses does not in fact address the crucial issue of the type of the pointer passed to delete.Therefore
@KyleStrand You are just not reading what I wrote, rewrote, and rerewrote... the compiler assumes, as in it is taking something for granted. "the compiler assumes that the pointers given to delete is exactly the pointer returned by new, with the same type. (Not a converted pointer.)" Read slowly with the fingerHandtomouth
@Handtomouth I still think that the word "assumes" is somewhat poorly chosen, but that's not the main issue. The issue is that your answer, as it stands, is wrong, and adding extra comments doesn't change that. First, your answer still says that the value of the pointer is the main issue, when in fact the main issue is the type of the pointer.Therefore
Second, your answer states that "Unless you use a primary base class to delete the object, [the subobject addresses will differ], so the compiler will call operator delete with an incorrect address." This simply isn't true for single-inheritance, as my example (again!) shows.Therefore
Note that the sentence you (re-)quoted is from a comment, not from your answer.Therefore
I have (minor) quibbles with the quoted sentence, though, too. First, you're describing the conditions under which the compiler behavior would be correct, but the word "assumes" seems to imply some sort of anthropomorphized, semantically-meaningful "thought process" on the compiler's part. This isn't a big deal, though--it's basically just a metaphor.Therefore
Second, and more importantly, in order to get the correct behavior, you don't need "exactly the pointer returned by new" -- you need the same value, and the same type. So the pointer can be copied and converted, and arithmetically modified, before deleteing it, and the delete operation will still behave correctly as long as the arithmetic operations ultimately restore the original value and that value is then cast to the correct type.Therefore
@KyleStrand "First, your answer still says that the value of the pointer is the main issue" Well, "main" is a very subjective term. The value of the pointer is a potential issue: using an incorrect pointer value will cause operator delete to be called on an address not returned by operator new, which is nearly always fatal. Even if the program doesn't crash immediately, there is no way to recover after that. It seems a pretty serious issue to me and this is the reason I focused on it. OTOH not calling the correct destructor isn't always fatal.Handtomouth
Let us continue this discussion in chat.Therefore
