Why is it undefined behavior to delete[] an array of derived objects via a base pointer?
I found the following snippet in the C++03 Standard under 5.3.5 [expr.delete] p3:

In the first alternative (delete object), if the static type of the object to be deleted is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined. In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.


Quick review on static and dynamic types:

struct B{ virtual ~B(){} };
struct D : B{};

B* p = new D();

Static type of p is B*, while the dynamic type of *p is D, 1.3.7 [defns.dynamic.type]:

[Example: if a pointer p whose static type is “pointer to class B” is pointing to an object of class D, derived from B, the dynamic type of the expression *p is “D.”]


Now, looking at the quote at the top again, this would mean, if I got that right, that the following code invokes undefined behaviour regardless of the presence of a virtual destructor:

struct B{ virtual ~B(){} };
struct D : B{};

B* p = new D[20];
delete [] p; // undefined behaviour here

Did I misunderstand the wording in the standard somehow? Did I overlook something? Why does the standard specify this as undefined behaviour?

Dissipated answered 30/5, 2011 at 2:35 Comment(9)
If you have an array of elements of the derived class and assign that to a pointer to an array of the base class, you no longer have any info to tell you what the element size is.Esque
@Daniel: I knew something was off with my reasoning, but not that it was something this bad... I somehow associated (vector<B*> v(N)) == (B* p = new B[N]). Without that, this question now makes no sense whatsoever. :|Dissipated
@Xeo: Based on your title, I was going to answer that "deleting an array of pointers is well-defined, but it doesn't delete the objects". However I see you've already realized that. Strange that none of the existing answers caught that.Ameeameer
@Ben: I don't even get how I got 7 upvotes. I'm currently considering just deleting this question altogether. Thoughts?Dissipated
Well, with the code in your question, and the answers you got, I think it needs to be retitled... "Why is it undefined behavior to delete[] an array of derived objects via a base pointer?"Ameeameer
@Ben: Yay, my hero! Thanks for rescuing this question. :)Dissipated
@Xeo: You're welcome. Are you currently in Berlin as your profile states? It might be just a little too late at night to ask a coherent question ;)Ameeameer
@Ben: Not just currently but all the time. :) And yeah, this sleep deprivedness (is that even a word?) might be the cause of this.Dissipated
@Xeo: "sleep deprivation"... but now we're getting into english.se territory :)Ameeameer
H
35

Base* p = new Base[n] creates an n-sized array of Base elements, of which p then points to the first element. Base* p = new Derived[n] however, creates an n-sized array of Derived elements. p then points to the Base subobject of the first element. p does not however refer to the first element of the array, which is what a valid delete[] p expression requires.

Of course it would be possible to mandate (and then implement) that delete [] p Does The Right Thing™ in this case. But what would it take? An implementation would have to take care to somehow retrieve the element type of the array, and then morally dynamic_cast p to this type. Then it's a matter of doing a plain delete[] like we already do.

The problem is that this would be needed for every array of polymorphic element type, regardless of whether the polymorphism is used or not. In my opinion, this doesn't fit with the C++ philosophy of not paying for what you don't use. But worse: a polymorphic-enabled delete[] p is simply useless because p is almost useless in your question. p is a pointer to a subobject of an element and no more; it's otherwise completely unrelated to the array. You certainly can't do p[i] (for i > 0) with it. So it's not unreasonable that delete[] p doesn't work.

To sum up:

  • arrays already have plenty of legitimate uses. Not allowing arrays to behave polymorphically (either as a whole or only for delete[]) means that arrays with a polymorphic element type are not penalized for those legitimate uses, which is in line with the philosophy of C++.

  • if on the other hand an array with polymorphic behaviour is needed, it's possible to implement one in terms of what we have already.
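One way to build such a polymorphic "array" from existing parts is a container of owning base pointers. The following is only a minimal sketch, assuming C++11 or later; Base, Derived, the counter and destroy_polymorphic_array are hypothetical names used for illustration, not anything from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical counter, only here to make destruction observable.
static int g_derived_destroyed = 0;

struct Base {
    virtual ~Base() {}
};

struct Derived : Base {
    int payload = 0;
    ~Derived() override { ++g_derived_destroyed; }
};

// Creates n Derived objects owned through Base pointers, destroys them,
// and returns how many Derived destructors ran.
int destroy_polymorphic_array(std::size_t n) {
    g_derived_destroyed = 0;
    {
        std::vector<std::unique_ptr<Base>> items;
        items.reserve(n);
        for (std::size_t i = 0; i < n; ++i) {
            items.push_back(std::unique_ptr<Base>(new Derived()));
        }
        assert(items.size() == n);
        // Leaving this scope deletes each element with a single-object
        // delete through Base*, which is well-defined because Base has
        // a virtual destructor. No delete[] through a base pointer occurs.
    }
    return g_derived_destroyed;
}
```

Calling destroy_polymorphic_array(20) from main should run the Derived destructor once per element. The price is one allocation per element and an extra indirection, which is exactly the overhead plain arrays avoid.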

Horology answered 30/5, 2011 at 3:17 Comment(13)
Neither storing the type nor a dynamic type check would be needed. Simply storing the stride alongside the number of elements (which the standard already requires the implementation to remember) is sufficient to find the pointer to each element, the virtual destructor takes care of the rest.Ameeameer
@BenVoigt Hence the morally as in moral equivalent! I was talking in terms of semantics, not implementation. (And I can't mention dynamic_cast without mentioning a type to cast to.)Horology
@Luc: I guess I'm just seeing this as more of a reinterpret_cast than a dynamic_cast.Ameeameer
@Ben I wanted to express this hypothetical delete[] in terms of the regular delete[], which meant finding the start of the array even in the situation of multiple inheritance. It's true that as a consequence it differs from delete in that I seemingly sidestep the virtualness of the base destructor. I could downcast of course. Still I don't think there's a more correct answer to this question: do you destroy the elements of a polymorphic array from the dynamic type, or from the subobjects handed to delete[]?Horology
@Luc: new[] used the dynamic type, and stored the element count for delete[]'s later use. It could also store the element size (same for all). That lets you calculate a (base) pointer to each element, at which point you can call the destructor virtually, just as the compiler does when you use delete ptr_base on a single derived object. The error is as @sth said: Pointer arithmetic on a pointer-to-base uses sizeof (Base) to calculate addresses of subsequent elements. By storing the stride this would be cured. Of course, subscripting through the base pointer would still be broken.Ameeameer
@Luc: Something like: template<typename T> void array_delete_helper(T* target, size_t count, size_t stride) { while (count-- > 0) reinterpret_cast<T*>(reinterpret_cast<intptr_t>(target) + count * stride)->~T(); } Where T is the static (base) type.Ameeameer
@Ben Sure. But where do you find the stored count and the stored stride (and even in my case, the element type)? That's exactly why I wanted to avoid discussing the implementation details. And hence why my answer was: "the compiler does magic, as if a dynamic_cast was performed and the original array was retrieved and this was deleted". And please don't answer this comment to mention e.g. it's possible to use a lookup table. That's not the point. I instead encourage you to post an answer to provide your implementation.Horology
@Luc: The standard allows new[] to store this information. Section [expr.new] says "A new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array." And it already has to store the count, to be retrieved by delete[]. A compliant implementation could store the stride as well. Oh, now I see the problem. (to be continued)Ameeameer
delete[] can't locate the metadata if the base subobject (to which it is given a pointer) isn't located at offset zero within the derived object. Still, a conforming implementation could be written to properly delete an array using a base pointer, in the common case of single (and non-virtual) inheritance.Ameeameer
@Luc: OTOH, the derived class pointer has to be passed to the derived class destructor in the non-array delete case, so the v-table (or similar mechanism) has to provide a way to find the derived object. At which point finding the metadata becomes trivial.Ameeameer
@Ben Voigt: actually, delete[] could locate the meta-data as delete does. dynamic_cast<void*>(base) returns the address of the complete object, from there it's easy enough. And having the stride would also allow polymorphic arrays... but we would lose C-compatibility. Perhaps that is the issue.Abrupt
@Matthieu: The standard clause I cited prohibits new from prepending metadata for the non-array case.Ameeameer
@LucDanton: I didn't understand this "p then points to the Base subobject of the first element.". Will you please explain it more clearly? Thanks.Deipnosophist
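To make the "Base subobject" wording and the dynamic_cast<void*> remark above more concrete, here is a small sketch. It uses multiple inheritance because with single inheritance the two addresses usually coincide on common implementations. Other, Base, Derived and both helper functions are hypothetical names, and the non-zero offset is typical ABI behaviour rather than something the standard guarantees.

```cpp
#include <cassert>

struct Other { virtual ~Other() {} int o = 0; };
struct Base  { virtual ~Base() {}  int b = 0; };

// Base is the second base class, so its subobject typically does not
// start at the same address as the complete Derived object.
struct Derived : Other, Base { int d = 0; };

// True if the Base* holds a different address than the complete object.
// Typical for this layout, but implementation-dependent.
bool base_subobject_is_offset(Derived& obj) {
    Base* p = &obj;  // implicit conversion adjusts the pointer
    return static_cast<void*>(p) != static_cast<void*>(&obj);
}

// dynamic_cast<void*> on a pointer to a polymorphic type yields a
// pointer to the most derived object, whatever the subobject offset.
bool dynamic_cast_recovers_complete_object(Derived& obj) {
    Base* p = &obj;
    return dynamic_cast<void*>(p) == static_cast<void*>(&obj);
}
```

So a Base* obtained from a Derived object designates the Base part inside that object, not necessarily the object's first byte, and the complete object can still be recovered at run time.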
H
16

It's wrong to treat an array-of-derived as an array-of-base, not only when deleting items. For example even just accessing the elements will usually cause disaster:

B *b = new D[10];
b[5].foo();

b[5] will use the size of B to calculate which memory location to access, and if B and D have different sizes, this will not lead to the intended results.
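The stride mismatch can be shown without ever evaluating b[5], which would itself be undefined. This is a minimal sketch, assuming C++11 and a D that is larger than B; the members x and extra and the helper functions are made up for illustration.

```cpp
#include <cstddef>

struct B { virtual ~B() {} int x; };
struct D : B { int extra[4]; };  // hypothetical members that make D larger than B

// Byte offset from element 0 to where element i really lives in a D array.
constexpr std::size_t actual_offset(std::size_t i) { return i * sizeof(D); }

// Byte offset that indexing through a B* would use for element i.
constexpr std::size_t assumed_offset(std::size_t i) { return i * sizeof(B); }

// True when the two strides disagree, i.e. when b[i] lands in the wrong place.
constexpr bool strides_differ() { return sizeof(D) != sizeof(B); }
```

With these types, assumed_offset(5) is smaller than actual_offset(5), so the address computed for b[5] does not correspond to the start of the sixth D element.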

Just like a std::vector<D> can't be converted to a std::vector<B>, a pointer to D[] shouldn't be convertible to a B*, but for historic reasons it compiles anyway. If a std::vector were used instead, it would produce a compile time error.
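The difference between the two conversions can be checked at compile time. A small sketch, assuming C++11 and <type_traits>; the constant names are hypothetical.

```cpp
#include <type_traits>
#include <vector>

struct B { virtual ~B() {} };
struct D : B {};

// The element pointer converts implicitly; this is what lets
// `B* p = new D[n]` compile even though p is unusable as an array.
constexpr bool kPointerConverts = std::is_convertible<D*, B*>::value;

// The containers are unrelated types, so the equivalent mistake
// with std::vector is rejected by the compiler.
constexpr bool kVectorConverts =
    std::is_convertible<std::vector<D>, std::vector<B>>::value;

static_assert(kPointerConverts, "D* converts to B*");
static_assert(!kVectorConverts, "vector<D> does not convert to vector<B>");
```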

This is also explained in the C++ FAQ Lite answer on this topic.

So delete causes undefined behavior in this case because it's already wrong to treat an array in this way, even though the type system can't catch the error.

Hemophilia answered 30/5, 2011 at 3:26 Comment(0)
C
1

IMHO this has to do with the limitations of arrays when dealing with constructors and destructors. Note that when new[] is used, the compiler can only invoke the default constructor for each element. In the same way, when delete[] is called, the compiler might look only for the destructor of the pointer's static type.

Now in the case of a virtual destructor, the Derived class destructor should be called first, followed by the Base class destructor. Since for arrays the compiler might only see the static type of the pointer (here Base), it might end up calling just the Base destructor, which is UB.

Having said that, the behaviour is undefined by the standard, but that doesn't mean every compiler visibly misbehaves; gcc, for example, calls the destructors in the proper order.

Crusty answered 30/5, 2011 at 3:2 Comment(0)
I
1

Just to add to the excellent answer of sth - I have written a short example to illustrate this issue with different offsets.

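Note that if you comment out the m_c member of the Derived class, Base and Derived have the same size and the delete operation will appear to work well, although the behavior is still undefined according to the standard.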

Cheers,

Guy.

#include <iostream>
using namespace std;

class Base 
{

    public:
        Base(int a, int b)
        : m_a(a)
        , m_b(b)    
        {
           cout << "Base::Base - setting m_a:" << m_a << " m_b:" << m_b << endl;
        }

        virtual ~Base()
        {
            cout << "Base::~Base" << endl;
        }

        protected:
            int m_a;
            int m_b;
};


class Derived : public Base
{
    public:
    Derived() 
    : Base(1, 2) , m_c(3)   
    {

    }

    virtual ~Derived()
    {
        cout << "Derived::~Derived" << endl;
    }

    private:    
    int m_c;
};

int main(int argc, char** argv)
{
    // create an array of Derived objects and point to it with a Base pointer
    Base* pArr = new Derived [3];

    // now delete the array using the "usual" array delete notation
    // (undefined behavior: static element type is Base, dynamic type is Derived)
    delete [] pArr;

    return 0;
}
Iwo answered 16/10, 2017 at 7:30 Comment(0)
J
0

I think it all comes down to the zero-overhead principle, i.e. the language doesn't allow storing information about the dynamic type of elements of the array.

Jilleen answered 30/5, 2011 at 2:57 Comment(2)
Actually, it is allowed. Standard section [expr.new] says "A new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array." This is intended to store the count of elements, but there's no particular prohibition on additionally storing some sort of type information, such as the element size.Ameeameer
Storing information about the types of each element of an array is not allowed. The standard allows implementers of the standard to request more than N*sizeof(T) bytes because some implementations store the array size as a part of the allocated memory. The array size has to be stored somewhere because the system has to know how many objects need to be destructed upon a call to delete[].Addend
