Missed Optimization: std::vector<T>::pop_back() not qualifying destructor call?

Asked 6/8, 2022 at 6:48 Answered 18/8, 2022 at 15:20

In an std::vector<T> the vector owns the allocated storage and it constructs Ts and destructs Ts. Regardless of T's class hierarchy, std::vector<T> knows that it has only created a T and thus when .pop_back() is called it only has to destroy a T (not some derived class of T). Take the following code:

#include <vector>

struct Bar {
    virtual ~Bar() noexcept = default;
};

struct FooOpen : Bar {
    int a;
};

struct FooFinal final : Bar {
    int a;
};

void popEm(std::vector<FooOpen>& v) {
    v.pop_back();
}

void popEm(std::vector<FooFinal>& v) {
    v.pop_back();
}

https://godbolt.org/z/G5ceGe6rq

The PopEm for FooFinal simply just reduces the vector's size by 1 (element). This makes sense. But PopEm for FooOpen calls the virtual destructor that the class got by extending Bar. Given that FooOpen is not final, if a normal delete fooOpen was called on a FooOpen* pointer, it would need to do the virtual destructor, but in the case of std::vector it knows that it only made a FooOpen and no derived class of it was constructed. Therefore, couldn't std::vector<FooOpen> treat the class as final and omit the call to the virtual destructor on the pop_back()?

Gurl answered 6/8, 2022 at 6:48 Comment(11)

No the compiler doen'st know the vector only will contain FooOpen's. Maybe it will be linking with some other component later that does insert a class derived from FooOpen. So your assumption is only valid for this snippet of code. With FooFinal the optimization can be done. – Saxena 6/8, 2022 at 7:27

@PepijnKramer How can it insert a class derived from FooOpen? The only possibility I can see is that a user may placement-new a derived object into the storage of a FooOpen element, which would at the very least depend on a lot of unspecified behavior, but I feel like should be library undefined behavior in the first place. – Coquet 6/8, 2022 at 7:35

@PepijnKramer I think the point is that even if we have a million classes that inherit fooopen, none of them can ever be stored in the vector. The vector always constructs and destructs fooopens, nothing else. To make a guess at the question: the vector knows that this optimization could be done, but that doesn't mean the compiler knows it. Some complex code analysis would have to be done for the compiler to figure this out. I don't have much knowledge about the optimization techniques in use, but I imagine you'd need some special treatment for vector to make this happen. – Kidd 6/8, 2022 at 7:39

@Kidd No complex analysis would be required. The specification says that std::allocator_traits::destroy is used to destroy the element, which for std::allocator as allocator simply means a (unqualified) destructor call. The standard library can detect and special case the container if std::allocator is used (they already do that for optimization of trivially copyable types) and then always use a qualified destructor call instead of allocator_traits::destroy, which will enforce a static dispatch even if the class has a virtual destructor. – Coquet 6/8, 2022 at 7:43

Are you sure this is anything to do with std::vector? What happens if you simply do { volatile FooOpen x; }? – Jobber 6/8, 2022 at 8:53

@Kidd You're absolutely right, I missed that completely. – Saxena 6/8, 2022 at 10:27

I was wondering if other things would have the same problem, for example: ``` void rst(std::optional<FooOpen>& o) { o.reset(); } void rst(std::optional<FooFinal>& o) { o.reset(); } ``` But that generates the same machine code on compiler explorer. So std::optional knows that it is just a FooOpen and doesn't bother calling the virtual dtor. Overall, I think in the vector pop code, it needs to do item->FooOpen::~FooOpen() instead of item->~FooOpen() and thus it can call the more optimized dtor. – Gurl 6/8, 2022 at 11:2

There is no need to treat FooOpen as final, since a std::vector<FooOpen> can never contain an instance of a class derived from FooOpen - any add/copy of an instance of a derived class into the vector (e.g. by push_back()) will slice the object. Destroying each element will therefore call destructors of FooOpen then its bases (in that order). It doesn't even matter if the base destructor is virtual (since no object is destroyed via a pointer to base). How the compiler optimises those destructor calls (when it has visibility of them) is then a quality of implementation concern – Cynth 6/8, 2022 at 11:12

@UriRaz No, if you try to insert a derived class into the vector, you will just store a sliced copy of it and consequently also only FooOpen's destructor will be called. That is the whole point. It is impossible to store any other type but exactly FooOpen in the vector. The vector interface simply does not allow anything else. – Coquet 6/8, 2022 at 13:45

@HarrisonMetzger What looks strange is msvc and clang show similar results. I checked libstdc++ code but couldn't find any clue. I wrote my own vector imitation and it looks the same. So it's either some missed optimisation occurring in 3 main compilers or the standard prevents it for some reason. Anyway it's good to know final can make such a difference. – Harangue 15/8, 2022 at 21:47

Don't be shy about reporting this to gcc's bugzilla and opening an issue in llvm's github. – Decency 18/8, 2022 at 19:54

Long story short - compiler doesn't have enough context information to deduce it https://godbolt.org/z/roq7sYdvT

Boring part:

The results are similar for all 3: msvc, clang, and gcc, so I guess the problem is general. I analysed the libstdc++ code just to find pop_back() runs like this:

void pop_back() // a bit more convoluted but boils-down to this
{
    --back;
    back->~T();
}

Not any surprise. It's like in C++ textbooks. But it shows the problem - virtual call to a destructor from a pointer. What we're looking for is the 'devirtualisation' technique described here: Is final used for optimisation in C++ - it states devirtualisation is 'as-if' behaviour, so it looks like it is open for optimisation if the compiler has enough information to do it.

My opinion:

I meddled with the code a little and i think optimisation doesn't happen because the compiler cannot deduce the only objects pointed by "back" are FooOpen instances. We - humans - know it because we analyse the entire class, and see the overall concept of storing the elements in a vector. We know the pointer must point to FooOpen instance only, but compiler fails to see it - it only sees a pointer which can point anywhere (vector allocates uninitialized chunk of memory and its interpretation is a part of vector's logic, also the pointer is modified outside the scope of pop_back()). Without knowing the entire concept of vector<> i don't think of how it can be deduced (without analysing the entire class) that it won't point to any descendant of FooOpen which can be defined in other translation units.

FooFinal doesn't have this problem because it already guarantees no other class can inherit from it so devirtualisation is safe for objects pointed by FooFinal* or FooFinal&.

Update I made several findings which may be useful:

https://godbolt.org/z/3a1bvax4o), devirtualisation can occur for non-final classes as long as there is no pointer arithmetic involved.
https://godbolt.org/z/xTdshfK7v std::array performs devirtualisation on non-final classes. std::vector fails to do it even if it is constructed and destroyed in the same scope.
https://godbolt.org/z/GvoaKc9Kz devirtualisation can be enabled using wrapper.
https://godbolt.org/z/bTosvG658 destructor devirtualisation can be enabled with allocator. Bit hacky, but is transparent to the user. Briefly tested.

Harangue answered 16/8, 2022 at 0:14 Comment(1)

Good analysis. This seems a bit silly, because at this point T is a complete type, so in case of FooFinal optimisation is applied, but on the other hand, even though the whole implementation of std::vector is also visible to the compiler, it does not understand that only objects of type T (and not derived from T) are constructed. – Tine 18/8, 2022 at 10:5

Yes, this is a missed optimisation.

Remember that a compiler is a software project, where features have to be written to exist. It may be that the relative overhead of virtual destruction in cases like this is low enough that adding this in hasn't been a priority for the gcc team so far.

It is an open-source project, so you could submit a patch that adds this in.

Dian answered 15/8, 2022 at 11:3 Comment(7)

Additionally the low priority can also be that they don't see enough use-cases to justify it. Having an object with a virtual destructor directly in a vector seems like a code smell - I would normally expect some kind of pointer-wrapper in that case, or that the destructor shouldn't be virtual. Obviously there can be exceptions, but perhaps not sufficiently many. – Monroy 18/8, 2022 at 9:4

@HansOlsson It's a smell to have the bottom of your inheritance hierarchy in a vector, but every class in that hierarchy has a virtual destructor. Unless you want to zealously enforce marking as much as possible final it seems fine for vector<FooOpen> to exist – Dian 18/8, 2022 at 9:10

@Dian Would you care to explain why exactly it is a smell to have the bottom of hierarchy in a vector? – Tine 18/8, 2022 at 10:8

@KrzysiekKarbowiak for the reason Hans suggests: you have something intended to be polymorphic, inserting into the vector will slice. – Dian 18/8, 2022 at 10:22

@Dian If it is bottom of hierarchy it cannot slice. And in case of slicing, I would call it a serious programmer error, not a code-smell. – Tine 18/8, 2022 at 10:35

@KrzysiekKarbowiak by bottom I mean the base class, not the most-derived – Dian 18/8, 2022 at 10:54

@Dian Doh. In that case I agree. – Tine 18/8, 2022 at 11:48

It feels a lot like § 11.4.7 (14) gives some insight into this. As of latest working draft (N4910 Post-Winter 2022 C++ working draft, Mar. 2022):

After executing the body of the destructor and destroying any objects with automatic storage duration allocated within the body, a destructor for class X calls the destructors for X’s direct non-variant non-static data members, the destructors for X’s non-virtual direct base classes and, if X is the most derived class (11.9.3), its destructor calls the destructors for X’s virtual base classes. All destructors are called as if they were referenced with a qualified name, that is, ignoring any possible virtual overriding destructors in more derived classes. Bases and members are destroyed in the reverse order of the completion of their constructor (see 11.9.3). [Note 4 : A return statement (8.7.4) in a destructor might not directly return to the caller; before transferring control to the caller, the destructors for the members and bases are called. — end note] Destructors for elements of an array are called in reverse order of their construction (see 11.9).

Also interesting for this topic, § 11.4.6, (17):

In an explicit destructor call, the destructor is specified by a ~ followed by a type-name or decltype-specifier that denotes the destructor’s class type. The invocation of a destructor is subject to the usual rules for member functions (11.4.2); that is, if the object is not of the destructor’s class type and not of a class derived from the destructor’s class type (including when the destructor is invoked via a null pointer value), the program has undefined behavior.

So, as far as the standard cares, the invocation of a destructor is subject to the usual rules for member functions.

This, to me, sounds a lot like destructor calls do so much that compilers are likely unable to determine, at compile-time, that a destructor call does "nothing" - as it also calls destructors of members, and std::vector doesn't know this.

Fremantle answered 18/8, 2022 at 15:20 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags