Why does the delete[] syntax exist in C++?
Every time somebody asks a question about delete[] on here, there is always a pretty general "that's how C++ does it, use delete[]" kind of response. Coming from a vanilla C background, what I don't understand is why there needs to be a different invocation at all.

With malloc()/free() your options are to get a pointer to a contiguous block of memory and to free a block of contiguous memory. Something in implementation land comes along and knows what size the block you allocated was based on the base address, for when you have to free it.

There is no function free_array(). I've seen some crazy theories on other questions tangentially related to this, such as the claim that calling delete ptr will only free the first element of the array, not the whole array. Or, more correctly, that the behavior is simply not defined. And sure... if this were the first version of C++, that weird design choice would make sense. But why, with $PRESENT_YEAR's standard of C++, has it not been overloaded???

It seems the only extra bit C++ adds is going through the array and calling destructors, and I think maybe this is the crux of it: it literally uses a separate keyword to save us a single runtime length lookup (or a nullptr at the end of the list), in exchange for torturing every new C++ programmer, or any programmer who had a fuzzy day and forgot that there is a different reserved word.

Can someone please clarify once and for all if there is a reason besides "that's what the standard says and nobody questions it"?

Collbaith answered 17/5, 2021 at 23:34 Comment(2)
If you want to test your memory allocation and freeing to see if those crazy theories are correct or not, you can use Valgrind to see what is actually going on. I suspect that overloading delete has more issues than described so far in answers, but I don't have the expertise.Meld
Related question: How does delete[] know it's an array?, and particularly note this answer.Blockus
Objects in C++ often have destructors that need to run at the end of their lifetime. delete[] makes sure the destructors of each element of the array are called. But doing this has unspecified overhead, while delete does not. This is why there are two forms of delete expressions. One for arrays, which pays the overhead and one for single objects which does not.

In order to only have one version, an implementation would need a mechanism for tracking extra information about every pointer. But one of the founding principles of C++ is that the user shouldn't be forced to pay a cost that they don't absolutely have to.

Always delete what you new and always delete[] what you new[]. But in modern C++, new and new[] are generally not used anymore. Use std::make_unique, std::make_shared, std::vector or other more expressive and safer alternatives.

Cletus answered 17/5, 2021 at 23:38 Comment(24)
Wow that was a quick response, thanks for the hints on allocation functions. It is surprising how often the answer in C++ is "don't use that keyword", use std::someWeirdFunctionIntroducedInC++>=11()Collbaith
@Collbaith C++ gives you the tools to work as close to the hardware as it can. But those tools are usually powerful and blunt, making them dangerous and hard to use effectively. So it also provides tools via the standard library that are slightly farther removed from the hardware, but very safe and easy. This is why you learn about so many features, but are told not to use them. Because unless you are doing something very unique or strange, those low level tools are not useful. The more convenient features are generally just fine.Laidlaw
@Collbaith You are correct that most of the time, if there is a neat standard library feature that replaces a built-in mechanism, it comes from C++11 or later. C++11 basically revolutionized the language, allowing for standard library features that were previously not possible to implement. The difference between C++11 and previous versions is so significant that they can basically be thought of as two different languages. When learning C++, take care to distinguish education material targeting C++03 and earlier from material targeting C++11 and later.Laidlaw
@awiebe, also note that the existence of lower level mechanisms like new allows most of the standard library (and other libraries) to be written in pure C++ (some parts may need compiler support). So the advice could also be "only use these to build higher level abstractions".Michaelmas
This answer doesn't seem right. Both delete and delete[] call destructors.Major
@Major I reworded it to make it clearer that delete[] potentially has extra overhead related to destroying each array element.Laidlaw
Avoiding a trivial loop in the one-element case seems like a curious microoptimization, but maybe it made sense in the early days, when computers were slower and C++ was closer in spirit to C.Major
@Major I think the concern was more about memory overhead rather than speed. The problem is made more apparent with placement new, where portable placement new of an array is currently not possible due to this potential overhead. See Can placement new for arrays be used in a portable way?. You may need to pass more storage than the elements would need, but it is not possible to ask the implementation how much extra storage it will need.Laidlaw
@FrançoisAndrieux: Choice-of-words-nitpick "...those tools are usually powerful and blunt...": I actually see them as super sharp, surgical tools: You can get exactly what you want, how you want it. But stitching up or cleaning up surgical procedures requires equivalent skills and materials, a band aid won't do.Churchman
@WillemvanRumpt You're probably right, the surgery analogy is probably better. I chose blunt because you can do a job with a blunted tool, but you'll probably need to put a lot more effort into the task.Laidlaw
It's not just necessary for destructors, but also for freeing memory on systems where the size of the memory block is not known to the allocator and must be specified when freeing. For example, on AmigaOS it is legal to do char *p = AllocMem(64); FreeMem(p + 16, 32);, this leaves you with two allocations of 16 bytes each, at p and p+48. On AmigaOS, delete o; calls FreeMem(o, sizeof *o);, so using non-array deletion on an array will only free the first element.Monopoly
The fact that delete pointer and delete [] pointer use different deallocation-functions has nothing to do with the first having to deal with potentially polymorphic delete or the second having to recover the number of elements to delete. By the time you get to the deallocation-function, that is already all done with. I'm not aware of any rational reason for them using a separate set of allocation- and deallocation-functions.Roberson
@Roberson I'm not sure enough about why there are different deallocation functions for delete and delete[]; it isn't clear to me whether it would be impossible to have a single version that could handle both cases without overhead. I'll take that part of the question back out of my answer. The part about destructors should be enough to justify the difference between delete and delete[].Laidlaw
@Barmar: You're right about the “curious microoptimization”. AFAICT, it would be totally valid standards-wise for a C++ compiler to implement new X the same as new X[1], and delete the same as delete[], thus forgiving inadvertent use of the wrong delete operator. But that means “wasting” memory storing the array length.Swab
@Swab The standard requires that a non-array new expression creating a T (new T;) is required to request exactly sizeof(T) bytes from the allocation function. Because of this rule, you can only implement new T; as new T[1]; if new expression with an array type also has no array allocation overhead on that platform. See : "That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array".Laidlaw
@Roberson On second thought, there is another constraint. Because it is possible to replace the operator new separately from operator new[] even that may not be possible.Laidlaw
@FrançoisAndrieux Yes. Having those two separate functions is very inconvenient there.Roberson
Yeah, once they made that original design decision, it had ramifications that rippled through the spec.Major
Do note that sometimes you have to pair new T() with delete[]. "scalar-looking new" is perfectly capable of performing array allocation, if the actual type is an array.Stalemate
The answer is often "use some weird, probably-from-C++11 function", @awiebe, because a noticeable portion of the standard library is dedicated to providing wrappers around core language features to let the compiler do the heavy lifting for you whenever you don't need to optimise something to the absolute limit. It trades a potential slight drop in speed and/or increase in executable size for readable code, lessened bookkeeping on your end, and better guarantees of safety. For example, std::vector is essentially just a pretty wrapper that manages new[] and delete[] for you, with extras.Myo
@FrançoisAndrieux: C++ gives you the tools to work as close to the hardware as it can. - except when the committee decided not to. Note that C++ new/delete lack any kind of C realloc mechanism (or try_realloc for non-trivially-copyable) that can extend an existing allocation to avoid copying when possible. Most real OSes can do that, and if not a trivial implementation is just new/copy/delete, or return false for try_realloc. This makes a large std::vector waste a lot of time, memory bandwidth, and page faults, copying stuff around as it grows, if you don't .reserve ahead of time.Ten
Bjarne long ago said [] was a mistake.Overskirt
@BenVoigt Is there a T-shirt "Keep C++ weird"?Pustulate
@philipxy: Citation needed.Stalemate
Basically, malloc and free allocate memory, and new and delete create and destroy objects. So you have to know what the objects are.

To elaborate on the unspecified overhead François Andrieux's answer mentions, you can see my answer on this question, in which I examined what a specific implementation does (Visual C++ 2013, 32-bit). Other implementations may or may not do a similar thing.

When new[] was used with an array of objects with a non-trivial destructor, what it did was allocate 4 extra bytes and return the pointer shifted 4 bytes ahead. So when delete[] wants to know how many objects there are, it takes the pointer, shifts it 4 bytes back, and treats the number stored at that address as the number of objects. It then calls the destructor on each object (the size of each object is known from the type of the pointer passed). Finally, in order to release the exact address, it passes the address that was 4 bytes before the passed address to the deallocation function.

On this implementation, passing an array allocated with new[] to a regular delete results in calling a single destructor, of the first element, followed by passing the wrong address to the deallocation function, corrupting the heap. Don't do it!

Frae answered 18/5, 2021 at 12:33 Comment(0)
Something not mentioned in the other (all good) answers is that the root cause of this is that arrays - inherited from C - have never been a "first-class" thing in C++.

They have primitive C semantics rather than C++ semantics, and therefore lack the C++ compiler and runtime support that would let you (or the runtime system) do useful things with pointers to them.

In fact, they're so unsupported by C++ that a pointer to an array of things looks just like a pointer to a single thing. That, in particular, would not happen if arrays were proper parts of the language - even as part of a library, like string or vector.

This wart on the C++ language happened because of this heritage from C. And it remains part of the language - even though we now have std::array for fixed-length arrays and (have always had) std::vector for variable-length arrays - largely for purposes of compatibility: Being able to call out from C++ to operating system APIs and to libraries written in other languages using C-language interop.

And ... because there are truckloads of books and websites and classrooms out there teaching arrays very early in their C++ pedagogy, because of a) being able to write useful/interesting examples early on that do in fact call OS APIs, and of course because of the awesome power of b) "that's the way we've always done it".

Klos answered 18/5, 2021 at 15:2 Comment(20)
This answer makes a number of utterly incorrect claims, evidently based on not knowing that both C and C++ support "pointer-to-array" types. It's not lack of ability to express a pointer to an array, it's disuse of that ability in practice.Stalemate
pointer-to-array decays instantly to pointer-to-element, though, and that's the way it's used, no? How many C++ (or C) function/method signatures take a pointer-to-array type? Nobody, but nobody, teaches that, nor is that how it is used. Do you disagree? E.g., show me where in Unix/Linux APIs a pointer-to-array is used in a function signature over a naked pointer assumed by documentation to be an array? @BenVoigtKlos
@BenVoigt - Neither The C Programming Language (Kernighan, Ritchie, 1978) nor C: A Reference Manual - 3rd ed (Harbison, Steele, 1991) ("including both ANSI and traditional C") discuss pointer-to-array - they only discuss how "array of T" and "pointer to T" are nearly equivalent. "When an array identifier first appears in an expression, the type of the identifier is converted from "an array of T" to "pointer to T" and the value of the identifier is converted to a pointer to the first element of the array." (Harbison, pg 111). And many other similar references.Klos
Design and Evolution of C++ (Stroustrup, 1994) not only does not mention pointer-to-array it echoes Harbison/Steele's wording.Klos
Neither Effective C++ - 3rd ed (Meyers, 2008) nor More Effective C++ (Meyers, 1996) mention pointer-to-array types. I could go on with books from my library, but ... I don't really care to. The point is not whether at some time - even originally - the languages, technically, had this ability. The point is that nobody has ever used it. Ever. The fact that I didn't mention it in my answer doesn't mean I didn't know of it. Just that I know it is a useless vestige of a compiler writer's store of knowledge. It has never been in use, never taught.Klos
The core issue here is that pointer-to-array and reference-to-array types are really hard to read, so people got into a habit of not using them, which leads to knowledge falling to the wayside. Easiest way to work with them is with templates or decltype, and using them normally quickly devolves into a nigh-unreadable mess. create() here is bad enough (in multiple ways), just imagine a function that takes pointers to two arrays and returns a pointer to a different kind of array.Myo
Since a common uses of new[] is allocate an array of size unknown at compile time, C's pointer-to-array doesn't really help much anyway.Conceptualism
@JackAidley: It does for 2D arrays, because C99 also supports VLAs. A different way to malloc a 2D array? shows that int (*p)[2] = malloc(3 * sizeof *p); is the C way to dynamically allocate an int [3][2] array-of-arrays object, with a local pointing to it. The 2 and 3 could both be runtime variables in C, unlike C++.Ten
@davidbak: Indeed, it's a choice between having to use p[INDEX2D(i,j)] to emulate 2D indexing in a flat array, or using p[i][j] and having the compiler scale i by other array dimension for you, based on the pointer-to-VLA type. Either one is fine, as long as you avoid int **p with separate allocations for each row. (nvm, I see you deleted the comment I was replying to. I can delete this, one, too.)Ten
@PeterCordes - w.r.t. this particular question though, VLAs are C only and were only standardized in C99 - long past C++'s creation. Wikipedia on VLAs is sadly deficient on when GCC first supported them - but it is doubtful Stroustrup would have used them as a model at all, being non-standard, since his first C++ implementation, cfront, was a translator to C, and he probably had ideas even then of not tying it to a specific C compiler.Klos
@PeterCordes - I deleted my comment because ... your point was a valid response to an existing comment ... though I myself had doubts about readability of the construct.Klos
Yeah, Jack was replying to @JustinTime, who pointed out that the pointer-to-array syntax is pretty hard to read. No argument from me, and usually not worth bothering with. The continued lack of ISO C++ support for VLAs means it's often not useful to use pointers-to-arrays in portable C++; std::array is just as powerful and cleaner. (GNU C++ does support VLAs in C++.) But C++ most certainly can have pointers to arrays, so I agree with Ben's initial comment. The only kind of array it supports at all has compile-time-constant size, but those can be pointed-to.Ten
@petercordes C99 turned up a bit late for C++Conceptualism
@JackAidley: Right, the initial design of C++ was constrained by wanting to compile to portable C at the time, without needing to rely on VLA features that some but not all C compilers at the time supported. That alone doesn't explain why modern C++ still lacks it (the reasons perhaps being related to complications of destructors). I thought we were just talking about why pointer-to-array syntax is not widely used (in C or C++), and I pointed out that there are C use-cases for it. I'm not trying to argue anything directly about the design of C++ (except that it does support pointer-to-array)Ten
@PeterCordes - I guess the reason modern C++ still lacks it is because it's not worth the effort to perform any surgery on the language - major or minor - at this late date in order to make C-style arrays behave better. Much more bang-for-the-buck to just give us an arbitrary-rank operator[] and then anyone can use the built-in proven class and template facilities of C++ to create any damn kind of array they want. Then, operators new and delete - both scalar and array versions - will only ever be seen encapsulated and hidden in one or two methods of low-level classes requiring heap.Klos
VLA syntax is a lot cleaner than alloca (which is also non-standard IIRC) when you want a small local temporary array in automatic storage, with runtime variable size that you know is small enough. (i.e. on the stack in a normal C++ implementation). For dynamic allocation, sure you can just wrap new[] in a class with overloaded operator[]. It feels like a lot of wasted work to write an array-like wrapper for alloca when GNU C++ supports alignas(32) float foo[n]; for a small amount of scratch space.Ten
@PeterCordes - interesting use case, I've certainly used alloca() in the past - but I wonder if that's going to be a good thing - or even an acceptable thing - going forward now that we have coroutines ... (including VLA in that)Klos
From en.cppreference.com/w/cpp/language/coroutines, it looks like the design is compatible with being called from functions that use VLA / alloca. The storage space for coroutine state is heap allocated with new, or it can be optimized out (using stack space) if the lifetime permits, and "if the size of coroutine frame is known at the call site". Compilers may well forbid alloca / VLA inside coroutine functions (they probably need the stack-frame size to be fixed for the duration of the coroutine), but the standard provides a definite way to tell when a function is a coroutine.Ten
There are reasons why C++ didn't adopt it, but they're not relevant to the choice of delete[] vs. delete. That was made a long time ago. For what it's worth Bjarne acknowledged it as a mistake, IIRC.Conceptualism
@Klos Obviously you don't know of it, because you mistakenly believe that pointer-to-array decays to pointer-to-the-first-element. It does not. Array decays to a pointer to the initial element, pointer-to-array is interconvertible with a pointer to the first element but does not decay. And it is very bad for any textbook to not cover pointer-to-array, because contrary to your "It has never been in use", it actually is quite common. A two-dimensional array decays to a pointer-to-array. Indexing into a two-dimensional array does pointer arithmetic on a pointer-to-array.Stalemate
Generally, C++ compilers and their associated runtimes build on top of the platform's C runtime. In particular in this case the C memory manager.

The C memory manager allows you to free a block of memory without knowing its size, but there is no standard way to get the size of the block from the runtime and there is no guarantee that the block that was actually allocated is exactly the size you requested. It may well be larger.

Thus the block size stored by the C memory manager can't usefully be used to enable higher-level functionality. If higher-level functionality needs information on the size of the allocation then it must store it itself. (And C++ delete[] does need this for types with destructors, to run them for every element.)

C++ also has an attitude of "you only pay for what you use"; storing an extra length field for every allocation (separate from the underlying allocator's bookkeeping) would not fit well with this attitude.

Since the normal way to represent an array of unknown (at compile time) size in C and C++ is with a pointer to its first element, there is no way the compiler can distinguish between a single object allocation and an array allocation based on the type system. So it leaves it up to the programmer to distinguish.

Wherefore answered 19/5, 2021 at 17:52 Comment(0)
The cover story is that a separate delete[] is required because of C++'s relationship with C.

The new operator can make a dynamically allocated object of almost any object type.

But, due to the C heritage, a pointer to an object type is ambiguous between two abstractions:

  • being the location of a single object, and
  • being the base of a dynamic array.

The delete versus delete[] situation just follows from that.

However, that does not ring true, because, in spite of the above observations being true, a single delete operator could be used. It does not logically follow that two operators are required.

Here is informal proof. The new T operator invocation (single object case) could implicitly behave as if it were new T[1]. So that is to say, every new could always allocate an array. When no array syntax is mentioned, it could be implicit that an array of [1] will be allocated. Then, there would just have to exist a single delete which behaves like today's delete[].

Why isn't that design followed?

I think it boils down to the usual: it's a goat that was sacrificed to the gods of efficiency. When you allocate an array with new [], extra storage is allocated for meta-data to keep track of the number of elements, so that delete [] can know how many elements need to be iterated for destruction. When you allocate a single object with new, no such meta-data is required. The object can be constructed directly in the memory which comes from the underlying allocator without any extra header.

It's a part of "don't pay for what you don't use" in terms of run-time costs. If you're allocating single objects, you don't have to "pay" for any representational overhead in those objects to deal with the possibility that any dynamic object referenced by pointer might be an array. However, you are burdened with the responsibility of encoding that information in the way you allocate the object with the array new and subsequently delete it.

Grotesque answered 20/5, 2021 at 23:35 Comment(0)
An example might help. When you allocate a C-style array of objects with new[], each of those objects may have its own destructor that needs to be called. A plain delete does not do that: it destroys a single object, but it does not iterate over the elements of a C-style array. You need delete[] for that.

Here is an example:

#include <iostream>
#include <cstdlib>
#include <string>

using std::cerr;
using std::cout;
using std::endl;

class silly_string : private std::string {
  public:
    silly_string(const char* const s) :
      std::string(s) {}
    ~silly_string() {
      cout.flush();
      cerr << "Deleting \"" << *this << "\"."
           << endl;
      // The destructor of the base class is now implicitly invoked.
    }

  friend std::ostream& operator<< ( std::ostream&, const silly_string& );
};

std::ostream& operator<< ( std::ostream& out, const silly_string& s )
{
  return out << static_cast<const std::string&>(s);  // cast to reference avoids a copy
}

int main()
{
  constexpr size_t nwords = 2;
  silly_string *const words = new silly_string[nwords]{
    "hello,",
    "world!" };

  cout << words[0] << ' '
       << words[1] << '\n';

  delete[] words;

  return EXIT_SUCCESS;
}

That test program explicitly instruments the destructor calls. It’s obviously a contrived example. For one thing, a program does not need to free memory immediately before it terminates and releases all its resources. But it does demonstrate what happens and in what order.

Some compilers, such as clang++, are smart enough to warn you if you leave out the [] in delete[] words;, but if you ignore the warning and run the buggy code anyway, you get undefined behavior, typically heap corruption.

Book answered 19/5, 2021 at 19:42 Comment(0)
delete is an operator that destroys array and non-array objects created by a new expression.

It comes in two forms: the delete operator for single objects and the delete[] operator for arrays. The new operator performs dynamic memory allocation, placing objects on the heap, so delete deallocates memory from the heap. The pointer to the object is not destroyed; the value or memory block pointed to by the pointer is. A delete expression has void type, so it does not return a value.

Scooter answered 14/6, 2021 at 4:23 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.