Why is there no reallocation functionality in C++ allocators?
Asked Answered
Q

5

37

In C the standard memory handling functions are malloc(), realloc() and free(). However, C++ stdlib allocators only parallel two of them: there is no reallocation function. Of course, it would not be possible to do exactly the same as realloc(), because simply copying memory is not appropriate for non-aggregate types. But would there be a problem with, say, this function:

bool reallocate (pointer ptr, size_type num_now, size_type num_requested);

where

  • ptr is previously allocated with the same allocator for num_now objects;
  • num_requested >= num_now;

and semantics as follows:

  • if allocator can expand given memory block at ptr from size for num_now objects to num_requested objects, it does so (leaving additional memory uninitialized) and returns true;
  • else it does nothing and returns false.

Granted, this is not very simple, but allocators, as I understand, are mostly meant for containers and containers' code is usually complicated already.

Given such a function, std::vector, say, could grow as follows (pseudocode):

if (allocator.reallocate (buffer, capacity, new_capacity))
  capacity = new_capacity;     // That's all we need to do
else
  ...   // Do the standard reallocation by using a different buffer,
        // copying data and freeing the current one

Allocators that are incapable of changing memory size altogether could just implement such a function by unconditional return false;.

Are there so few reallocation-capable allocator implementation that it wouldn't worth it to bother? Or are there some problems I overlooked?

Quelpart answered 23/6, 2010 at 19:49 Comment(3)
+1, this is a question that always bugged me.Anabel
Stroustrup's take on this thing: www2.research.att.com/~bs/bs_faq2.html#renew; it delegates the problem to the inner workings of vector, but doesn't say why isn't there a mechanism like "renew" to make the growing of the array simpler.Anabel
There is nothing stopping std::vector from doing that in some cases (e.g., it knows its using the standard allocator). The standard library is allowed to use knowledge of the underlying system.Practical
N
19

From: http://www.sgi.com/tech/stl/alloc.html

This is probably the most questionable design decision. It would have probably been a bit more useful to provide a version of reallocate that either changed the size of the existing object without copying or returned NULL. This would have made it directly useful for objects with copy constructors. It would also have avoided unnecessary copying in cases in which the original object had not been completely filled in.

Unfortunately, this would have prohibited use of realloc from the C library. This in turn would have added complexity to many allocator implementations, and would have made interaction with memory-debugging tools more difficult. Thus we decided against this alternative.

Nomenclator answered 23/6, 2010 at 21:41 Comment(8)
I think it's a shame they didn't at least add a reallocate method to Allocator's interface. It could have been implemented to just free an existing block and allocate a new one, but it would have given the potential to implement a new one later without redoing all the code using the Allocator interface.Udella
It's also a shame that C never bothered to add a "resize allocation if convenient" function, with a proviso that a conforming implementation would be allowed to always decide resizing was "inconvenient"Thaumaturgy
@Thaumaturgy How would the behavior of "resize allocation if convenient" be different from that of realloc()?Frontlet
@martinkunev: Pointers to the existing allocation would remain valid. This would be especially useful in scenarios where code would be able to compute an upper bound for the required buffer size, allocate that much space, write data to that buffer, and then--knowing how much of the buffer it would actually need to keep--release the part of the buffer it doesn't need.Thaumaturgy
@martinkunev: Additionally, if there realloc had an option to indicate whether the new size was likely to be the "final" size of the allocation, it could use that to select between best-fit and worst-fit allocations, with the latter offering an increased likelihood of being able to expand an allocation without having to relocate it.Thaumaturgy
@Thaumaturgy Some time ago I was wondering whether realloc() always preserves buffer address when shrinking (the answer is no) - this would have been useful for implementing weak pointer. Your idea goes a little farther and can help with performance optimization. I suppose when making the standard, they chose KISS.Frontlet
@martinkunev: Much of the Standard library came about as a consequence of people writing functions that fulfilled their immediate needs, others finding that the functions filled their needs as well, and those functions getting copied around among users of the language. A perennial problem with the C Standard is that the Committee has never articulated a consensus understanding of the extent to which the Standard seeks to be descriptive or prescriptive. Either approach, or even a combination, would be fine if the Standard made clear which portions should be read in which way.Thaumaturgy
@martinkunev: With regard to an address-preserving shrink, that may be useful for some purposes, but would forbid what might for some usage patterns would otherwise be useful performance enhancements (e.g. if realloc is passed a block that is preceded and followed by empty space, relocating the block to the start of that space may be helpful even if it doesn't grow). Unfortunately, free compiler maintainers focus on phony ub-based "optimizations" while making no effort to add library functions that could easily improve efficiency.Thaumaturgy
U
15

This is actually a design flaw that Alexandrescu points out with the standard allocators (not operator new[]/delete[] but what were originally the stl allocators used to implement std::vector, e.g.).

A realloc can occur significantly faster than a malloc, memcpy, and free. However, while the actual memory block can be resized, it can also move memory to a new location. In the latter case, if the memory block consists of non-PODs, all objects will need to be destroyed and copy-constructed after the realloc.

The main thing the standard library needs to accommodate this as a possibility is a reallocate function as part of the standard allocator's public interface. A class like std::vector could certainly use it even if the default implementation is to malloc the newly sized block and free the old one. It would need to be a function that is capable of destroying and copy-constructing the objects in memory though, it cannot treat the memory in an opaque fashion if it did this. There's a little complexity involved there and would require some more template work which may be why it was omitted from the standard library.

std::vector<...>::reserve is not sufficient: it addresses a different case where the size of the container can be anticipated. For truly variable-sized lists, a realloc solution could make contiguous containers like std::vector a lot faster, especially if it can deal with realloc cases where the memory block was successfully resized without being moved, in which case it can omit calling copy constructors and destructors for the objects in memory.

Udella answered 24/6, 2010 at 9:41 Comment(1)
std::vector could in theory specialize for trivially-copyable objects and use plain realloc, as long as new hasn't been replaced with a non-default version... Detecting that is probably only possible at link time, not compile time, so gcc/clang / etc. don't do this.Gastronomy
B
8

What you're asking for is essentially what vector::reserve does. Without move semantics for objects, there's no way to reallocate memory and move the objects around without doing a copy and destroy.

Bithia answered 23/6, 2010 at 20:5 Comment(8)
A good usecase for such functionality for this would be sparse containers. Using vectors, especially with preallocated memory, in them would completely defeat their purpose (sparseness is meant to conserve memory).Quelpart
@doublep: If you wanted sparse containers neither (dynamically) allocated arrays nor vectors are what you want.Morty
@Martin York: Well, for instance Google Sparsehash library uses dynamically allocated arrays and achieves very nice results.Quelpart
@doublep: I am sure it does use dynamically allocated arrays. But it does not use "A" dynamically allocated array as it is a bit more complex than that.Morty
@Mark Ransom: If you have a class similar to vector (using placement new/manual destruction, except memory allocated with malloc) would there be any problem calling realloc on the backing store? Or using new create a resized backing and memcpy the existing objects in. As long as you don't access both copies of your backing memory wouldn't the shallow copy be fine?Flagstone
@Martin York: I don't understand what you said at all.Quelpart
I don't think it's anything like vector::reserve, unless of course you are willing to live in the circular world where allocators are implemented on top of vector which is in turn implemented on top of allocators... Yes, to the end user, vector reserve does something a bit like realloc: the vector's capacity grows by a certain amount. Of course, internally, vector is simply implemented on top of the allocator functions and can only allocate a whole new block and free a complete block: it cannot ask for the existing block to be expanded.Iives
So vector is stuck copying over all its elements any time it wants to increase capacity, versus the optimistic case where asking to expand an existing block was successful and no copy was needed. In the case of filling up a vector, this can avoid a ton of copying and fragmentation. Many other containers could use it. So you can't really compare vector::resize and a hypothetical allocator::reallocate since the latter is a tool used to implement the later.Iives
G
2

I guess this is one of the things where god went wrong, but I was just too lazy to write to the standards committee.

There should have been a realloc for array allocations:

p = renew(p) [128];

or something like that.

Giselagiselbert answered 23/6, 2010 at 19:55 Comment(3)
If you use vectors instead of arrays, there's .reserve(). Why add new functionality for arrays when vectors are generally better?Basin
@DavidThornley, vectors under the hood have to go through the allocator interface. So it seems a vector cannot free unused memory as efficiently as possible. (But I think/hope I'm missing something here!)Stork
@DavidThornley - exactly what Aaron said. Everyone says "use vector" - but they are talking at the wrong level of abstraction! vector itself has to be built on (and indeed is built on) the lower level allocation routines that let you get uninitialized memory and so on. If those routines don't offer a low-level "reallocate" function, certainly vector can't either. Of course, it can offer resize and reserver and everything else, but under the covers those are just allocating new blocks and copying things over. Nothing like expanding an existing block.Iives
I
2

Because of the object oriented nature of C++, and the inclusion of the various standard container types, I think it's simply that less focus was placed on direction memory management than in C. I agree that there are cases that a realloc() would be useful, but the pressure to remedy this is minimal, as almost all of the resulting functionality can be gained by using containers instead.

Idiographic answered 23/6, 2010 at 20:0 Comment(4)
I'm not sure I agree they thought about it less because of OOP. Placement new is an example of a feature specifically for memory management of objects.Swound
I'm not saying they thought about it less, just that the actual syntax was designed to place more emphasis on OOP than on direct memory management in cases where the two are different options. Placement new is a perfect example of where the two are used together by the programmer rather than one replacing the other.Idiographic
Actualy placement new is the apparatus that enables otherwise very hard custom memory management (specifically memory pools) within c++ facilities. I think of it as a device that encourages customizing memory management with c++ code libraries.Pyonephritis
You can't use containers to get the functionality of realloc (at least if you include performance in "functionality") since the containers are implemented on top of the low level memory allocation routines such as std::allocator which ultimately call even lower level C++ library methods such as operator new(size t), which then in turn (usually) call C-level malloc or whatever. Since the lower level C++ abstractions for allocation don't include reallocation, the higher-level containers aren't going be able to synthesize it from thin air.Iives

© 2022 - 2024 — McMap. All rights reserved.