Is it fair to always recommend std::vector over realloc?
Asked Answered
C

4

12

From Bjarne Stroustrup's FAQ:

If you feel the need for realloc() - and many do - then consider using a standard library vector.

I'll preface my question by agreeing that std::vector is better for many reasons, and I personally would always choose to use it over writing my own dynamic arrays with C memory allocation.

But, std::vector fragments memory as it grows because C++ has no equivalent of realloc (edit To clarify, I know that std::vector's storage is contiguous and won't get fragmented, I mean fragmentation of the memory space caused by allocating and deallocating, which realloc can avoid by extending an existing allocation). So is it fair to always recommend it over realloc? With great care, couldn't you write something that works just like std::vector but using C allocation functions, which has the possibility to grow its memory without moving its address and copying existing elements, making it as good or better in terms of fragmentation and performance?

And relatedly (bonus question!), why doesn't C++ have an equivalent to realloc? It seems like an odd thing to omit in a language that is so focused on performance. The section in Bjarne's FAQ has exactly that title (minus emphasis), but the answer doesn't address the 'why'. Was it just an accidental omission? Is there some fundamental incompatibility with how new/delete work? Does it not really give the benefits it seems to in practice?

Edit: ok, so I had neglected to consider the C nastiness of realloc - std::vector can't be rewritten using realloc because it only works with PODs, doesn't throw and so on. Perhaps a POD-only container written to deal with the nastiness would be a good idea for some situations. In any case though, the more interesting question becomes: would std::vector benefit from a C++ equivalent of realloc, which has (more or less) been answered here:

Does std::vector *have* to move objects when growing capacity? Or, can allocators "reallocate"?

Sadly, the answer seems to be "yes, but the standards committee didn't vote it in". Here's hoping.

Cocke answered 8/4, 2014 at 9:31 Comment(13)
Source that confirms that std::vector fragments memory? If that was the case, returning the data pointer wouldn't be safe.Pigmy
why do you think std::vector resize fragments more than realloc? If you used realloc for doing a c-style vector, std::vector will work the same way, and possibly better because it may be implemented with a clever growth strategy which minimizes fragmentation.Skulk
A partial answer to your question is in Does std::vector have to move objects when growing capacity? Or, can allocators “reallocate”?Modular
realloc can fragment memory as much as std::vector.Skulk
By 'fragments memory' I mean the rest of the address space, not its own memory - any time it allocates a new block and deallocates the old block, it's contributing to fragmentation. realloc has the ability to expand an existing block though, which is surely as good or better than that?Cocke
@Ben: In this context, "fragmentation" means leaving gaps in the heap by allocating one block then freeing another, which can reduce the amount of usable blocks available from the heap. Of course, the memory used by the vector is a single contiguous block, since that's required by its specification.Workbench
Have you tried to search SO for existing answers to your question? There are several. If they don't satisfy you, cite one or two and explain why.Reexamine
@n.m. Yes, none of them are quite the same. I can add some links if that's helpful.Cocke
Oh, then in that case I agree with galinette. I don't see why realloc would do a better job.Pigmy
@Skulk Yes, it can, but it also might not, that's my point. It's no worse than std::vector but might be better.Cocke
@BenHymers: "no worse than std::vector" - apart from only working for trivial types, and not giving you any control over the allocation strategy.Workbench
You may be interested in github.com/facebook/folly/blob/master/folly/docs/FBVector.mdKyongkyoto
@Modular Excellent foresight! I saw that question but didn't realise it was relevant until now.Cocke
I
2

Direct comparison

                        |  std::vector     | C memory functions
------------------------+------------------+------------------------
default capacity        | undefined        | undefined
default grow            | towards capacity | undefined
deterministic capacity  | available        | no
deterministic grow      | available        | no
deterministic mem.-move | available        | no
non-POD types           | yes              | f***ing no (*)
no-throw                | no               | yes

deterministic mem.-move follows from deterministic capacity/grow. It is when realloc and std::vector have to move their stored elements to a new memory location.

I think the (available) determinism with regards to memory moving is doubly important when you consider moving (smart) references of any kind.

NOTE: In this respect, I use the term "deterministic" with respect to my source codes lifetime, i.e. its lifetime across different versions of different libraries with different compile flags, etc..


Source

It does fragment memory as much as realloc does:

Class template vector overview [vector.overview]

The elements of a vector are stored contiguously, meaning that if v is a vector<T, Allocator> where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size()

In other words, the memory used is in one piece.

The one big difference is that realloc can actually increase allocated memory portions without having ordered it to do so, however, it is not required to do so (man 3 realloc):

man 3 realloc

The realloc() function changes the size of the memory block pointed to by ptr to size bytes. The contents will be unchanged in the range from the start of the region up to the minimum of the old and new sizes. If the new size is larger than the old size, the added memory will not be initialized. If ptr is NULL, then the call is equivalent to malloc(size), for all values of size; if size is equal to zero, and ptr is not NULL, then the call is equivalent to free(ptr). Unless ptr is NULL, it must have been returned by an earlier call to malloc(), cal‐ loc() or realloc(). If the area pointed to was moved, a free(ptr) is done.

So it can increase the size, but is not required to.

A std::vector carries not only a size, but also a capacity. If you know beforehand you will need a big vector, yet you cannot initialize everything right now, you are entitled to increase the capacity of your vector like so:

std::vector<T> vec(32);
vec.reserve(1024);
// vec has size 32, but reserved a memory region of 1024 elements

So, unlike realloc, the moment when reallocations occur can be deterministic with std::vector.

To answer your question: Because there is std::vector, realloc is not needed. And, realloc is not allowed for non-POD types; attempts to use malloc, free and realloc directly on non-PODs yields undefined behaviour.

Interface answered 8/4, 2014 at 9:37 Comment(16)
By fragmentation I mean of the address space, not of the vector itself - allocating a new block and deallocating the old one will leave a 'hole' whereas extending the existing allocation will not. I know that it's not required to, but it has the option to, and that's what interests me.Cocke
well realloc is still needed, as these are not the same things. And realloc can never throw.Feathers
@channel: But needed for what? Good point about throwing, I added that to my table.Interface
@BenHymers: But realloc could leave behind holes, too?Interface
Ok, you're absolutely correct that C realloc has many problems, I hadn't considered those before asking my question. So how's about if there was a C++ equivalent of it which acted more like new and handled PODs, exceptions and so on; could vector not benefit from that?Cocke
@phresnel Yes, could but also could not. That's what I'm getting at :) if there's enough free space to expand the vector into, why not make use of it?Cocke
-1 Bad answer... the meaningful issue is whether a C++ type can use realloc internally to do potentially expand memory use in place, not a comparison of raw realloc vs vector.Unhandsome
@Tony D: I agree I may not be explicit enough, but it answers So is it fair to always recommend it over realloc?. Really worth a -1?Interface
@TonyD Sadly the question I asked wasn't as interesting as that since I didn't think that far ahead - see my edit :) phresnel answered my question very well.Cocke
@phresnel: I don't think the question makes a lot of sense, and in trying to answer it you've written some stuff that's pretty meaningless. For example: that someone reallocs/mallocs enouogh bytes for for N objects gives as "deterministic capacity" as a vector that reserves similarly - some of your statements seem to imply no "current size" variable is being paired with the pointer to malloced memory. There's nothing more deterministic about growth of a vector than realloc - either may fail due to memory exhaustion, or seem to succeed then fail as the memory is actually accessed.Unhandsome
@BenHymers: ok then... -1 to +1 :-). Cheers.Unhandsome
I'm going to accept this, but really it's only the comparison table and the last sentence that answer the question (could be summarised with "C memory allocation is nasty"). Tony's right in saying that the middle bit is unnecessary/unfair.Cocke
@TonyD: But how are malloc/free/realloc implementation details relevant to C++ application code? And how do those implementation details make it deterministic to the application when the application is not allowed to make use of those implementation details? How tight do you couple your applications to specific versions of specific standard library implementations on specific platforms using specific compile flags?Interface
@Pigmy Hymers: I am still convinced that the middle part is not unfair, as I think it is the application's POV that counts wrt determinism, and not the library's. If the standards do not describe when a realloc does something, than I am in the school that says I can't rely on it; I then consider it indeterministic w.r.t. my code's lifetime. / I added a NOTE below the table.Interface
@phresnel Non-deterministic, but in the right direction :) I mentioned in a comment on Mike Seymour's answer, "I'm happy if peformance and memory usage patterns are occasionally randomly better than normal". The whole point of bringing up realloc in the first place is that it has the option to extend an existing allocation, which sounds nice. Also note that I'm talking purely about when the vector needs to allocate, and that a vector using realloc would still allocate extra capacity and have a reserve method; it's just what happens when allocating that would differ.Cocke
This answer basically creates a straw-man: a container that uses realloc for every size change, instead of internally tracking capacity with separate capacity + size members. Of course that would be terrible. The right way to use realloc is to use std::vector on top of it (for trivially copyable types), instead of a new/copy/delete loop. So you sometimes still get a copy on growth, but sometimes you get to avoid that perf hit and just expand an existing allocation. Same worst case, much better average and much better best case, especially with lots of free virtual address space.Cammie
U
10

new/new[] and delete/delete[] are typically layered atop the C library allocation functions (ala malloc/realloc/free), possibly with an extra layer for small-object optimisations that use one malloc-ed region for quickly satisfying many small new requests. This layering meant supporting new and delete took very little implementation effort on the part of early C++ library authors.

To utilise the in-place resizing functionality in realloc for C++, though, invasive changes to the realloc library function are needed so that if movement to a new memory region is required, the C++ library code gets the chance to copy-construct/destruct the objects being moved. This could be done as:

  • a callback happening after realloc realised a move was necessary, asking the C++ library to do the actual data movement instead of doing a memcpy() style bytewise copy, or

  • as an additional resize-in-place-or-fail-without-moving function so the C++ library code could try that, then fall back on a malloc and proper/safe copy before deleteing the original objects and deallocating the original memory.

As most C library realloc functions lack any such hook/query facility, the C++ Standard - and Standard library - don't require it. As Mehrdad points out, this answer documents SGI's acknowledgement of this issue.

Given the extensive use of C++ these days, it would - IMHO - make sense to ship a malloc/realloc/free implementation in the C++ library itself that does provide such a hook/query, so that C++ library authors who see utility in realloc can utilise it freely; that'd be a worthy candidate for inclusion in a future Standard.

With great care, couldn't you write something that works just like std::vector but using C allocation functions, which has the possibility to grow its memory without moving its address and copying existing elements, making it as good or better in terms of fragmentation and performance?

As above - no - it's not possible to copy-construct/destruct objects with any amount of care without changes to the realloc API.

Unhandsome answered 8/4, 2014 at 10:10 Comment(24)
Excellent answer to the bonus question! Thank you!Cocke
-1: One major major C++ implementation does not make use of realloc: gcc.gnu.org/viewcvs/gcc/trunk/libstdc++-v3/libsupc++/… . Neither do I find anything relating realloc in the allocator/vector headers. However, maybe I did not research hard enough, so potential for +1 if you edit in some sources.Interface
Was the "malloc/realloc/free" in the first sentence a typo and you meant "malloc/free"? I assumed it was since the answer makes more sense that way.Cocke
Let me remove my -1, too, as your latest edit adds quite some value. However, I would really value if you could name examples or sources for new/delete implementations that make use of malloc/free/realloc as you describe.Interface
@phresnel: you're missing my point - on line 42 of the source you link you'll see malloc being called, proving the implementation is using C-library heap management, which means realloc could be used to attempt in-place growth of a memory region, but my answer explains that any actual use of realloc can't be coordinated with proper copy-construction/destruction for C++ objects in the "have to move" case, as the interface doesn't provide a pre-move hook nor tentative resize-only-if-you-can-do-it-in-place option.Unhandsome
@BenHymers: I did mean malloc/realloc/free - they're all related functions and indeed realloc can be used for initial memory allocation and for deallocation too, so a C++ library could only use realloc and never use malloc or free if it wanted to, even without ever trying to resize any existing allocation to anything other than 0 bytes (a free). That would make for more confusing code though. Summarily: realloc(0, n) === malloc(n), and realloc(p, 0) === free(p).Unhandsome
Yes, but what I don't see are for example the small-object optimizations, or, as you already name, any use of realloc. All I see in that implementation is a thin wrapper around malloc and get_new_handler. I actually approve of your second paragraph, though, which is why I removed by -1.Interface
@TonyD Ok, that makes perfect sense when combined with your latest edit. Good answer, thanks again!Cocke
@phresnel: small-object optimisations aren't mandated - purely an implementation library choice. Particularly for a system like Linux, GNU have the freedom to tune the C library routines to handle small objects quite nicely anyway, so there's less incentive to layer C++ library optimisations over them. It's historically been more relevant for third party C++ compilers ported to OSes with varying quality C libraries. In C, there tends to be fewer and larger allocations, with longer lifetimes, so the libraries were often less heavily tuned.Unhandsome
I realise that I took your first phrase for "Typically, they do small object optimizations", instead of "Possibly, they do so". While I would still appreciate actual source code examples or citations, there is nothing left preventing me to +1 your answer :) I think with one of the memory gurus as a ex team member (Ulrich Drepper, author of What Every Programmer Should Know About Memory and ...: Linux Concurrency and Performance) I very much trust malloc to be near optimal.Interface
@phresnel: "what I don't see... any use of realloc" - my primary point in "typically layered atop C malloc/realloc/free" is that implementations are using the C heap facility associated with that three-function API, not specifically that you'd expect them to be using realloc, though the equivalences - realloc(0, n) === malloc(n) and realloc(p, 0) === free(p) and generality of realloc make it possible and occasionally elegant to use even when not attempting any in-place resizing, so a C++ implementation could currently use it for new/delete.Unhandsome
Yes, I basically just accidentally ignored your possibly right in the first phrase. However, I think the discussion in these comments was worth the mistake. You have good points here.Interface
@phresnel: you trust which implementation of malloc to be near optimal - it's not a single thing ;-). VMS, MVS, Cray, OS/360, embedded... all manner of systems out there, some tuned for typical use from COBOL, FORTRAN or C, others prioritorising less machine code or clean source code in malloc/free implementations over performance in edge cases. I've seen a few small-object optimisation libraries - think there was code in Alexandrescu's Modern C++ Design for that too? - but I'm not sure which C++ Std libs shipped their own implementation rather than trusting to malloc. I'll dig! :-)Unhandsome
Haha, thanks, too. I think your answer should be upvoted more.Interface
@TonyDelroy - for what it's worth, I think realloc could be used as the underlying mechanism for allocation during vector resizing, but it would have to be specialized only for trivially copyable types (something like what used to be called POD types). That's actually not that unusual: most standard library implementations already specialize containers for trivially copyable types so they can copy elements with memcpy - in addition to other specializations like skipping destruction if the destructor is trivial, etc.Purposive
... well there is the little matter of the trivial copy allowance (you can bitwise copy these objects) applying specifically to memcpy but not explicitly to realloc. That's a minor clarification away. The more difficult problem is that the standard lets you replace operator new and friends, which - assuming that the containers are defined to use them - makes it tough to use another allocation routine since it would change observable behavior (the replaced versions wouldn't be getting called any more).Purposive
@BeeOnRope: great points, though PODs can have scalar types which include pointers, and if any pointers happen to be to elsewhere in the vector then they'd need to be updated during any memory move, so POD's not quite the right criteria. There's certainly a set of types for which it's safe, but so many type traits these days it's tedious working out if there's already a suitable one. CheersUnhandsome
Well trivially copyable is the criteria given in the standard, but I was wrong to call them "something like POD types". POD, now called "standard layout" types is all about the data layout and doesn't place restrictions on say what copy constructors are available. The set of trivially copyable types (whose members are neither a subset or superset of standard layout types) are defined exactly based on whether they can be bit-wise copy. The definition covers both implementation restrictions that would likely prevent copying (e.g,. no virtual functions) and ...Purposive
... conditions that ensure that user-defined behavior (such as fixing up pointers or any other invariant) is not skipped: they cannot have user-defined copy or move constructors. In that case, copying an object byte-wise is "completely safe" and essentially all modern standard libraries take advantage of this in their container and std::copy implementations. By "completely safe" I mean that you get the same effect as if you had copied the object via the normal language mechanism of copy constructor or assignment.Purposive
@Purposive "same effect" is the crucial bit there: in the scenario I mentioned - say struct X { char c; const X* p_next; }; where the pointers in a number of Xs were wired together into a logical linked list, it's not inherently safe to relocate them (the programmer would have to do some post-facto rewiring, or add a some constructor/assignment logic with a registry of other object instances). That said, such a type wouldn't be safe in a vector increasing capacity anyway, so I shouldn't have side-tracked us with that scenario. is_trivially_copyable is indeed a good criterion - cool.Unhandsome
Of course, if the user has in mind other invariants in mind that are not maintained by the copy constructor, but rather by some external mechanism or rule (e.g., after you copy this object, update some pointer on it via a method call) then such objects aren't suitable to be used in containers at all (they will break regardless of whether the byte-wise copy optimization is used). (I wrote this at the same time time as my earlier two comments, but it didn't get submitted, but yeah it seems we are in violent agreement here).Purposive
I decided to ask about realloc specifically over here. The quick summary is that by a strict reading of the standard, even memcpy can't be used to "create" an object in uninitialized storage (you need placement new to do that), so realloc is probably technically not allowed to move objects since it gives you no chance to create the objects and is not defined to do so. Bleh.Purposive
"by a strict reading of the standard, even memcpy can't be used" - your phrasing makes it sound like a strictly conforming compiler/implementation isn't allowed to have memcpy work, but the situation is that the Standard doesn't require an implementation to support memcpy (or aiming a pointer at memory from another language, network packet, shared memory etc. for that matter), but implementations are fully allowed to define behaviour that's left undefined by the Standard, and there's wide support. Whether that's good enough is for the programmer/team/company to decide.Unhandsome
@TonyDelroy: Unfortunately, the Standard fails to make any distinction between constructs which implementations should be expected to support reliably absent an obviously-good or documented reason for doing otherwise, with those which few if any implementations should be expected to support reliably, and the maintainers of compilers that are popular by virtue of being open-source interpret the Standard's failure to mandate behavior as being a good enough reason in and of itself to deviate from commonplace behaviors.Corin
I
2

Direct comparison

                        |  std::vector     | C memory functions
------------------------+------------------+------------------------
default capacity        | undefined        | undefined
default grow            | towards capacity | undefined
deterministic capacity  | available        | no
deterministic grow      | available        | no
deterministic mem.-move | available        | no
non-POD types           | yes              | f***ing no (*)
no-throw                | no               | yes

deterministic mem.-move follows from deterministic capacity/grow. It is when realloc and std::vector have to move their stored elements to a new memory location.

I think the (available) determinism with regards to memory moving is doubly important when you consider moving (smart) references of any kind.

NOTE: In this respect, I use the term "deterministic" with respect to my source codes lifetime, i.e. its lifetime across different versions of different libraries with different compile flags, etc..


Source

It does fragment memory as much as realloc does:

Class template vector overview [vector.overview]

The elements of a vector are stored contiguously, meaning that if v is a vector<T, Allocator> where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size()

In other words, the memory used is in one piece.

The one big difference is that realloc can actually increase allocated memory portions without having ordered it to do so, however, it is not required to do so (man 3 realloc):

man 3 realloc

The realloc() function changes the size of the memory block pointed to by ptr to size bytes. The contents will be unchanged in the range from the start of the region up to the minimum of the old and new sizes. If the new size is larger than the old size, the added memory will not be initialized. If ptr is NULL, then the call is equivalent to malloc(size), for all values of size; if size is equal to zero, and ptr is not NULL, then the call is equivalent to free(ptr). Unless ptr is NULL, it must have been returned by an earlier call to malloc(), cal‐ loc() or realloc(). If the area pointed to was moved, a free(ptr) is done.

So it can increase the size, but is not required to.

A std::vector carries not only a size, but also a capacity. If you know beforehand you will need a big vector, yet you cannot initialize everything right now, you are entitled to increase the capacity of your vector like so:

std::vector<T> vec(32);
vec.reserve(1024);
// vec has size 32, but reserved a memory region of 1024 elements

So, unlike realloc, the moment when reallocations occur can be deterministic with std::vector.

To answer your question: Because there is std::vector, realloc is not needed. And, realloc is not allowed for non-POD types; attempts to use malloc, free and realloc directly on non-PODs yields undefined behaviour.

Interface answered 8/4, 2014 at 9:37 Comment(16)
By fragmentation I mean of the address space, not of the vector itself - allocating a new block and deallocating the old one will leave a 'hole' whereas extending the existing allocation will not. I know that it's not required to, but it has the option to, and that's what interests me.Cocke
well realloc is still needed, as these are not the same things. And realloc can never throw.Feathers
@channel: But needed for what? Good point about throwing, I added that to my table.Interface
@BenHymers: But realloc could leave behind holes, too?Interface
Ok, you're absolutely correct that C realloc has many problems, I hadn't considered those before asking my question. So how's about if there was a C++ equivalent of it which acted more like new and handled PODs, exceptions and so on; could vector not benefit from that?Cocke
@phresnel Yes, could but also could not. That's what I'm getting at :) if there's enough free space to expand the vector into, why not make use of it?Cocke
-1 Bad answer... the meaningful issue is whether a C++ type can use realloc internally to do potentially expand memory use in place, not a comparison of raw realloc vs vector.Unhandsome
@Tony D: I agree I may not be explicit enough, but it answers So is it fair to always recommend it over realloc?. Really worth a -1?Interface
@TonyD Sadly the question I asked wasn't as interesting as that since I didn't think that far ahead - see my edit :) phresnel answered my question very well.Cocke
@phresnel: I don't think the question makes a lot of sense, and in trying to answer it you've written some stuff that's pretty meaningless. For example: that someone reallocs/mallocs enouogh bytes for for N objects gives as "deterministic capacity" as a vector that reserves similarly - some of your statements seem to imply no "current size" variable is being paired with the pointer to malloced memory. There's nothing more deterministic about growth of a vector than realloc - either may fail due to memory exhaustion, or seem to succeed then fail as the memory is actually accessed.Unhandsome
@BenHymers: ok then... -1 to +1 :-). Cheers.Unhandsome
I'm going to accept this, but really it's only the comparison table and the last sentence that answer the question (could be summarised with "C memory allocation is nasty"). Tony's right in saying that the middle bit is unnecessary/unfair.Cocke
@TonyD: But how are malloc/free/realloc implementation details relevant to C++ application code? And how do those implementation details make it deterministic to the application when the application is not allowed to make use of those implementation details? How tight do you couple your applications to specific versions of specific standard library implementations on specific platforms using specific compile flags?Interface
@Pigmy Hymers: I am still convinced that the middle part is not unfair, as I think it is the application's POV that counts wrt determinism, and not the library's. If the standards do not describe when a realloc does something, than I am in the school that says I can't rely on it; I then consider it indeterministic w.r.t. my code's lifetime. / I added a NOTE below the table.Interface
@phresnel Non-deterministic, but in the right direction :) I mentioned in a comment on Mike Seymour's answer, "I'm happy if peformance and memory usage patterns are occasionally randomly better than normal". The whole point of bringing up realloc in the first place is that it has the option to extend an existing allocation, which sounds nice. Also note that I'm talking purely about when the vector needs to allocate, and that a vector using realloc would still allocate extra capacity and have a reserve method; it's just what happens when allocating that would differ.Cocke
This answer basically creates a straw-man: a container that uses realloc for every size change, instead of internally tracking capacity with separate capacity + size members. Of course that would be terrible. The right way to use realloc is to use std::vector on top of it (for trivially copyable types), instead of a new/copy/delete loop. So you sometimes still get a copy on growth, but sometimes you get to avoid that perf hit and just expand an existing allocation. Same worst case, much better average and much better best case, especially with lots of free virtual address space.Cammie
K
2

There is a good reason why C++ doesn't have realloc; instead of repeating it, I'll point you to the answer here.

And for the record, a proper vector implementation mitigates the fragmentation issue to some extent, by choosing a growth factor close to the golden ratio, so it's not completely a lost cause.

Kyongkyoto answered 8/4, 2014 at 9:43 Comment(4)
I'm not sure how their explanation helps; they don't say why it prohibits C realloc, only that it does. C and C++ memory allocation aren't supposed to interact so I don't understand why there could be a cplusplusrealloc which doesn't interfere with C memory allocation but can be used with new/delete.Cocke
@BenHymers: It's explained on the page they link to. The problem is the fact that realloc copies raw memory, which means it doesn't work for C++ because C++ requires using copy-constructors. Yes, they could have added a different reallocation function instead, but they explain the reason why they didn't do that: "This in turn would have added complexity to many allocator implementations, and would have made interaction with memory-debugging tools more difficult. Thus we decided against this alternative."Kyongkyoto
Haha, deciding not to add something to C++ because it would add complexity, that makes me chuckle :) But ok, it's an explanation, fair point.Cocke
@BenHymers: I know, right? I suppose back then it wasn't quite the monster it is now. =PKyongkyoto
W
1

std::vector fragments memory as it grows because C++ has no equivalent of realloc.

realloc will fragment the memory in the same way, since it's doing the same thing - allocating a new block and copying the contents, if the old block wasn't large enough.

With great care, couldn't you write something that works just like std::vector but using C allocation functions, which has the possibility to grow its memory without moving its address and copying existing elements, making it as good or better in terms of fragmentation and performance?

That's just what vector does. You can control the capacity independently of the size, and it will only reallocate when the capacity is exceeded. realloc is similar, but with no means to control the capacity - making vector better in terms of fragmentation and performance.

why doesn't C++ have an equivalent to realloc?

Because it has std::vector, which does the same thing with more flexibility. As well as allowing you to control exactly how and when memory is allocated, it works with any movable type, while realloc will go horribly wrong when used with non-trivial types.

Workbench answered 8/4, 2014 at 9:38 Comment(5)
"realloc will fragment the memory in the same way" - not necessarily; it might extend the existing memory block, that's the point :) Granted, a realloc-based container would fail for non-trivial types, I hadn't considered that.Cocke
@BenHymers: "it might extend the existing block" - or it might not, and you have no way to control that. If fragmentation matters, then vector gives you the tools to manage it, using reserve or a custom allocator; realloc just does whatever it does. If fragmentation doesn't matter, then it doesn't matter.Workbench
@MikeSeymour ... but it also might! I'm happy if peformance and memory usage patterns are occasionally randomly better than normal. vector gives me some tools, sure, but it can't allow me to extend an existing memory allocation. And reserve only helps if I have an idea of capacity beforehand.Cocke
The "point" that std::vector has all these controls to manage capacity and size such as reserve while realloc doesn't make no sense to me. realloc is not a container. We are talking about implementing a container, like vector, on top of realloc. Obviously a good implementation of such a container would also have a "size" and "capacity" and wouldn't blindly call realloc every time an element was added (which would likely have terrible performance and lead to fragmentation).Purposive
Now you'd have something with the same worst-case performance, functionality and so on as vector, but sometimes you'd get the "free win" of being able to leave your elements in-place and not have to copy them on an expand - with all the memory bandwidth, cache thrashing and CPU time that involves. Of course, the downside is that it only works with trivially copyable types (technically it can work with any types which are "bit-wise movable" but C++ hasn't formalized such a concept yet).Purposive

© 2022 - 2024 — McMap. All rights reserved.