Initial capacity of vector in C++
Asked Answered
I

6

120

What is the capacity() of an std::vector which is created using the default constuctor? I know that the size() is zero. Can we state that a default constructed vector does not call heap memory allocation?

This way it would be possible to create an array with an arbitrary reserve using a single allocation, like std::vector<int> iv; iv.reserve(2345);. Let's say that for some reason, I do not want to start the size() on 2345.

For example, on Linux (g++ 4.4.5, kernel 2.6.32 amd64)

#include <iostream>
#include <vector>

int main()
{
  using namespace std;
  cout << vector<int>().capacity() << "," << vector<int>(10).capacity() << endl;
  return 0;
}

printed 0,10. Is it a rule, or is it STL vendor dependent?

Interloper answered 4/9, 2012 at 20:37 Comment(4)
Standard doesn't specifies anything about initial capacity of vector but most implementations use 0 .Han
There's no guarantee, but I would seriously question the quality of any implementation that allocated memory without me requesting any.Devitalize
@MikeSeymour Disagree. A really high performance implementation might contain a small inline buffer, in which case setting the initial capacity() to that would make sense.Abash
@alastair When using swap all iterators and references remain valid (except end()s). That means that an inline buffer is not possible.Interloper
B
86

The standard doesn't specify what the initial capacity of a container should be, so you're relying on the implementation. A common implementation will start the capacity at zero, but there's no guarantee. On the other hand there's no way to better your strategy of std::vector<int> iv; iv.reserve(2345); so stick with it.

Blooper answered 4/9, 2012 at 20:44 Comment(13)
I don't buy your last statement. If you cannot rely on the capacity to be 0 initially, you might restructure your program to allow your vector to have an initial size. This would half the number of heap-memory requests (from 2 to 1).Aggrieve
@bitmask: Being practical: do you know of any implementation where a vector allocating memory in the default constructor? It is not guaranteed by the standard, but as Mike Seymour points out triggering an allocation without the need would be a bad smell regarding the quality of implementation.Adulterer
@DavidRodríguez-dribeas: That's not the point. The premise was "you cannot do better than your current strategy, so do not bother wondering if there might be stupid implementations". If the premise was "there are no such implementations, so don't bother" I would buy it. The conclusion happens to be true, but the implication doesn't work. Sorry, maybe I'm nit picking.Aggrieve
@Aggrieve If there exists an implementation that allocates memory on default construction, doing what you said would halve the number of allocations. But vector::reserve is not the same as specifying an initial size. The vector constructors that take an initial size value/copy initialize n objects, and thus have linear complexity. OTOH, calling reserve only means copying / moving of size() elements if a reallocation is triggered. On an empty vector there is nothing to copy. So the latter might be desirable even if the implementation allocates memory for a default constructed vector.Liesa
@Prætorian: That's why I wrote "restructure your program". If you reserve a specific amount of cells, chances are you are going to populate them eventually. I'm just saying that "there is no way to better your strategy" is too bold a statement to be without justification.Aggrieve
@bitmask: Also, restructuring the code to have the vector start with objects is not something simple to do. Where will you store the objects before you can create the vector? (Note that in general the object might not have a default constructor, or it might be expensive to construct/copy, or...) I agree with Mark: the best strategy that can be applied in general is default construct + reserve.Adulterer
@DavidRodríguez-dribeas: Yes, if you have no information about the type. However consider a hypothetical type with a trivial constructor that will be optimised out anyway. If you are worrying about execution time to this degree, you have to take things into account that look silly in theory but not in practice.Aggrieve
@bitmask: So if you change the premises of the question you get to a different answer... interesting :) In general, the subset of objects for which it does not matter is small compared with the situations where the constructor is not trivial (unless you are a fan of two-phase initialization, which I am not and I recommend against). Given the premise: I don't want to create the object with the full size, the conclusion is Mark's answer.Adulterer
@bitmask, if you're concerned about the allocations to this degree then you should look at your particular standard library's implementation and not rely on speculation.Blooper
@bitmask: Even if the initial capacity were non-zero, this formulation would be enormously faster than specifying an initial size on any modern O/S. For any non-trivial size of memory block -- which is the only time you would care -- allocation is many times faster than initialization. If you doubt this, write a test program comparing the speeds of malloc vs calloc for large blocks.Randa
Scott Meyers effective STL item 14 gives a nice overview related to this issue in my opinion.Border
@Aggrieve vector when sizing up must value-initialise its elements, so I don't see how even the most trivial constructor could ever be optimised out... unless the stdlib had insider knowledge that the OS or allocator would have zero-initialised the allocated memory in advance, which seems unlikely to me.Pace
since C++17 the default constructor is noexcept (noexcept Allocator()). I think this can only work when no capacity is reserved on default constructing the vectorDiakinesis
P
67

Storage implementations of std::vector vary significantly, but all the ones I've come across start from 0.

The following code:

#include <iostream>
#include <vector>

int main()
{
  using namespace std;
  
  vector<int> normal;
  cout << normal.capacity() << endl;
  
  for (unsigned int loop = 0; loop != 10; ++loop)
  {
      normal.push_back(1);
      cout << normal.capacity() << endl;
  }
  
  cin.get();
  return 0;
}

Gives the following output:

0
1
2
4
4
8
8
8
8
16
16

under GCC 5.1, 11.2 - Clang 12.0.1 and:

0
1
2
3
4
6
6
9
9
9
13

under MSVC 2013.

Prenomen answered 2/4, 2016 at 8:34 Comment(11)
This is so underrated @AndrewHaplo
Well you find virtually everywhere that the recommendation for speed purposes is almost always to just use a vector, so if you're doing anything that involves sparse data...Embattled
@Embattled what should they have started it at? allocating anything would just be wasting time allocating and deallocating that memory if the programmer wants to reserve more than the default. if you're assuming they should begin with 1, it'll allocate that as soon as somebody is allocating 1 anyway.Hanky
@Hanky You're reading in between the lines instead of taking it at face value. The clue that it's not sarcasm is the word "smart", as well as my second comment mentioning sparse data.Embattled
@Embattled Oh good, you were relieved enough they started it at 0. Why even comment about it in a joking way?Hanky
@Hanky Wasn't a joke, that's my point. It was relief that they did things the right way and not in some screwy fashion, particularly given that this is vector we're talking about, which is super important, useful, and highly used.Embattled
It is really surprising to me that the capacity would go to 1 byte on first allocation instead of at least the number of bytes the allocator reserves to manage its chunks. (so 16 bytes in a 64 bit implementation). Not only that, most of the times an allocation will be aligned so the pointers are aligned for things such as SSE4 or even AVX512...Bannerman
@AlexisWilke 1 element, not 1 byte. Element could be 1000 bytes for all the container knows. However my containers always base the first allocation chunk size on the sizeof(container) vs sizeof(element) and whatever seems reasonable (plf::).Prenomen
std::string does small string optimization, so I was hoping that std::vector too will have some initial non zero size.Dyer
std::vector usually increase vector size exponentially as capacity gets exhausted. In GCC results above we see that the factor is 2. This gives O(1) amortized time for insertion.cs.stackexchange.com/questions/9380/…Dyer
Clang 10.0.1 and GCC 10.2 agree with the GCC 5.1 results in this answer. godbolt.org/z/3nzG6nKerman
E
9

As far as I understood the standard (though I could actually not name a reference), container instanciation and memory allocation have intentionally been decoupled for good reason. Therefor you have distinct, separate calls for

  • constructor to create the container itself
  • reserve() to pre allocate a suitably large memory block to accomodate at least(!) a given number of objects

And this makes a lot of sense. The only right to exist for reserve() is to give you the opportunity to code around possibly expensive reallocations when growing the vector. In order to be useful you have to know the number of objects to store or at least need to be able to make an educated guess. If this is not given you better stay away from reserve() as you will just change reallocation for wasted memory.

So putting it all together:

  • The standard intentionally does not specify a constructor that allows you to pre allocate a memory block for a specific number of objects (which would be at least more desirable than allocating an implementation specific, fixed "something" under the hood).
  • Allocation shouldn't be implicit. So, to preallocate a block you need to make a separate call to reserve() and this need not be at the same place of construction (could/should of course be later, after you became aware of the required size to accomodate)
  • Thus if a vector would always preallocate a memory block of implementation defined size this would foil the intended job of reserve(), wouldn't it?
  • What would be the advantage of preallocating a block if the STL naturally cannot know the intended purpose and expected size of a vector? It'll be rather nonsensical, if not counter-productive.
  • The proper solution instead is to allocate and implementation specific block with the first push_back() - if not already explicitely allocated before by reserve().
  • In case of a necessary reallocation the increase in block size is implementation specific as well. The vector implementations I know of start with an exponential increase in size but will cap the increment rate at a certain maximum to avoid wasting huge amounts of memory or even blowing it.

All this comes to full operation and advantage only if not disturbed by an allocating constructor. You have reasonable defaults for common scenarios that can be overriden on demand by reserve() (and shrink_to_fit()). So, even if the standard does not explicitely state so, I'm quite sure assuming that a newly constructed vector does not preallocate is a pretty safe bet for all current implementations.

Eliaseliason answered 19/4, 2018 at 8:34 Comment(0)
T
7

As a slight addition to the other answers, I found that when running under debug conditions with Visual Studio a default constructed vector will still allocate on the heap even though the capacity starts at zero.

Specifically if _ITERATOR_DEBUG_LEVEL != 0 then vector will allocate some space to help with iterator checking.

https://learn.microsoft.com/en-gb/cpp/standard-library/iterator-debug-level

I just found this slightly annoying since I was using a custom allocator at the time and was not expecting the extra allocation.

Theodore answered 1/3, 2018 at 10:44 Comment(1)
Interesting, they break the noexcept-guarantees (at least for C+17, earlier?): en.cppreference.com/w/cpp/container/vector/vectorHauteur
V
7

This is an old question, and all answers here have rightly explained the standard's point of view and the way you can get an initial capacity in a portable manner by using std::vector::reserve;

However, I'll explain why it doesn't make sense for any STL implementation to allocate memory upon construction of an std::vector<T> object;

  1. std::vector<T> of incomplete types;

    Prior to C++17, it was undefined behavior to construct a std::vector<T> if the definition of T is still unknown at point of instantiation. However, that constraint was relaxed in C++17.

    In order to efficiently allocate memory for an object, you need to know its size. From C++17 and beyond, your clients may have cases where your std::vector<T> class does not know the size of T. Does it makes sense to have memory allocation characteristics dependent on type completeness?

  2. Unwanted Memory allocations

    There are many, many, many times you'll need model a graph in software. (A tree is a graph); You are most likely going to model it like:

    class Node {
        ....
        std::vector<Node> children; //or std::vector< *some pointer type* > children;
        ....
     };
    

    Now think for a moment and imagine if you had lots of terminal nodes. You would be very pissed if your STL implementation allocates extra memory simply in anticipation of having objects in children.

    This is just one example, feel free to think of more...

Vedavedalia answered 22/12, 2019 at 0:35 Comment(0)
C
2

Standard doesnt specify initial value for capacity but the STL container automatically grows to accomodate as much data as you put in, provided you don't exceed the maximum size(use max_size member function to know). For vector and string, growth is handled by realloc whenever more space is needed. Suppose you'd like to create a vector holding value 1-1000. Without using reserve, the code will typically result in between 2 and 18 reallocations during following loop:

vector<int> v;
for ( int i = 1; i <= 1000; i++) v.push_back(i);

Modifying the code to use reserve might result in 0 allocations during the loop:

vector<int> v;
v.reserve(1000);

for ( int i = 1; i <= 1000; i++) v.push_back(i);

Roughly to say, vector and string capacities grow by a factor of between 1.5 and 2 each time.

Cocteau answered 1/3, 2018 at 11:14 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.