Can you prevent std::vector bounds-checking on push_back()?
Asked Answered
P

1

8

I know there are ways to either do or not do bounds-checking when indexing a vector, but specifically on push_back(), even if I know the capacity of the vector is large enough (i.e., I reserved enough), and I ran a loop pushing back elements into it, I'd assume that since it's dynamically resizing it would always have to do bounds-checking (or size checking) on every push_back.

If this is the case, then I was thinking something like a fast_push() would be useful if you can know you won't exceed the capacity.

I've heard claims of some vector libraries being faster (example), but I didn't see specifically the issue of the push_back() when knowing bounds checking won't be needed. The one in the link makes claims of up to 50% speed improvements. One of them being preventing value initialisation on construction when you don't need it, and some other things.

Punner answered 25/7, 2017 at 21:48 Comment(15)
If you call reserve(), the compiler can usually optimize the push_back() really well. Standard library writers and compiler writers really try to optimize vectorSkippie
Checking that there is enough room in the vector is just comparing two numbers: the current upper bound and the current size. It's not a bottleneck in your code.Rosalvarosalyn
FWIW you can take vector implementation from the standard library, make some re-naming, add your fast_push function and realize that it does not deliver any significant performance gain :)Josefina
With what object do you instantiate the vector? Maybe you can go resize the vector beforehand, and use []Janejanean
@PeteBecker: It may not be the bottleneck. But in certain situations, unnecessary checking does make a program slower. Even if it is just a simple integer compare.Janejanean
if it had significant perf impact then i would have thought that the STL designers would have allowed a naked_push. I think this is a good question thoughTropic
@pm100: in very small loops, it could have a significant perf impact. STL is a general library, not equally good for everything. I could tell you examples, where using STL has a very significant performace drop.Janejanean
The one in the link implements a different specification.Rosalvarosalyn
If you're really concerned about maximum possible performance, maximum customization, then you should create your own library. In general, with std::vector, it is not possible what you asking.Janejanean
Thanks for your help. I get everyone's point about the impact being negligible, I was just thinking, you know, in big loops it IS an unnecessary check. I was creating 3D terrain meshes, pushing back a lot of vertices. Yes, I get std:: isn't necessarily ideal for performance goals.Punner
3D matrix as in vector<vector<vector<point>>>? If so, your enemy is probably going to turn out to be poor spatial locality. A vector is contiguous, but there is no such guarantee for vectors of vectors, so a not-insignificant amount of time will be spent hopping from vector to vector and waiting around for the cache to load.Volplane
Yeah I thought about. A 3D std::vector is always essentially looking up the address of the buffer pointer, which is in three different places. I guess a 3-dimensional C array is one contiguous string of memory, and has an advantage. I'd imagine working with raw C arrays would be a lot harder though. I've wanted to try to create a voxel based game, but just thinking of the 3-dimensional array makes my head spin.Punner
to expand @user4581301's point: wrap your std::vector<point> in a class Matrix and do the (i * rows) + (k * cols) + j calculation in a methodCalla
If you can restructure your code to use ranged-insert instead of push_back and supply random-access iterators then I would expect a single bounds check rather than one per element as a basic QoI matter.Shipley
you can resize the vector before the cycle and then insert the elements in the corresponding positionHenna
P
-1

I think you cannot do that, because bounds-checking for push_back() is more or less defined by the C++ standard:

cpp reference link

If after the operation the new size() is greater than old capacity() a reallocation takes place, in which case all iterators (including the end() iterator) and all references to the elements are invalidated. Otherwise only the end() iterator is invalidated.

C++ 17 standard draft link

Remarks: Causes reallocation if the new size is greater than the old capacity. Reallocation invalidates all the references, pointers, and iterators referring to the elements in the sequence. If no reallocation happens, all the iterators and references before the insertion point remain valid. If an exception is thrown other than by the copy constructor, move constructor, assignment operator, or move assignment operator of T or by any InputIterator operation there are no effects.

However, you can use data() method to directly access underlying array, but in this case the size of the vector will remain unchanged. At this point it would be better to use std::array or a raw C array, because such usage of std::vector defeats its purpose: after any push_back() or resize() all data which comes after size() will be lost, so basically you will have to either never change size of vector (in which case why not use a raw C array?) or manually check size of array and resize if needed.

#include <vector>
#include <iostream>
    
int main(int argc, const char** argv) {
    std::vector<int> v;
    v.reserve(5);
    v.data()[0] = 3;
    v.data()[1] = 4;
    std::cout << v.data()[0] << '\n';
    std::cout << v[1] << '\n'; // you shouldn't do that, that's undefined behaviour
}

In the case of push_back(), bounds-checking is unavoidable because you need to change value of size() to check whether its value is less than capacity() and resize if so. It most probably isn't a bottleneck, since it's just a comparison of two numbers, which is a fairly fast operation.

Here is an example of MSVC source code and disassembly:

Source code (pay attention to _Mylast != _My_data._Myend part):

auto& _My_data   = _Mypair._Myval2;
pointer& _Mylast = _My_data._Mylast;
    
if (_Mylast != _My_data._Myend) {
    return _Emplace_back_with_unused_capacity(_STD forward<_Valty>(_Val)...);
}

and part of its disassembly:

mov         dword ptr [rsp+38h],5  
mov         rdx,qword ptr [rsp+28h]  
cmp         rdx,qword ptr [rsp+30h]  
je          main+0ADh (07FF7D092124Dh) 

In case of MSVC it's just one single je instruction to compare pointer of newly added element and end(), and if you are really searching for any performance improvements you should look somewhere else).

Presumably answered 25/7 at 9:31 Comment(2)
This incorrectly assumes that v[0] is valid after v.reserve(5). In reality, this is Undefined Behavior.Magdeburg
Thank you for pointing this out. Most implementations of operator[] still act more or less as v.data()[0], but yea, that's importantPresumably

© 2022 - 2024 — McMap. All rights reserved.