Best way to extract a subvector from a vector?

Asked 7/1, 2009 at 18:56 Answered 15/9, 2022 at 10:6

402

Suppose I have a std::vector (let's call it myVec) of size N. What's the simplest way to construct a new vector consisting of a copy of elements X through Y, where 0 <= X <= Y <= N-1? For example, myVec [100000] through myVec [100999] in a vector of size 150000.

If this cannot be done efficiently with a vector, is there another STL datatype that I should use instead?

Onshore answered 7/1, 2009 at 18:56 Comment(4)

you say you want to extract a subvector, but it seems to me that what you really want is a view / access to the subvector - difference being that a view would not copy - old school C++ would be to use start pointer and end pointer, given the fact that mem on a std::vector is contiguous, then it should be possible for you to iterate using pointers and thereby avoid copy, however if you do not mind copy, then just initialize a new vector with the scope of your previous vector – Heterotypic 13/2, 2017 at 13:18

There is .data()(cplusplus.com/reference/vector/vector/data) since c++11. However, using pointers is discouraged within stl containers , see #31664270 – Supraorbital 6/11, 2019 at 6:46

@Heterotypic maybe not interested to OP but I would need to know how to " initialize a new vector with the scope of your previous vector". – Foregone 11/4, 2021 at 10:10

Note) There's also assign in std::vector, such as v.assign(s.begin()+M, s.begin()+N); – Haemoid 6/7, 2023 at 10:35

482

vector<T>::const_iterator first = myVec.begin() + 100000;
vector<T>::const_iterator last = myVec.begin() + 101000;
vector<T> newVec(first, last);

It's an O(N) operation to construct the new vector, but there isn't really a better way.

Smoothbore answered 7/1, 2009 at 19:4 Comment(17)

+1, also it's O(Y-X), which is less than or equal to O(N) (and in his example much less) – Decare 7/1, 2009 at 21:53

@Decare Well, then it's O(N) after all. – Leatri 8/1, 2009 at 7:56

@GregRogers: It doesn't make sense to use the big-O notation where N is a specific number. Big-O communicates the rate of growth with respect to how N changes. Johann: It's best not to use one variable name in two ways. We'd normally say either O(Y-X), or we'd say O(Z) where Z=Y-X. – Spathic 11/9, 2013 at 17:51

@GregRogers By using this way, we need to declare a new vector. Is there a way to change the original vector? something like myVec(first, last)? I know this is wrong, but I really need the solution as I want to use recursion in my codes, and need to repeatedly use the same vector (although changed). Thanks! – Proverb 5/11, 2015 at 12:24

@Proverb Create a new vector and swap? – Hokanson 27/1, 2016 at 15:7

Why not just vector<T> newVec(myVec.begin() + 100000, myVec.begin() + 101000);? – Crosson 20/3, 2017 at 23:57

In addition, If newVec already exists and you want to replace it's contents use newVec.asign(first, last). It has the same semantics as the constructor in the answer. – Acetone 7/6, 2017 at 11:33

what happens if iterator last is outofbounds? – Tion 20/8, 2017 at 11:54

You can also use auto instead of vector<T>::const_iterator – Morton 14/12, 2017 at 7:50

Just for my own reference, the resulting vector is a shallow copy right? the sub-vector is not pointers to the original vectors underlying data structure right? I basically want to know that if I return back up a stackframe, that the internal pointers to the resulting sub-vector are still valid because they're separate from the original super-vector. – Multilateral 19/2, 2018 at 23:57

Is line 3 a copy? Is it possible to construct a reference subvector that way? – Salinger 5/10, 2018 at 9:56

As a side note std::vector range constructor constructs the container with the contents of the range [first, last). – Tactic 28/2, 2020 at 10:27

Great proposition. Could we use auto cbegin instead of begin ? auto first = myVec.cbegin() + 100000; auto last = myVec.cbegin() + 101000; – Nalepka 1/3, 2021 at 10:33

@Tactic that's the constructor this answer uses. – Hali 29/9, 2021 at 18:1

@Nalepka Yes. – Hali 29/9, 2021 at 18:1

@Hali yes it is the constructor used, I thought pointing out that last element is not included was relevant , hence the [x,y) convention, in fact the answer used vector.begin + ..1000 to get ...999 elements – Tactic 29/9, 2021 at 19:34

@DrDeo, i tried out of bounds last in a toy program: godbolt.org/z/87MsPqnWT It doesn't do a bounds check. Compiler :clang 14.0.0, c++2a – Flasher 6/7, 2022 at 17:20

111

Just use the vector constructor.

std::vector<int>   data();
// Load Z elements into data so that Z > Y > X

std::vector<int>   sub(&data[100000],&data[101000]);

Diaphoretic answered 7/1, 2009 at 19:14 Comment(12)

Ok, I didn't realize it was that simple to obtain an iterator from an arbitrary vector element. – Brevity 7/1, 2009 at 19:58

Well, you actually get pointers to the elements, which in most cases are equivalent to iterators. However, I assume that they won't be checked like iterators when you have a checked-iterators implementation. Could be some magic I'm unaware of though that makes that work too... – Oho 7/1, 2009 at 20:56

Taking the address of those vector elements is an unportable hack that will break if the vector storage is not in fact contiguous. Use begin() + 100000 etc. – Strickland 8/1, 2009 at 6:29

My bad, apparently the standard guarantees that vector storage is contiguous. Nevertheless it's bad practice to work with addresses like this as it is certainly not guaranteed to work for all containers supporting random access, while begin() + 100000 is. – Strickland 8/1, 2009 at 6:36

@j_random_hacker: Sorry have to disagree. The STL specification for std::vector was explicitly changed to support this type of procedure. Also a pointer is valid type of iterator. Look up iterator_traits<> – Diaphoretic 8/1, 2009 at 7:35

Yep, pointers aren't necessarily valid std::vector<T>::iterator's but they are valid iterators for an array in continuous storage, which is how it is used. – Smoothbore 8/1, 2009 at 20:4

@Uri: This will always work. But like all code it is limited by system constraints. If you try and do something that is not physically possible you will get an exception. – Diaphoretic 17/8, 2015 at 18:12

How would you slice up to the last element using that ? If you do sub(&data[100000],&data[data.size()]); That would not always work and may throw an access violation ? – Aara 30/9, 2016 at 23:19

@taktak004 Nope. Remember that operator[] returns a reference. It is only at the point where you read or write the reference that it would become an access violation. Since we do neither but instead get the address we have not invoked UB,. – Diaphoretic 1/10, 2016 at 1:24

This won't work if you need to copy data till end of source vector. – Pol 6/12, 2017 at 17:58

@bialix: It will. I assume you are talking about taking the address of one past then end being the problem. This is covered by the standard and is allowed. It is UB to dereference this addresses but not to take its address. Note the definition of X[Y] is defined as *(X + Y) in the standard. And the operators &* placed together cancel each other out (assuming they have not been overridden (which they can't be for pointers)). => &X[Y] => &*(X + Y) => (X + Y). Which is also why 10000[data] is a valid expression for array access. – Diaphoretic 6/12, 2017 at 21:35

actually MSVC 2017 does raise an access violation for the case that @taktak004 considered (although only in Debug mode and with Debuglevel 2) – Heat 17/2, 2022 at 10:4

This discussion is pretty old, but the simplest one isn't mentioned yet, with list-initialization:

 vector<int> subvector = {big_vector.begin() + 3, big_vector.end() - 2};

It requires c++11 or above.

Example usage:

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int main(){

    vector<int> big_vector = {5,12,4,6,7,8,9,9,31,1,1,5,76,78,8};
    vector<int> subvector = {big_vector.begin() + 3, big_vector.end() - 2};

    cout << "Big vector: ";
    for_each(big_vector.begin(), big_vector.end(),[](int number){cout << number << ";";});
    cout << endl << "Subvector: ";
    for_each(subvector.begin(), subvector.end(),[](int number){cout << number << ";";});
    cout << endl;
}

Result:

Big vector: 5;12;4;6;7;8;9;9;31;1;1;5;76;78;8;
Subvector: 6;7;8;9;9;31;1;1;5;76;

Omdurman answered 6/11, 2019 at 6:47 Comment(6)

it defines the subrange: Starts from the 3rd element, and goes until the 2nd from behind – Supraorbital 24/4, 2022 at 17:3

Can you please clarify whether this way includes the last element? – Reflective 29/1, 2023 at 18:32

It's inclusive, as the example shows: the last element is the one 2 positions before .end() – Supraorbital 30/1, 2023 at 20:5

@DávidTóth It's exclusive. .end() returns the iterator to the "past the last" element. – Haemoid 12/9, 2023 at 12:15

inclusive for the last element, but exclusive of the "past the element" element.. – Supraorbital 13/9, 2023 at 9:15

Not important but to avoid confusion: In vector<int> subvector = {iter1, iter2}, iter2 is exclusive. This does not change depending on your perspective. You said "the last element is the one 2 positions before .end()", but it's not. The last element(76) is the one 3 positions before .end(). Very tiny stuff but let's be careful :) – Haemoid 31/10, 2023 at 8:1

These days, we use spans! So you would write:

#include <span>

...
auto start_pos = 100000;
auto length = 1000;
auto span_of_myvec = std::make_span(myvec);
auto my_subspan = span_of_myvec.subspan(start_pos, length);

to get a span of 1000 elements of the same type as myvec's. Or a more terse form:

auto my_subspan = std::make_span(myvec).subspan(1000000, 1000);

(but I don't like this as much, since the meaning of each numeric argument is not entirely clear; and it gets worse if the length and start_pos are of the same order of magnitude.)

Anyway, remember that this is not a copy, it's just a view of the data in the vector, so be careful. If you want an actual copy, you could do:

std::vector<T> new_vec(my_subspan.cbegin(), my_subspan.cend());

Notes:

If you use a version of C++ older than C++20, you can use gsl::span - from the Guidelines Support Library. For more information, see: http://www.modernescpp.com/index.php/c-core-guideline-the-guidelines-support-library and https://github.com/martinmoene/gsl-lite for a concrete implementation
For more information about spans, see: What is a "span" and when should I use one?
std::vector has a gazillion constructors, it's super-easy to fall into one you didn't intend to use, so be careful.

Barboza answered 9/8, 2017 at 13:16 Comment(2)

would use cbegin and cend just for the principle ;) std::cbegin etc even. – Dis 29/6, 2018 at 20:16

@JHBonarius: Seeing how this code isn't templated on the choice of container, I don't see there's a particular benefit; a matter of taste I suppose. – Barboza 29/6, 2018 at 21:16

std::vector<T>(input_iterator, input_iterator), in your case foo = std::vector<T>(myVec.begin () + 100000, myVec.begin () + 150000);, see for example here

Infinity answered 7/1, 2009 at 19:4 Comment(2)

Since Andrew is trying to construct a new vector, I would recommend "std::vector foo(..." instead of copying with "foo = std::vector(..." – Swirly 7/1, 2009 at 19:34

Yeah, of course, but whether you type std::vector<int> foo = std::vector(...) or std::vector<int> foo (...) should not matter. – Infinity 7/1, 2009 at 20:6

If both are not going to be modified (no adding/deleting items - modifying existing ones is fine as long as you pay heed to threading issues), you can simply pass around data.begin() + 100000 and data.begin() + 101000, and pretend that they are the begin() and end() of a smaller vector.

Or, since vector storage is guaranteed to be contiguous, you can simply pass around a 1000 item array:

T *arrayOfT = &data[0] + 100000;
size_t arrayOfTLength = 1000;

Both these techniques take constant time, but require that the length of data doesn't increase, triggering a reallocation.

Camarena answered 7/1, 2009 at 19:26 Comment(1)

This is also good if you want the original vector and the subvector to be linked. – Loup 1/3, 2015 at 19:38

You didn't mention what type std::vector<...> myVec is, but if it's a simple type or struct/class that doesn't include pointers, and you want the best efficiency, then you can do a direct memory copy (which I think will be faster than the other answers provided). Here is a general example for std::vector<type> myVec where type in this case is int:

typedef int type; //choose your custom type/struct/class
int iFirst = 100000; //first index to copy
int iLast = 101000; //last index + 1
int iLen = iLast - iFirst;
std::vector<type> newVec;
newVec.resize(iLen); //pre-allocate the space needed to write the data directly
memcpy(&newVec[0], &myVec[iFirst], iLen*sizeof(type)); //write directly to destination buffer from source buffer

Churchill answered 16/10, 2015 at 2:3 Comment(3)

I wonder if with -O3, @Anteru's "using constructor" std::vector(myVec.begin () + 100000, myVec.begin () + 150000); , wouldn't the longer-version of this produce into exactly the same assembly? – Raeraeann 14/2, 2018 at 15:58

MSVC++ 2015, for example, compiles std::vector<>(iter, iter) to memmove(), if appropriate (if constructor is trivial, for a suitable definition of trivial). – Superabundant 26/6, 2018 at 19:18

Don't call memcpy. Do a std::copy or a constructor which accepts a range (two iterators), and the compiler and the std.library will conspire to call memcpy when appropriate. – Navelwort 12/4, 2019 at 8:38

You could just use insert

vector<type> myVec { n_elements };

vector<type> newVec;

newVec.insert(newVec.begin(), myVec.begin() + X, myVec.begin() + Y);

Avera answered 10/6, 2019 at 0:27 Comment(0)

You can use STL copy with O(M) performance when M is the size of the subvector.

Yellowish answered 7/1, 2009 at 19:3 Comment(8)

Upvoted because it pointed me in the right direction but I can see why @LokiAstari suggests it's not the correct choice - since the STL::copy works with two std::vector<T> arrays of the same size and type. Here, the OP wants to copy a subsection into a new, smaller array as outlined here in the OP's post: "0 <= X <= Y <= N-1" – Istle 31/12, 2016 at 23:22

@Andrew, see example using std::copy and std::back_inserter – Zenas 26/7, 2017 at 6:3

@LokiAstari why not? – Zenas 26/7, 2017 at 6:4

@chrisg: The clue is in the name copy. To use std::copy you need to build the destination first (which means initializing all the members to the their default value). Then you can copy them over. So its two distinct operations. Why not simply use the vector constructor and initialize the members directly to their final values. See the nice answer that has +50 votes. – Diaphoretic 26/7, 2017 at 8:3

@LokiAstari I was referring to an edit to this that did not survive peer review, which posed the example <br/> vector<T> newvec; std::copy(myvec.begin()+10000, myvec.begin() +10100, std::back_inserter(newvec)); <br/> in this case, you don't need to build the destination first, but sure, direct initialization is more... direct. – Zenas 26/7, 2017 at 19:16

@chrisg: Its also two lines. Additionally you need to stick a third line in to make sure it is efficient. newvec.reserve(10100 - 10000);. ITs definitely an option and technically it will work. But out of the two which are you going to recommend? – Diaphoretic 26/7, 2017 at 21:11

@LokiAstari ah, ok, I'm sold. The need for reserve on larger sets pushed me over the line. I also didn't like the "under the hood" feel of using naked pointers, dependency on an underlying detail (or "guarantee" that vector is contiguous), and unchecked array-like accessing, because they feel very C-like and un-C++-like, but then I decided its no worse than the iterator arithmetic used for std::copy approach – Zenas 26/7, 2017 at 22:14

@chrisg: The most up-voted answer uses standard iterators and the constructor. – Diaphoretic 27/7, 2017 at 16:0

Suppose there are two vectors.

 vector<int> vect1{1, 2, 3, 4};
 vector<int> vect2;

Method 1. Using copy function. copy(first_iterator_index, last_iterator_index, back_inserter()) :- This function takes 3 arguments, firstly, the first iterator of old vector. Secondly, the last iterator of old vector and third is back_inserter function to insert values from back.

    // Copying vector by copy function
    copy(vect1.begin(), vect1.end(), back_inserter(vect2));

Method 2. By using Assign Function. assign(first_iterator_o, last_iterator_o). This method assigns the same values to new vector as old one. This takes 2 arguments, first iterator to old vector and last iterator to old vector.

    //Copying vector by assign function
    vect2.assign(vect1.begin(), vect1.end());

Brag answered 15/11, 2021 at 2:13 Comment(0)

The only way to project a collection that is not linear time is to do so lazily, where the resulting "vector" is actually a subtype which delegates to the original collection. For example, Scala's List#subseq method create a sub-sequence in constant time. However, this only works if the collection is immutable and if the underlying language sports garbage collection.

Aphyllous answered 7/1, 2009 at 19:6 Comment(1)

in c++ way to do that would be to have vector of shared_ptr to X instead of vector of X and then copy SPs, but unfortunately I dont think that is faster because atomic operation involved with cpying SP. Or the original vector could be a const shared_ptr of vector instead and you just take reference to subrange in it. ofc you dont need to make it a shared_ptr of vector but then you have lifetime problems... all this is off top of my head, could be wrong... – Elohim 22/10, 2013 at 14:30

Maybe the array_view/span in the GSL library is a good option.

Here is also a single file implementation: array_view.

Delmerdelmor answered 27/4, 2017 at 7:36 Comment(1)

Kindly add answer here along with link. As external link might change in future – Belindabelisarius 27/4, 2017 at 8:3

Copy elements from one vector to another easily
In this example, I am using a vector of pairs to make it easy to understand
`

vector<pair<int, int> > v(n);

//we want half of elements in vector a and another half in vector b
vector<pair<lli, lli> > a(v.begin(),v.begin()+n/2);
vector<pair<lli, lli> > b(v.begin()+n/2, v.end());


//if v = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
//then a = [(1, 2), (2, 3)]
//and b = [(3, 4), (4, 5), (5, 6)]

//if v = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]
//then a = [(1, 2), (2, 3), (3, 4)]
//and b = [(4, 5), (5, 6), (6, 7)]

'
As you can see you can easily copy elements from one vector to another, if you want to copy elements from index 10 to 16 for example then we would use

vector<pair<int, int> > a(v.begin()+10, v.begin+16);

and if you want elements from index 10 to some index from end, then in that case

vector<pair<int, int> > a(v.begin()+10, v.end()-5);

hope this helps, just remember in the last case v.end()-5 > v.begin()+10

Paint answered 24/6, 2018 at 5:25 Comment(0)

Yet another option: Useful for instance when moving between a thrust::device_vector and a thrust::host_vector, where you cannot use the constructor.

std::vector<T> newVector;
newVector.reserve(1000);
std::copy_n(&vec[100000], 1000, std::back_inserter(newVector));

Should also be complexity O(N)

You could combine this with top anwer code

vector<T>::const_iterator first = myVec.begin() + 100000;
vector<T>::const_iterator last = myVec.begin() + 101000;
std::copy(first, last, std::back_inserter(newVector));

Dis answered 29/6, 2018 at 20:22 Comment(0)

vector::assign may be another solution

// note: size1 < src.size() && size2 < src.size()
std::vector<int> sub1(size1), sub2(size2);
sub1.assign(src.begin(), src.begin() + size1);
sub2.assign(src.begin(), src.begin() + size2);

Stench answered 15/9, 2022 at 10:6 Comment(0)

-2

Posting this late just for others..I bet the first coder is done by now. For simple datatypes no copy is needed, just revert to good old C code methods.

std::vector <int>   myVec;
int *p;
// Add some data here and set start, then
p=myVec.data()+start;

Then pass the pointer p and a len to anything needing a subvector.

notelen must be!! len < myVec.size()-start

Anarchist answered 18/11, 2013 at 20:6 Comment(1)

This doesn't perform a copy. – Capeskin 30/1, 2017 at 8:32

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags