vector vs. list in STL
Asked Answered
F

17

311

I noticed in Effective STL that

vector is the type of sequence that should be used by default.

What's does it mean? It seems that ignore the efficiency vector can do anything.

Could anybody offer me a scenario where vector is not a feasible option but list must be used?

Follett answered 5/2, 2010 at 17:54 Comment(4)
Though it's not what you asked, it's worth pointing out that defaulting to vector also means you can easily interact with older code, C libraries, or non-template libraries, since vector is a thin wrapper around the "traditional" dynamic array of a pointer and size.Suggest
Bjarne Strostrup actually made a test where he generated random numbers and then added them to a list and a vector respectively. The insertions were made so that the list/vector was ordered at all times. Even though this is typically "list domain" the vector outperformed the list by a LARGE margin. Reason being that memory acces is slow and caching works better for sequential data. It's all available in his keynote from "GoingNative 2012"Moyna
baptiste-wicht.com/2012/12/cpp-benchmark-vector-list-dequeUnearthly
If you want to see the keynote by Bjarne Stroustrup that @Moyna mentioned, I found it here: youtu.be/OB-bdWKwXsU?t=2672Roesch
C
121

Situations where you want to insert a lot of items into anywhere but the end of a sequence repeatedly.

Check out the complexity guarantees for each different type of container:

What are the complexity guarantees of the standard containers?

Cahill answered 5/2, 2010 at 17:56 Comment(13)
Inserting elements at the end also counts because it can lead to memory allocation and element copying costs. And also, inserting elenets at the begining of a vector is next to impossible, list has push_frontCallison
No, inserting elements at the end of a vector is amortised constant time. The memory allocations only happen on occasionally, and you can preallocate the vector to prevent it. Of course, if you MUST have guaranteed consistent constant time insertions, I guess this is still an issue.Heavyfooted
@Callison - Is the following "next to impossible" for you? v.insert(v.begin(), i)Berhley
in what application we have to insert a lot of items into the middle of a sequence repeatedly?Follett
@Manuel: Yes, it can be commanded in 20 chars or something, but where it leads to? Depending on implementation it can mean the copy of all elements every time. The easily usable slots (between size() and capacity()) are at the end, not the begining.Callison
@Callison - I agree with you, it's just that I didn't want the OP to think that the interface is not there in case one wants to shoot himself in the (performance) foot.Berhley
@skydoor: If the ordering is somehow important it might be desired, and you know the ordering beforehand. However, if you are inserting data that is not ordered, and not near-ordered a binary search tree is a better solution as the structure itself maintains the ordering, it also provides much quicker lookups than a (linked) list (as long as the above mentioned cases are not true.) If the data inserted into the tree is ordered or near ordered it more or less degenerates into a linked list and nothing is gained. That can be avoided by using a self-balancing tree, but they are more complicated.Beggary
@skydoor: And hashmaps, they don't maintain order though but usually have very fast lookup times (faster than binary search trees)Beggary
@Skurmedel: Not necessarily faster than BSTs; algorithmic complexity in the form of big-O notation is concerned with categories rather than actual speed. See various answers/comments at #2197495Suggest
@Roger Pate: Good point, thanks. And if you need to write the structure yourself I would expect a BST to be simpler to do right.Beggary
Bjarne Strostrup actually made a test where he generated random numbers and then added them to a list and a vector respectively. The insertions were made so that the list/vector was ordered at all times. Even though this is typically "list domain" the vector outperformed the list by a LARGE margin. Reason being that memory acces is slow and caching works better for sequential data. It's all available in his keynote from "GoingNative 2012"Moyna
@Moyna you mean the vector was being sorted after every insertion?Zulemazullo
@luizfls: He means the new value would be inserted into the correct place in the list. Note: there is no need to sort a list that is already sorted.Cahill
P
515
std::vector std::list
Contiguous memory. Non-contiguous memory.
Pre-allocates space for future elements, so extra space required beyond what's necessary for the elements themselves. No pre-allocated memory. The memory overhead for the list itself is constant.
Each element only requires the space for the element type itself (no extra pointers). Each element requires extra space for the node which holds the element, including pointers to the next and previous elements in the list.
Can re-allocate memory for the entire vector any time that you add an element. Never has to re-allocate memory for the whole list just because you add an element.
Insertions at the end are constant, amortized time, but insertions elsewhere are a costly O(n). Insertions and erasures are cheap no matter where in the list they occur.
Erasures at the end of the vector are constant time, but for the rest it's O(n). It's cheap to combine lists with splicing.
You can randomly access its elements. You cannot randomly access elements, so getting at a particular element in the list can be expensive.
Iterators are invalidated if you add or remove elements to or from the vector. Iterators remain valid even when you add or remove elements from the list.
You can easily get at the underlying array if you need an array of the elements. If you need an array of the elements, you'll have to create a new one and add them all to it, since there is no underlying array.

In general, use vector when you don't care what type of sequential container that you're using, but if you're doing many insertions or erasures to and from anywhere in the container other than the end, you're going to want to use list. Or if you need random access, then you're going to want vector, not list. Other than that, there are naturally instances where you're going to need one or the other based on your application, but in general, those are good guidelines.

Psalter answered 5/2, 2010 at 18:53 Comment(10)
Also, allocating from the free store is not free. :) Adding new items to a vector performs O(log n) free store allocations, but you can call reserve() to reduce that to O(1). Adding new items to a list (i.e. not splicing them) performs O(n) free store allocations.Glassful
Another consideration is that list frees memory when you erase elements, but vector does not. A vector does not reduce its capacity when you reduce its size, unless you use the swap() trick.Glassful
@bk1e: I really want to know your trick in reserve() and swap() :)Convection
@nXqd: if you need to add N elements to a vector, call v.reserve(v.size()+N) so that it performs only one free store allocation. The swap() trick is here: https://mcmap.net/q/101317/-how-to-downsize-std-vector-duplicateGlassful
Notes: vector pre-allocation: This can eat a lot of memory for large vectors if you do not know the size in advance and are not able to reserve. // vector re-allocation: This can get expensive for large-ish objects (and large vectors).Entranceway
@simplename No. What it says is correct. vector allocates extra space beyond the space for the elements currently in the vector; that extra capacity is then used to grow the vector so that growing it is amortized O(1).Psalter
@Glassful after c++11 you may call ‘std::vector::shrink_to_fit()’Nerte
vector.clear() is constant complexity for trivially-destructible types (such as scalar or PODs), where elements need not be destroyed. list.clear() is always linear complexity in the size of listBueno
+1. Also adding - Lists are said to be not Thread safe in general, whereas vectors are. So Lists would need to handled explicitly in such scenarios. References : geeksforgeeks.org/difference-between-vector-and-listRummer
@Rummer That link makes claims about thread safety and synchronisation with no evidence. Do you have any? Otherwise I see no immediately intuitive reason why the two containers would differ at all in those respects.Cranky
C
121

Situations where you want to insert a lot of items into anywhere but the end of a sequence repeatedly.

Check out the complexity guarantees for each different type of container:

What are the complexity guarantees of the standard containers?

Cahill answered 5/2, 2010 at 17:56 Comment(13)
Inserting elements at the end also counts because it can lead to memory allocation and element copying costs. And also, inserting elenets at the begining of a vector is next to impossible, list has push_frontCallison
No, inserting elements at the end of a vector is amortised constant time. The memory allocations only happen on occasionally, and you can preallocate the vector to prevent it. Of course, if you MUST have guaranteed consistent constant time insertions, I guess this is still an issue.Heavyfooted
@Callison - Is the following "next to impossible" for you? v.insert(v.begin(), i)Berhley
in what application we have to insert a lot of items into the middle of a sequence repeatedly?Follett
@Manuel: Yes, it can be commanded in 20 chars or something, but where it leads to? Depending on implementation it can mean the copy of all elements every time. The easily usable slots (between size() and capacity()) are at the end, not the begining.Callison
@Callison - I agree with you, it's just that I didn't want the OP to think that the interface is not there in case one wants to shoot himself in the (performance) foot.Berhley
@skydoor: If the ordering is somehow important it might be desired, and you know the ordering beforehand. However, if you are inserting data that is not ordered, and not near-ordered a binary search tree is a better solution as the structure itself maintains the ordering, it also provides much quicker lookups than a (linked) list (as long as the above mentioned cases are not true.) If the data inserted into the tree is ordered or near ordered it more or less degenerates into a linked list and nothing is gained. That can be avoided by using a self-balancing tree, but they are more complicated.Beggary
@skydoor: And hashmaps, they don't maintain order though but usually have very fast lookup times (faster than binary search trees)Beggary
@Skurmedel: Not necessarily faster than BSTs; algorithmic complexity in the form of big-O notation is concerned with categories rather than actual speed. See various answers/comments at #2197495Suggest
@Roger Pate: Good point, thanks. And if you need to write the structure yourself I would expect a BST to be simpler to do right.Beggary
Bjarne Strostrup actually made a test where he generated random numbers and then added them to a list and a vector respectively. The insertions were made so that the list/vector was ordered at all times. Even though this is typically "list domain" the vector outperformed the list by a LARGE margin. Reason being that memory acces is slow and caching works better for sequential data. It's all available in his keynote from "GoingNative 2012"Moyna
@Moyna you mean the vector was being sorted after every insertion?Zulemazullo
@luizfls: He means the new value would be inserted into the correct place in the list. Note: there is no need to sort a list that is already sorted.Cahill
W
42

Make it simple-
At the end of the day when you are confused choosing containers in C++ use this flow chart image ( Say thanks to me ) :-

enter image description here

Vector-

  1. vector is based on contagious memory
  2. vector is way to go for small data-set
  3. vector perform fastest while traversing on data-set
  4. vector insertion deletion is slow on huge data-set but fast for very small

List-

  1. list is based on heap memory
  2. list is way to go for very huge data-set
  3. list is comparatively slow on traversing small data-set but fast at huge data-set
  4. list insertion deletion is fast on huge data-set but slow on smaller ones
Wishywashy answered 30/10, 2019 at 7:53 Comment(2)
RE list 4. You should mention it's comparitively fast/slow. For a list it does not make a difference in speed how many elements there are.Ferrante
Vector contents are on heap (unless you do vector<Type*> vect with the pointers stored in heap but point to things on the stack). Why can't vector be used for very huge data set?Rees
D
41

If you don't need to insert elements often then a vector will be more efficient. It has much better CPU cache locality than a list. In other words, accessing one element makes it very likely that the next element is present in the cache and can be retrieved without having to read slow RAM.

Dismay answered 5/2, 2010 at 18:5 Comment(0)
C
40

Most answers here miss one important detail: what for?

What do you want to keep in the containter?

If it is a collection of ints, then std::list will lose in every scenario, regardless if you can reallocate, you only remove from the front, etc. Lists are slower to traverse, every insertion costs you an interaction with the allocator. It would be extremely hard to prepare an example, where list<int> beats vector<int>. And even then, deque<int> may be better or close, not justyfing the use of lists, which will have greater memory overhead.

However, if you are dealing with large, ugly blobs of data - and few of them - you don't want to overallocate when inserting, and copying due to reallocation would be a disaster - then you may, maybe, be better off with a list<UglyBlob> than vector<UglyBlob>.

Still, if you switch to vector<UglyBlob*> or even vector<shared_ptr<UglyBlob> >, again - list will lag behind.

So, access pattern, target element count etc. still affects the comparison, but in my view - the elements size - cost of copying etc.

Cb answered 27/3, 2013 at 14:33 Comment(3)
One more reflection, which I had when reading "Effective STL" by Meyers: a peculiar property of list<T> is the possibility to splice in O(1). If you need constant-time splicing, list may also be the structure of choice ;)Cb
+1 - It doesn't even have to be an UglyBlob -- even an objects with just a few string members can easily be prohibitively expensive to copy, so the reallocations will cost. Also: Don't neglect the space overhead the exponential growth of a vector holding objects of a few dozen bytes in size can cause (if you can't reserve in advance).Entranceway
As for vector<smart_ptr<Large>> vs. list<Large> -- I'd say, if you need random access to the elements, the vector makes sense. If you do not need random access, the list seems simpler and should perform equally.Entranceway
C
22

One special capability of std::list is splicing (linking or moving part of or a whole list into a different list).

Or perhaps if your contents are very expensive to copy. In such a case it might be cheaper, for example, to sort the collection with a list.

Also note that if the collection is small (and the contents are not particularly expensive to copy), a vector might still outperform a list, even if you insert and erase anywhere. A list allocates each node individually, and that might be much more costly than moving a few simple objects around.

I don't think there are very hard rules. It depends on what you mostly want to do with the container, as well as on how large you expect the container to be and the contained type. A vector generally trumps a list, because it allocates its contents as a single contiguous block (it is basically a dynamically allocated array, and in most circumstances an array is the most efficient way to hold a bunch of things).

Conjuncture answered 5/2, 2010 at 18:12 Comment(5)
+1. Splicing is over overlooked, but unfortunately isn't constant-time as desired. :(( (It can't be if list::size is constant-time.)Suggest
I'm quite sure list::size is (allowed to be) linear for this very reason.Conjuncture
@Roger: list::size is not necessarily constant time. See https://mcmap.net/q/101318/-is-list-size-really-o-n and gcc.gnu.org/ml/libstdc++/2005-11/msg00219.htmlRevile
@Potatoswatter: That the standard is vague and that you consequently can't rely on "compliant" implementations makes it even more of a problem. You literally have to avoid the stdlib to get a portable and dependable guarantee.Suggest
@Roger: yes, unfortunately. My current project strongly relies on splice operations and that structure is almost straight C. Even more unfortunately, in N3000 sequence splice between different lists is specified as linear complexity and size is specifically constant. So, to accommodate novices who iterate on size or whatever, a whole class of algorithms is out of reach for the STL, or any "compliant" containers period.Revile
G
15

Well the students of my class seems quite unable to explain to me when it is more effective to use vectors, but they look quite happy when advising me to use lists.

This is how I understand it

Lists: Each item contains an address to the next or previous element, so with this feature, you can randomize the items, even if they aren't sorted, the order won't change: it's efficient if you memory is fragmented. But it also has an other very big advantage: you can easily insert/remove items, because the only thing you need to do is change some pointers. Drawback: To read a random single item, you have to jump from one item to another until you find the correct address.

Vectors: When using vectors, the memory is much more organized like regular arrays: each n-th items is stored just after (n-1)th item and before (n+1)th item. Why is it better than list ? Because it allow fast random access. Here is how: if you know the size of an item in a vector, and if they are contiguous in memory, you can easily predict where the n-th item is; you don't have to browse all the item of a list to read the one you want, with vector, you directly read it, with a list you can't. On the other hand, modify the vector array or change a value is much more slow.

Lists are more appropriate to keep track of objects which can be added/removed in memory. Vectors are more appropriate when you want to access an element from a big quantity of single items.

I don't know how lists are optimized, but you have to know that if you want fast read access, you should use vectors, because how good the STL fasten lists, it won't be as fast in read-access than vector.

Glade answered 27/10, 2010 at 15:49 Comment(1)
"modify the vector array or change a value is much more slow" - as read, this seems to contradict what you said just before about vector being predisposed for good performance due to their low-level and contiguous nature. did you mean that reallocating the vector caused by changing its size can be slow? then agreed, but in cases where reserve() can be used, that avoids those problems.Cranky
M
11

Any time you cannot have iterators invalidated.

Magog answered 5/2, 2010 at 17:57 Comment(1)
But never jump to that conclusion about iterators without asking whether persistent references into a deque would suffice.Revile
C
10

Basically a vector is an array with automatic memory management. The data is contiguous in memory. Trying to insert data in the middle is a costly operation.

In a list the data is stored in unrelated memory locations. Inserting in the middle doesn't involve copying some of the data to make room for the new one.

To answer more specifically your question I'll quote this page

vectors are generally the most efficient in time for accessing elements and to add or remove elements from the end of the sequence. For operations that involve inserting or removing elements at positions other than the end, they perform worse than deques and lists, and have less consistent iterators and references than lists.

Corruptible answered 5/2, 2010 at 18:11 Comment(0)
S
9

Summarizing the answers in a table for quick reference:

Vector List
Access Faster Slower
Insert/Delete Operations Slower Faster
Memory Allocation Contiguous Non-contiguous
Size Pre-allocation Need to be reserved Not necessary to reserve
Space Required Per Element Only for the element itself For element and pointers to next
(and optionally previous elements)
Saliferous answered 8/5, 2021 at 11:53 Comment(0)
T
7

When you have a lot of insertion or deletion in the middle of the sequence. e.g. a memory manager.

Tennies answered 5/2, 2010 at 17:56 Comment(4)
so the difference between them is just efficiency, not about functional issue.Follett
They both model a sequence of elements of course. There is a small difference in usage, as mentioned by @Magog but you have to look at the complexity of your "often done" operations to decide which sequence to choose(@Martin answer).Tennies
@Follett - There are a few functional differences. For example, only vector supports random-access (i.e., can be indexed).Berhley
@skydoor: efficiency translates to performance. Poor performance can ruin functionality. Performance is the advantage of C++, after all.Revile
E
5

In the case of a vector and list, the main differences that stick out to me are the following:

vector

  • A vector stores its elements in contiguous memory. Therefore, random access is possible inside a vector which means that accessing an element of a vector is very fast because we can simply multiply the base address with the item index to access that element. In fact, it takes only O(1) or constant time for this purpose.

  • Since a vector basically wraps an array, every time you insert an element into the vector (dynamic array), it has to resize itself by finding a new contiguous block of memory to accommodate the new elements which is time-costly.

  • It does not consume extra memory to store any pointers to other elements within it.

list

  • A list stores its elements in non-contiguous memory. Therefore, random access is not possible inside a list which means that to access its elements we have to use the pointers and traverse the list which is slower relative to vector. This takes O(n) or linear time which is slower than O(1).

  • Since a list uses non-contiguous memory, the time taken to insert an element inside a list is a lot more efficient than in the case of its vector counterpart because reallocation of memory is avoided.

  • It consumes extra memory to store pointers to the element before and after a particular element.

So, keeping these differences in mind, we usually consider memory, frequent random access and insertion to decide the winner of vector vs list in a given scenario.

Emmi answered 30/10, 2019 at 7:36 Comment(0)
C
4

Preserving the validity of iterators is one reason to use a list. Another is when you don't want a vector to reallocate when pushing items. This can be managed by an intelligent use of reserve(), but in some cases it might be easier or more feasible to just use a list.

Carcanet answered 5/2, 2010 at 18:5 Comment(0)
R
4

When you want to move objects between containers, you can use list::splice.

For example, a graph partitioning algorithm may have a constant number of objects recursively divided among an increasing number of containers. The objects should be initialized once and always remain at the same locations in memory. It's much faster to rearrange them by relinking than by reallocating.

Edit: as libraries prepare to implement C++0x, the general case of splicing a subsequence into a list is becoming linear complexity with the length of the sequence. This is because splice (now) needs to iterate over the sequence to count the number of elements in it. (Because the list needs to record its size.) Simply counting and re-linking the list is still faster than any alternative, and splicing an entire list or a single element are special cases with constant complexity. But, if you have long sequences to splice, you might have to dig around for a better, old-fashioned, non-compliant container.

Revile answered 5/2, 2010 at 18:22 Comment(0)
D
3

The only hard rule where list must be used is where you need to distribute pointers to elements of the container.

Unlike with vector, you know that the memory of elements won't be reallocated. If it could be then you might have pointers to unused memory, which is at best a big no-no and at worst a SEGFAULT.

(Technically a vector of *_ptr would also work but in that case you are emulating list so that's just semantics.)

Other soft rules have to do with the possible performance issues of inserting elements into the middle of a container, whereupon list would be preferable.

Dennison answered 6/10, 2016 at 7:1 Comment(0)
S
1

Lists are just a wrapper for a doubly-LinkedList in stl, thus offering feature you might expect from d-linklist namely O(1) insertion and deletion. While vectors are contagious data sequence which works like a dynamic array.P.S- easier to traverse.

Syndicate answered 3/10, 2019 at 12:23 Comment(0)
F
0

List is Doubly Linked List so it is easy to insert and delete an element. We have to just change the few pointers, whereas in vector if we want to insert an element in the middle then each element after it has to shift by one index. Also if the size of the vector is full then it has to first increase its size. So it is an expensive operation. So wherever insertion and deletion operations are required to be performed more often in such a case list should be used.

Farrow answered 1/5, 2020 at 10:56 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.