Is list better than vector when we need to store "the last n items"?
Asked Answered
C

10

44

There are a lot of questions which suggest that one should always use a vector, but it seems to me that a list would be better for the scenario, where we need to store "the last n items"

For example, say we need to store the last 5 items seen: Iteration 0:

3,24,51,62,37,

Then at each iteration, the item at index 0 is removed, and the new item is added at the end:

Iteration 1:

24,51,62,37,8

Iteration 2:

51,62,37,8,12

It seems that for this use case, for a vector the complexity will be O(n), since we would have to copy n items, but in a list, it should be O(1), since we are always just chopping off the head, and adding to the tail each iteration.

Is my understanding correct? Is this the actual behaviour of an std::list ?

Christachristabel answered 18/5, 2017 at 5:1 Comment(2)
Check this out... it explains when to choose which one: https://mcmap.net/q/99207/-vector-vs-list-in-stlStratosphere
This talk by Bjarne Stroustrup is useful: youtube.com/watch?v=YQs6IC-vgmoBaylor
H
95

Neither. Your collection has a fixed size and std::array is sufficient.

The data structure you implement is called a ring buffer. To implement it you create an array and keep track of the offset of the current first element.

When you add an element that would push an item out of the buffer - i.e. when you remove the first element - you increment the offset.

To fetch elements in the buffer you add the index and the offset and take the modulo of this and the length of the buffer.

Harrelson answered 18/5, 2017 at 8:0 Comment(10)
+1 A circular buffer is a completely ideal implementation for a queue that has fixed maximum size (unfortunately there is no standard library container for this).Amr
Although I can implement one myself, I think this answer would be more useful to others if there was a simple implementation included.Christachristabel
@KaizerSozay you can always suggest an edit with your implementation if you wantJason
@NieDzejkob yeah I haven't done it yet, as I am testing using a deque, so it will be a while before I get back to this, but I Imagine this is the correct answer. Although you have to implement it yourself, I imagine performance wise you can't go wrong here.Christachristabel
@KaizerSozay: Use boost::circular_buffer.Airs
Btw, in typical implementations, does the index ever get reset to 0? And if not, doesn't that risk overflow in long running programs?Protohistory
The index is reset in the last line, but it could maybe use more emphasis as it's important: "add the index and the offset and take the modulo of this and the length of the buffer".Outgeneral
@DmitryNarkevich As user3067860 already mentioned, the index will be reset. That is in fact the whole point of a circular buffer. The memory is static, so you move your view of the memory to make it look like it was an actual circle.Aminaamine
@DmitryNarkevich The index is passed into the class when you do a lookup - so there is no context in which it makes sense to reset it. The offset should be reset - but because you do a modulo whenever you make a lookup failure to reset shoul not cause a buffer overflow - if you don't reset you could hit undefined behavior when the datatype overflows.Harrelson
@Outgeneral Oh I see, I understood that part as the modulo only being taken while doing lookups and not being stored in the index again. ThanksProtohistory
B
33

std::deque is a far better option. Or if you had benchmarked std::deque and found its performance to be inadequate for your specific use, you could implement a circular buffer in a fixed size array, storing the index of the start of the buffer. When replacing an element in the buffer, you would overwrite the element at the start index, and then set the start index to its previous value plus one modulo the size of the buffer.

List traversal is very slow, as list elements can be scattered throughout memory, and vector shifting is actually surprisingly fast, as memory moves on a single block of memory are quite fast even if it is a large block.

The talk Taming The Performance Beast from the Meeting C++ 2015 conference might be of interest to you.

Byng answered 18/5, 2017 at 5:8 Comment(23)
Why is list traversal slow ? If each node has a pointer to the next node, shouldn't it be fast regardless of where in memory it is ?Christachristabel
Locality of reference. Arrays (and therefore vectors) have it; linked lists don't. When you access an element in an array, adjacent elements get pulled into the CPU cache along with it, so they can be accessed more quickly. Linked lists are more likely to cause cache misses.Teratology
@Teratology then do deques have this ?Christachristabel
@KaizerSozay : No, std::deque does not store it's contents in a contiguous block like std::vector. Frankly, I can't remember the last time I used anything other than std::vector for simple collections.Boyse
Yes, std::deque uses arrays — though more than one. Quoting cppreference: "As opposed to std::vector, the elements of a deque are not stored contiguously: typical implementations use a sequence of individually allocated fixed-size arrays". That's still a lot better than having every individual item allocated separately in a different place.Teratology
I am confused then. For keeping track of the last n items (and iterating through them), would I be better off with a vector, list or deque. It seems one has to weigh the costs of vector shifting, vs list traversal vs deque traversal which seems to be implementation specific, which further complicates this because I am developing cross platform...Christachristabel
@KaizerSozay You would do well to make some performance tests with the different structures. Personally I think the vector would win for something like this.Baylor
@Baylor Yes, I should performance test with different structures, but it seems like a pain, because I would have to run each of the tests on different devices too (cross platform). I would have thought that having a standard would help avoid this problem..Christachristabel
vector spends time shifting elements up as you remove them. list is cache unfriendly, resulting in more accesses form slower memory. deque uses multiple non-contiguous blocks of contiguous memory, so most of the time it has good cache usage and does no shifting, but it will miss more often and it can really hurt when it allocates a new block. Over all, vector should win on a modern processor unless N is large resulting in a grievous amount of shifting.Scraggy
@KaizerSozay Inmy experience its hard to beat a vector. A deque can give the best of both worlds between a vector and a list but they cal also give the worst of both worlds. So it's highly dependent on your use pattern. But like I said, my money in on the pure vector.Baylor
@Scraggy In this case we are inserting elements into the middle of the deque so not only is it going to be shifting but it is going to be shifting over block boundaries. So I suspect, for this use, the deque may prove to be less efficient than a vector.Baylor
Holy carp. I missed the middle insertion. Yeah. deque fails and so does circular buffer. Vector it is.Scraggy
Wait, no. @Baylor There's no middle insert in OP's question. The way I'm reading it it's always adding to the end.Scraggy
@Scraggy well there is an insertion at position 4. I assume that's non literal and the actual position could be much larger. But if it is literal then a vector is still better because the number of data is tinyBaylor
@Scraggy Oh I see, position 4 is the other end!!Baylor
@Scraggy Then its even stevens which would be more efficient depending on the actual size of the structure. As you get larger the deque would seem more attractiveBaylor
five elements, vector wins hands down. 5000... Not so sure. A circular buffer should always win.Scraggy
I think there is some confusion with my wording - there is no middle insertion. What I meant is, that I am always looking at only n elements. So I always delete the first element, when i add my n+1th element to make sure I am always looking at only the last n elements.Christachristabel
It's very difficult to make any solid claims, without benchmarking on the systems you will be running on using the volume of data you would actually be using. That said, a circular buffer will almost certainly be fastest, but requires you to implement it yourself. List will certainly be slowest. If you don't want to implement a circular buffer and don't want to benchmark the different options, then I would take a guess and use a vector if your buffer size is less than a few kilobytes, or a deque if larger than that.Byng
@DavidScarlett: No need to implement a circular buffer yourself, Boost has it.Wastebasket
True, but not everyone uses Boost.Byng
@Wastebasket Yeah I prefer just sticking to the STL.Christachristabel
A std::deque is implemented (at least when using the clang or gcc toolchains) as a circular buffer backed by a vector - so using a deque is the best of all worlds in this case - cache friendly and O(1).Ramses
A
25

If you can use Boost, try boost::circular_buffer:

Boost Circular Buffer

It's a kind of sequence similar to std::list or std::deque. It supports random access iterators, constant time insert and erase operations at the beginning or the end of the buffer and interoperability with std algorithms.

It provides fixed capacity storage: when the buffer is filled, new data is written starting at the beginning of the buffer and overwriting the old

// Create a circular buffer with a capacity for 5 integers.
boost::circular_buffer<int> cb(5);

// Insert elements into the buffer.
cb.push_back(3);
cb.push_back(24);
cb.push_back(51);
cb.push_back(62);
cb.push_back(37);

int a = cb[0];  // a == 3
int b = cb[1];  // b == 24
int c = cb[2];  // c == 51

// The buffer is full now, so pushing subsequent
// elements will overwrite the front-most elements.
cb.push_back(8);   // overwrite 3 with 8
cb.push_back(12);  // overwrite 24 with 12

// The buffer now contains 51, 62, 37, 8, 12.

// Elements can be popped from either the front or the back.
cb.pop_back();  // 12 is removed
cb.pop_front(); // 51 is removed

The circular_buffer stores its elements in a contiguous region of memory, which then enables fast constant-time insertion, removal and random access of elements.


PS ... or implement the circular buffer directly as suggested by Taemyr.

Overload Journal #50 - Aug 2002 has a nice introduction (by Pete Goodliffe) to writing robust STL-like circular buffer.

Amr answered 18/5, 2017 at 8:37 Comment(1)
which then enables fast constant-time insertion, removal of elements Only at either end, otherwise it's linear.Adenoid
T
5

The problem is that O(n) only talks about the asymptotic behaviour as n tends to infinity. If n is small then the constant factors involved become significant. The result is that for "last 5 integer items" I would be stunned if vector didn't beat list. I would even expect std::vector to beat std::deque.

For "last 500 integer items" I would still expect std::vector to be faster than std::list - but std::deque would now probably win. For "last 5 million slow-to-copy items", std:vector would be slowest of all.

A ring buffer based on std::array or std::vector would probably be faster still though.

As (almost) always with performance issues:

  • encapsulate with a fixed interface
  • write the simplest code that can implement that interface
  • if profiling shows you have a problem, optimize (which will make the code more complicated).

In practise, just using a std::deque, or a pre-built ring-buffer if you have one, will be good enough. (But it's not worth going to the trouble of writing a ring buffer unless profiling says you need to.)

Tuna answered 19/5, 2017 at 9:7 Comment(0)
B
3

If you need to store last N-elements then logically you are doing some kind of queue or a circular buffer, std::stack and std::deque are implementations of LIFO and FIFO queues.

You can use boost::circular_buffer or implement simple circular buffer manually:

template<int Capcity>
class cbuffer
{
public:
    cbuffer() : sz(0), p(0){}
    void push_back(int n)
    {
        buf[p++] = n;
        if (sz < Capcity)
            sz++;
        if (p >= Capcity)
            p = 0;
    }
    int size() const
    {
        return sz;
    }
    int operator[](int n) const
    {
        assert(n < sz);
        n = p - sz + n;
        if (n < 0)
            n += Capcity;
        return buf[n];
    }
    int buf[Capcity];
    int sz, p;
};

Sample use for circular buffer of 5 int elements:

int main()
{
    cbuffer<5> buf;

    // insert random 100 numbers
    for (int i = 0; i < 100; ++i)
        buf.push_back(rand());

    // output to cout contents of the circular buffer
    for (int i = 0; i < buf.size(); ++i)
        cout << buf[i] << ' ';
}

As a note, keep in mind that when you have only 5 elements the best solution is the one that's fast to implement and works correctly.

Bagel answered 18/5, 2017 at 5:59 Comment(0)
C
3

Here is a minimal circular buffer. I'm primarily posting that here to get a metric ton of comments and ideas of improvement.

Minimal Implementation

#include <iterator>

template<typename Container>
class CircularBuffer
{
public:
    using iterator   = typename Container::iterator;
    using value_type = typename Container::value_type;
private:
    Container _container;
    iterator  _pos;
public:
    CircularBuffer() : _pos(std::begin(_container)) {}
public:
    value_type& operator*() const { return *_pos; }
    CircularBuffer& operator++() { ++_pos ; if (_pos == std::end(_container)) _pos = std::begin(_container); return *this; }
    CircularBuffer& operator--() { if (_pos == std::begin(_container)) _pos = std::end(_container); --_pos; return *this; }
};

Usage

#include <iostream>
#include <array>

int main()
{
    CircularBuffer<std::array<int,5>> buf;

    *buf = 1; ++buf;
    *buf = 2; ++buf;
    *buf = 3; ++buf;
    *buf = 4; ++buf;
    *buf = 5; ++buf;
    std::cout << *buf << " "; ++buf;
    std::cout << *buf << " "; ++buf;
    std::cout << *buf << " "; ++buf;
    std::cout << *buf << " "; ++buf;
    std::cout << *buf << " "; ++buf;
    std::cout << *buf << " "; ++buf;
    std::cout << *buf << " "; ++buf;
    std::cout << *buf << " "; --buf;
    std::cout << *buf << " "; --buf;
    std::cout << *buf << " "; --buf;
    std::cout << *buf << " "; --buf;
    std::cout << *buf << " "; --buf;
    std::cout << *buf << " "; --buf;

    std::cout << std::endl;
}

Compile with

g++ -std=c++17 -O2 -Wall -Wextra -pedantic -Werror

Demo

On Coliru: try it online

Cachet answered 18/5, 2017 at 13:17 Comment(1)
If you're really wanting those comments and ideas, you might want to ask for critique over at Code Review. Be sure to read A guide to Code Review for Stack Overflow users first, as some things are done differently over there!Annamariaannamarie
H
2

Yes. Time complexity of the std::vector for removing elements from the end is linear. std::deque might be a good choice for what you are doing as it offers constant time insertion and removal at the beginning as well as at the end of the list and also better performance than std::list

Source:

http://www.sgi.com/tech/stl/Vector.html

http://www.sgi.com/tech/stl/Deque.html

Hydrogenate answered 18/5, 2017 at 5:32 Comment(6)
Since the vector size is fixed at 5, any operation is O(5) = constant time.Wastebasket
Yes. But the OP just used n = 5 as an example, that need not mean he/she will always use 5 elements.Hydrogenate
Doesn't matter, O(6) is also constant time. The point is that the number of items is bounded.Wastebasket
@MSalters: that's not how big-O notation works. Your line of thought reduces to "Since no integer N is infinitely large, therefore O(N) = O(1)." Codelyzer's point is that removing from the front of a vector (and moving the other N-1 elements forward) is O(N) in the length of the vector — which is absolutely 100% true.Text
You might counter: "Ah, but my vector has only bounded length — five!" Sure. But someone else might have a vector of length six; or length 100; or length 10000; and the amount of time it takes to pop_front on those vectors goes up with the length of the vector — as the vector gets longer, the number of operations involved in pop_front goes up linearly. This is what we mean by "O(N)".Text
@Quuxplusone: I know how big-O notation works (and Big-Omega etc). An O(N) algorithm has a running time less than c*N, as N goes to infinity. Mathematically, this is known as a "limit". The fundamental point remains that bounded numbers do not go to infinity.Wastebasket
S
2

Here are the beginnings of a ring buffer based dequeue template class that I wrote a while ago, mostly to experiment with using std::allocator (so it does not require T to be default constructible). Note it currently doesn't have iterators, or insert/remove, copy/move constructors, etc.

#ifndef RING_DEQUEUE_H
#define RING_DEQUEUE_H

#include <memory>
#include <type_traits>
#include <limits>

template <typename T, size_t N>
class ring_dequeue {
private:
    static_assert(N <= std::numeric_limits<size_t>::max() / 2 &&
                  N <= std::numeric_limits<size_t>::max() / sizeof(T),
                  "size of ring_dequeue is too large");

    using alloc_traits = std::allocator_traits<std::allocator<T>>;

public:
    using value_type = T;
    using reference = T&;
    using const_reference = const T&;
    using difference_type = ssize_t;
    using size_type = size_t;

    ring_dequeue() = default;

    // Disable copy and move constructors for now - if iterators are
    // implemented later, then those could be delegated to the InputIterator
    // constructor below (using the std::move_iterator adaptor for the move
    // constructor case).
    ring_dequeue(const ring_dequeue&) = delete;
    ring_dequeue(ring_dequeue&&) = delete;
    ring_dequeue& operator=(const ring_dequeue&) = delete;
    ring_dequeue& operator=(ring_dequeue&&) = delete;

    template <typename InputIterator>
    ring_dequeue(InputIterator begin, InputIterator end) {
        while (m_tailIndex < N && begin != end) {
            alloc_traits::construct(m_alloc, reinterpret_cast<T*>(m_buf) + m_tailIndex,
                                    *begin);
            ++m_tailIndex;
            ++begin;
        }
        if (begin != end)
            throw std::logic_error("Input range too long");
    }

    ring_dequeue(std::initializer_list<T> il) :
        ring_dequeue(il.begin(), il.end()) { }

    ~ring_dequeue() noexcept(std::is_nothrow_destructible<T>::value) {
        while (m_headIndex < m_tailIndex) {
            alloc_traits::destroy(m_alloc, elemPtr(m_headIndex));
            m_headIndex++;
        }
    }

    size_t size() const {
        return m_tailIndex - m_headIndex;
    }
    size_t max_size() const {
        return N;
    }

    bool empty() const {
        return m_headIndex == m_tailIndex;
    }
    bool full() const {
        return m_headIndex + N == m_tailIndex;
    }

    template <typename... Args>
    void emplace_front(Args&&... args) {
        if (full())
            throw std::logic_error("ring_dequeue full");
        bool wasAtZero = (m_headIndex == 0);
        auto newHeadIndex = wasAtZero ? (N - 1) : (m_headIndex - 1);
        alloc_traits::construct(m_alloc, elemPtr(newHeadIndex),
                                std::forward<Args>(args)...);
        m_headIndex = newHeadIndex;
        if (wasAtZero)
            m_tailIndex += N;
    }
    void push_front(const T& x) {
        emplace_front(x);
    }
    void push_front(T&& x) {
        emplace_front(std::move(x));
    }

    template <typename... Args>
    void emplace_back(Args&&... args) {
        if (full())
            throw std::logic_error("ring_dequeue full");
        alloc_traits::construct(m_alloc, elemPtr(m_tailIndex),
                                std::forward<Args>(args)...);
        ++m_tailIndex;
    }
    void push_back(const T& x) {
        emplace_back(x);
    }
    void push_back(T&& x) {
        emplace_back(std::move(x));
    }

    T& front() {
        if (empty())
            throw std::logic_error("ring_dequeue empty");
        return *elemPtr(m_headIndex);
    }
    const T& front() const {
        if (empty())
            throw std::logic_error("ring_dequeue empty");
        return *elemPtr(m_headIndex);
    }
    void remove_front() {
        if (empty())
            throw std::logic_error("ring_dequeue empty");
        alloc_traits::destroy(m_alloc, elemPtr(m_headIndex));
        ++m_headIndex;
        if (m_headIndex == N) {
            m_headIndex = 0;
            m_tailIndex -= N;
        }
    }
    T pop_front() {
        T result = std::move(front());
        remove_front();
        return result;
    }

    T& back() {
        if (empty())
            throw std::logic_error("ring_dequeue empty");
        return *elemPtr(m_tailIndex - 1);
    }
    const T& back() const {
        if (empty())
            throw std::logic_error("ring_dequeue empty");
        return *elemPtr(m_tailIndex - 1);
    }
    void remove_back() {
        if (empty())
            throw std::logic_error("ring_dequeue empty");
        alloc_traits::destroy(m_alloc, elemPtr(m_tailIndex - 1));
        --m_tailIndex;
    }
    T pop_back() {
        T result = std::move(back());
        remove_back();
        return result;
    }

private:
    alignas(T) char m_buf[N * sizeof(T)];
    size_t m_headIndex = 0;
    size_t m_tailIndex = 0;
    std::allocator<T> m_alloc;

    const T* elemPtr(size_t index) const {
        if (index >= N)
            index -= N;
        return reinterpret_cast<const T*>(m_buf) + index;
    }
    T* elemPtr(size_t index) {
        if (index >= N)
            index -= N;
        return reinterpret_cast<T*>(m_buf) + index;
    }
};

#endif
Soothe answered 18/5, 2017 at 16:31 Comment(1)
Anybody, feel free to incorporate this into your answers - this was mostly posted as a response to Taemyr's answer and @KaizerSozay's comment on it suggesting to add a sample implementation.Soothe
E
1

Briefly say the std::vector is better for a non-change size of memory.In your case,if you move all data forward or append new data in a vector,that must be a waste.As @David said the std::deque is a good option,since you would pop_head and push_back eg. two way list.

from the cplus cplus reference about the list

Compared to other base standard sequence containers (array, vector and deque), lists perform generally better in inserting, extracting and moving elements in any position within the container for which an iterator has already been obtained, and therefore also in algorithms that make intensive use of these, like sorting algorithms.

The main drawback of lists and forward_lists compared to these other sequence containers is that they lack direct access to the elements by their position; For example, to access the sixth element in a list, one has to iterate from a known position (like the beginning or the end) to that position, which takes linear time in the distance between these. They also consume some extra memory to keep the linking information associated to each element (which may be an important factor for large lists of small-sized elements).

about deque

For operations that involve frequent insertion or removals of elements at positions other than the beginning or the end, deques perform worse and have less consistent iterators and references than lists and forward lists.

vetor

Therefore, compared to arrays, vectors consume more memory in exchange for the ability to manage storage and grow dynamically in an efficient way.

Compared to the other dynamic sequence containers (deques, lists and forward_lists), vectors are very efficient accessing its elements (just like arrays) and relatively efficient adding or removing elements from its end. For operations that involve inserting or removing elements at positions other than the end, they perform worse than the others, and have less consistent iterators and references than lists and forward_lists.

Egwan answered 18/5, 2017 at 5:19 Comment(4)
Actually std::list is a double-linked list, so it is equally good from the front and back.Behrens
@ZanLynx Why have a deque then ?Christachristabel
@KaizerSozay: The std::deque holds larger blocks and makes them look like a both a vector and a list. It is like a list of vectors. It's performance is closer to vector than a list.Behrens
@KaizerSozay: deque has random access, list does not.Wastebasket
R
-2

I think even use std::deque it also have overhead of copy items in certain condition because std::deque is a map of arrays essentially, so std::list is a good idea to eliminate the copy overhead.

To increase the performance of traverse for std::list, you can implement a memory pool so that the std::list will allocate memory from a trunk and it's spatial locality for caching.

Ropedancer answered 18/5, 2017 at 7:22 Comment(1)
Map? It's a list if anything. What would it map, i.e. what would be the key?Wray

© 2022 - 2024 — McMap. All rights reserved.