Instead of focusing on particular generated code, I'd like to outline the more abstract difference between the indexing methods in order to explain the actual complexity involved:
at(): 9158ms
operator[]: 4269ms
iterator: 3914ms
An array is essentially an origin in memory. Indexing an array involves adding the offset of n elements to the origin pointer. We work with logical indices in this context, meaning the compiler scales each index by the size of the element type for us.
So, to get an element of an array, the machine's way to do that is `origin + (index * sizeOfElement)`.
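As a sketch of that arithmetic (the helper name here is illustrative, not from the answer), the scaling can be written out by hand in terms of byte offsets:

```cpp
#include <cstddef>

// Hand-written version of arr[index]: adding `index` to an int*
// advances the pointer by index * sizeof(int) bytes. The compiler
// does exactly this scaling for us when we write arr[index].
int get_scaled(const int* origin, std::size_t index) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(origin);
    return *reinterpret_cast<const int*>(bytes + index * sizeof(int));
}
```

For any `int data[]`, `get_scaled(data, i)` yields the same element as `data[i]`.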
Most dynamic arrays nowadays (std::vector included) are implemented as three pointers: start, finish, and end of storage. So operations like size() or capacity() involve additional work, namely a pointer subtraction followed by a division by the element size, and division can be quite slow on some platforms. This is not as bad when the element size happens to be a power of two, because then the compiler optimizes the potentially expensive mul and div operations into efficient bit shifts.
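A minimal sketch of that three-pointer layout (the struct and field names are illustrative, not any particular standard library's):

```cpp
#include <cstddef>

// Illustrative three-pointer vector layout. Note that size() is not a
// stored field: it is computed as a pointer subtraction, which the
// compiler lowers to a byte difference divided by sizeof(int) -- a
// shift here, since sizeof(int) is a power of two.
struct IntVec {
    int* start;    // beginning of the storage
    int* finish;   // one past the last constructed element
    int* cap_end;  // one past the end of the allocated storage

    std::size_t size() const     { return static_cast<std::size_t>(finish - start); }
    std::size_t capacity() const { return static_cast<std::size_t>(cap_end - start); }
};
```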
In light of that, it becomes understandable why at() is the worst possible option: it performs a significant amount of additional work to verify the bounds on every single access. With the other options you delegate the bounds keeping, wherever practically applicable, to an existing, coarser and thus cheaper constraint, typically the loop condition itself.
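A rough sketch of what at() has to do on every access (real standard library implementations differ in detail, but the shape is the same):

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Per-access work equivalent to v.at(i): the comparison against
// size() (itself a pointer subtraction and a division) plus a branch,
// before the plain indexing even happens.
int checked_get(const std::vector<int>& v, std::size_t i) {
    if (i >= v.size())
        throw std::out_of_range("checked_get");
    return v[i];  // only now the ordinary origin + offset access
}
```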
`[]` is significantly better: we are no longer bothering with the bounds check and its computational overhead; we just compute `origin + (index * sizeOfElement)` blindly, trusting some established constraint to keep the index within bounds.
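For example, a plain indexed loop where the loop condition is that established constraint:

```cpp
#include <cstddef>
#include <vector>

// The condition i < v.size() is the single, coarse bounds constraint;
// each v[i] inside the body is then a blind origin + i * sizeof(int)
// access with no per-element check.
long sum_indexed(const std::vector<int>& v) {
    long sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        sum += v[i];
    return sum;
}
```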
And finally, the iterator is the best-performing option, for the simple reason that we can now avoid the index scaling. We reduce `origin + (index * sizeOfElement)` to `current + sizeOfElement`, cutting the per-element overhead to a single addition, which in 99.9% of cases costs a single CPU cycle, and the loop can be constrained by a single pointer comparison that disregards element sizes and indices. One moving pointer, one fixed end pointer, a one-cycle increment and a one-cycle loop test: it doesn't get any more barebones to run a loop, minimal instruction and data footprint.
The one downside of iterators is that you don't have the auxiliary index you may need for some other reason, and there is a certain cost to deriving it from the iterator. It still pays to have the iterator be your default loop: an additional index can easily be added and updated contextually, so you only pay for it when you actually need it, and it costs the same as `[]` indexing, only optionally rather than by default.
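A sketch of carrying that optional index alongside the iterator (the function name is illustrative):

```cpp
#include <cstddef>
#include <vector>

// Iterator loop that also maintains an index, only because the body
// needs one. The ++i costs the same as the [] loop's bookkeeping, but
// here it is opt-in rather than paid by default.
std::size_t find_index(const std::vector<int>& v, int target) {
    std::size_t i = 0;
    for (std::vector<int>::const_iterator it = v.begin(), end = v.end();
         it != end; ++it, ++i) {
        if (*it == target) return i;  // the auxiliary index is at hand
    }
    return v.size();  // not found
}
```

Alternatively, the index can be derived on demand as `it - v.begin()`, which for a contiguous container is a pointer subtraction plus the division by the element size mentioned above.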
In practice, the compiler may understand your intent to a varying degree and apply a considerable amount of optimization. For example, it may be able to detect that your loop is constrained by the actual size of the array and that the array length remains constant, so it may omit the bounds checking altogether. If the loop count is a compile-time constant, the loop may get unrolled or even auto-vectorized, depending on the data types and operations, and so on...
for (vector<int>::iterator it = v.begin(), end = v.end(); it != end; ++it) { ... }
– Apeman