Fast intersection of sets: C++ vs C#

Asked 29/6, 2009 at 21:23 Answered 26/8, 2010 at 3:38

On my machine (Quad core, 8gb ram), running Vista x64 Business, with Visual Studio 2008 SP1, I am trying to intersect two sets of numbers very quickly.

I've implemented two approaches in C++, and one in C#. The C# approach is faster so far, I'd like to improve the C++ approach so its faster than C#, which I expect C++ can do.

Here is the C# output: (Release build)

Found the intersection 1000 times, in 4741.407 ms

Here is the initial C++ output, for two different approaches (Release x64 build):

Found the intersection (using unordered_map) 1000 times, in 21580.7ms
Found the intersection (using set_intersection) 1000 times, in 22366.6ms

Here is the latest C++ output, for three approaches (Release x64 build):

Latest benchmark:

Found the intersection of 504 values (using unordered_map) 1000 times, in 28827.6ms
Found the intersection of 495 values (using set_intersection) 1000 times, in 9817.69ms
Found the intersection of 504 values (using unordered_set) 1000 times, in 24769.1ms

So, the set_intersection approach is now approx 2x slower than C#, but 2x faster than the initial C++ approaches.

Latest C++ code:

Code:

// MapPerformance.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <hash_map>
#include <vector>
#include <iostream>
#include <time.h>
#include <algorithm>
#include <set>
#include <unordered_set>

#include <boost\unordered\unordered_map.hpp>

#include "timer.h"

using namespace std;
using namespace stdext;
using namespace boost;
using namespace tr1;


int runIntersectionTest2(const vector<int>& set1, const vector<int>& set2)
{
    // hash_map<int,int> theMap;
    // map<int,int> theMap;
    unordered_set<int> theSet;      

     theSet.insert( set1.begin(), set1.end() );

    int intersectionSize = 0;

    vector<int>::const_iterator set2_end = set2.end();

    for ( vector<int>::const_iterator iterator = set2.begin(); iterator != set2_end; ++iterator )
    {
        if ( theSet.find(*iterator) != theSet.end() )
        {
                intersectionSize++;
        }
    }

    return intersectionSize;
}

int runIntersectionTest(const vector<int>& set1, const vector<int>& set2)
{
    // hash_map<int,int> theMap;
    // map<int,int> theMap;
    unordered_map<int,int> theMap;  

    vector<int>::const_iterator set1_end = set1.end();

    // Now intersect the two sets by populating the map
    for ( vector<int>::const_iterator iterator = set1.begin(); iterator != set1_end; ++iterator )
    {
        int value = *iterator;

        theMap[value] = 1;
    }

    int intersectionSize = 0;

    vector<int>::const_iterator set2_end = set2.end();

    for ( vector<int>::const_iterator iterator = set2.begin(); iterator != set2_end; ++iterator )
    {
        int value = *iterator;

        unordered_map<int,int>::iterator foundValue = theMap.find(value);

        if ( foundValue != theMap.end() )
        {
            theMap[value] = 2;

            intersectionSize++;
        }
    }

    return intersectionSize;

}

int runSetIntersection(const vector<int>& set1_unsorted, const vector<int>& set2_unsorted)
{   
    // Create two vectors
    std::vector<int> set1(set1_unsorted.size());
    std::vector<int> set2(set2_unsorted.size());

    // Copy the unsorted data into them
    std::copy(set1_unsorted.begin(), set1_unsorted.end(), set1.begin());
    std::copy(set2_unsorted.begin(), set2_unsorted.end(), set2.begin());

    // Sort the data
    sort(set1.begin(),set1.end());
    sort(set2.begin(),set2.end());

    vector<int> intersection;
    intersection.reserve(1000);

    set_intersection(set1.begin(),set1.end(), set2.begin(), set2.end(), back_inserter(intersection));

    return intersection.size(); 
}

void createSets( vector<int>& set1, vector<int>& set2 )
{
    srand ( time(NULL) );

    set1.reserve(100000);
    set2.reserve(1000);

    // Create 100,000 values for set1
    for ( int i = 0; i < 100000; i++ )
    {
        int value = 1000000000 + i;
        set1.push_back(value);
    }

    // Try to get half of our values intersecting
    float ratio = 200000.0f / RAND_MAX;


    // Create 1,000 values for set2
    for ( int i = 0; i < 1000; i++ )
    {
        int random = rand() * ratio + 1;

        int value = 1000000000 + random;
        set2.push_back(value);
    }

    // Make sure set1 is in random order (not sorted)
    random_shuffle(set1.begin(),set1.end());
}

int _tmain(int argc, _TCHAR* argv[])
{
    int intersectionSize = 0;

    vector<int> set1, set2; 
    createSets( set1, set2 );

    Timer timer;
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runIntersectionTest(set1, set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using unordered_map) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    timer.Reset();
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runSetIntersection(set1,set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using set_intersection) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    timer.Reset();
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runIntersectionTest2(set1,set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using unordered_set) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    getchar();

    return 0;
}

C# code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace DictionaryPerformance
{
    class Program
    {
        static void Main(string[] args)
        {
            List<int> set1 = new List<int>(100000);
            List<int> set2 = new List<int>(1000);

            // Create 100,000 values for set1
            for (int i = 0; i < 100000; i++)
            {
                int value = 1000000000 + i;
                set1.Add(value);
            }

            Random random = new Random(DateTime.Now.Millisecond);

            // Create 1,000 values for set2
            for (int i = 0; i < 1000; i++)
            {
                int value = 1000000000 + (random.Next() % 200000 + 1);
                set2.Add(value);
            }

            long start = System.Diagnostics.Stopwatch.GetTimestamp();
            for (int i = 0; i < 1000; i++)
            {
                runIntersectionTest(set1,set2);
            }
            long duration = System.Diagnostics.Stopwatch.GetTimestamp() - start;

            Console.WriteLine(String.Format("Found the intersection 1000 times, in {0} ms", ((float) duration * 1000.0f) / System.Diagnostics.Stopwatch.Frequency));

            Console.ReadKey();
        }

        static int runIntersectionTest(List<int> set1, List<int> set2)
        {

            Dictionary<int,int> theMap = new Dictionary<int,int>(100000);

            // Now intersect the two sets by populating the map
            foreach( int value in set1 )
            {
                theMap[value] = 1;
            }

            int intersectionSize = 0;

            foreach ( int value in set2 )
            {
                int count;
                if ( theMap.TryGetValue(value, out count ) )
                {
                    theMap[value] = 2;
                    intersectionSize++;
                }
            }

            return intersectionSize;
        }
    }
}

C++ code:

// MapPerformance.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <hash_map>
#include <vector>
#include <iostream>
#include <time.h>
#include <algorithm>
#include <set>

#include <boost\unordered\unordered_map.hpp>

#include "timer.h"

using namespace std;
using namespace stdext;
using namespace boost;

int runIntersectionTest(vector<int> set1, vector<int> set2)
{
    // hash_map<int,int> theMap;
    // map<int,int> theMap;
    unordered_map<int,int> theMap;

    // Now intersect the two sets by populating the map
    for ( vector<int>::iterator iterator = set1.begin(); iterator != set1.end(); iterator++ )
    {
        int value = *iterator;

        theMap[value] = 1;
    }

    int intersectionSize = 0;

    for ( vector<int>::iterator iterator = set2.begin(); iterator != set2.end(); iterator++ )
    {
        int value = *iterator;

        unordered_map<int,int>::iterator foundValue = theMap.find(value);

        if ( foundValue != theMap.end() )
        {
            theMap[value] = 2;

            intersectionSize++;
        }
    }

    return intersectionSize;

}

int runSetIntersection(set<int> set1, set<int> set2)
{   
    set<int> intersection;

    set_intersection(set1.begin(),set1.end(), set2.begin(), set2.end(), inserter(intersection, intersection.end()));

    return intersection.size(); 
}



int _tmain(int argc, _TCHAR* argv[])
{
    srand ( time(NULL) );

    vector<int> set1;
    vector<int> set2;

    set1.reserve(10000);
    set2.reserve(1000);

    // Create 100,000 values for set1
    for ( int i = 0; i < 100000; i++ )
    {
        int value = 1000000000 + i;
        set1.push_back(value);
    }

    // Create 1,000 values for set2
    for ( int i = 0; i < 1000; i++ )
    {
        int random = rand() % 200000 + 1;
        random *= 10;

        int value = 1000000000 + random;
        set2.push_back(value);
    }


    Timer timer;
    for ( int i = 0; i < 1000; i++ )
    {
        runIntersectionTest(set1, set2);
    }
    timer.Stop();

    cout << "Found the intersection (using unordered_map) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    set<int> set21;
    set<int> set22;

    // Create 100,000 values for set1
    for ( int i = 0; i < 100000; i++ )
    {
        int value = 1000000000 + i;
        set21.insert(value);
    }

    // Create 1,000 values for set2
    for ( int i = 0; i < 1000; i++ )
    {
        int random = rand() % 200000 + 1;
        random *= 10;

        int value = 1000000000 + random;
        set22.insert(value);
    }

    timer.Reset();
    for ( int i = 0; i < 1000; i++ )
    {
        runSetIntersection(set21,set22);
    }
    timer.Stop();

    cout << "Found the intersection (using set_intersection) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    getchar();

    return 0;
}

Ok, here is the latest, with some changes:

The C++ sets are now properly setup so they have a 50% intersection (like the C#)
Set1 is shuffled so its not sorted, set2 was already not sorted
The set_intersection implementation now uses vectors, and sorts them first

C++ (Release, x64) Results:

Found the intersection of 503 values (using unordered_map) 1000 times, in 35131.1ms
Found the intersection of 494 values (using set_intersection) 1000 times, in 10317ms

So its 2x slower than C#. @Jalf: You're getting some pretty fast numbers, is there something I'm doing wrong here?

C++ Code:

// MapPerformance.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <hash_map>
#include <vector>
#include <iostream>
#include <time.h>
#include <algorithm>
#include <set>

#include <boost\unordered\unordered_map.hpp>

#include "timer.h"

using namespace std;
using namespace stdext;
using namespace boost;

int runIntersectionTest(const vector<int>& set1, const vector<int>& set2)
{
    // hash_map<int,int> theMap;
    // map<int,int> theMap;
    unordered_map<int,int> theMap;  

    vector<int>::const_iterator set1_end = set1.end();

    // Now intersect the two sets by populating the map
    for ( vector<int>::const_iterator iterator = set1.begin(); iterator != set1_end; ++iterator )
    {
        int value = *iterator;

        theMap[value] = 1;
    }

    int intersectionSize = 0;

    vector<int>::const_iterator set2_end = set2.end();

    for ( vector<int>::const_iterator iterator = set2.begin(); iterator != set2_end; ++iterator )
    {
        int value = *iterator;

        unordered_map<int,int>::iterator foundValue = theMap.find(value);

        if ( foundValue != theMap.end() )
        {
            theMap[value] = 2;

            intersectionSize++;
        }
    }

    return intersectionSize;

}

int runSetIntersection(const vector<int> set1_unsorted, const vector<int> set2_unsorted)
{   
    // Create two vectors
    std::vector<int> set1(set1_unsorted.size());
    std::vector<int> set2(set2_unsorted.size());

    // Copy the unsorted data into them
    std::copy(set1_unsorted.begin(), set1_unsorted.end(), set1.begin());
    std::copy(set2_unsorted.begin(), set2_unsorted.end(), set2.begin());

    // Sort the data
    sort(set1.begin(),set1.end());
    sort(set2.begin(),set2.end());

    vector<int> intersection;
    intersection.reserve(1000);

    set_intersection(set1.begin(),set1.end(), set2.begin(), set2.end(), inserter(intersection, intersection.end()));

    return intersection.size(); 
}

void createSets( vector<int>& set1, vector<int>& set2 )
{
    srand ( time(NULL) );

    set1.reserve(100000);
    set2.reserve(1000);

    // Create 100,000 values for set1
    for ( int i = 0; i < 100000; i++ )
    {
        int value = 1000000000 + i;
        set1.push_back(value);
    }

    // Try to get half of our values intersecting
    float ratio = 200000.0f / RAND_MAX;


    // Create 1,000 values for set2
    for ( int i = 0; i < 1000; i++ )
    {
        int random = rand() * ratio + 1;

        int value = 1000000000 + random;
        set2.push_back(value);
    }

    // Make sure set1 is in random order (not sorted)
    random_shuffle(set1.begin(),set1.end());
}

int _tmain(int argc, _TCHAR* argv[])
{
    int intersectionSize = 0;

    vector<int> set1, set2; 
    createSets( set1, set2 );

    Timer timer;
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runIntersectionTest(set1, set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using unordered_map) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    timer.Reset();
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runSetIntersection(set1,set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using set_intersection) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    getchar();

    return 0;
}

Lutherlutheran answered 29/6, 2009 at 21:23 Comment(12)

I doubt C# could do a task such as this significantly faster. You should be looking at very similar performances. – Locular 29/6, 2009 at 21:26

Your timing in both cases is flawed. If you're interested in the time it takes to perform theintersection, you should time that, not the time it takes to construct and populate the lists/vectors. – Contour 29/6, 2009 at 21:49

I would time the C# test with System.Diagnostics.Stopwatch, it's far more accurate than the DateTime method which can be off by several milliseconds. – Donielle 29/6, 2009 at 21:52

For a start I'd look at profiling the code to determine which part(s) are taking the bulk of the time. I'm not a Win32 programmer but I imagine there must be something similar to OProfile / gprof on Linux. – Unconstitutional 29/6, 2009 at 21:53

@JulianR: Agreed. Along the same lines; I'd consider increasing the test size, or run the test 1000 times - it's much easier to have confidence in timings when you can actually see that they take >1s – Unconstitutional 29/6, 2009 at 21:54

JulianR: I updated it to use System.Diagnostics.Stopwatch – Lutherlutheran 29/6, 2009 at 22:5

Dave Rigby: I updated it to run the test 1000 times. – Lutherlutheran 29/6, 2009 at 22:11

Jalf: I updated it to not include the population of the lists/vectors. – Lutherlutheran 29/6, 2009 at 22:24

The Timer class, what timer API does it use? – Ulmaceous 29/6, 2009 at 22:37

Timer uses QueryPerformanceFrequency and QueryPerformanceCounter. – Lutherlutheran 29/6, 2009 at 22:37

@ Dave Rigby - The Stopwatch class is actually much easier to implement than in your example. It's basically Stopwatch sw = Stopwatch.StartNew(); sw.Stop(); Console.WriteLine(sw.Elapsed); – Donielle 29/6, 2009 at 22:41

Try running the code I posted. I'm curious what timings you get on it. – Contour 29/6, 2009 at 23:9

There are several problems with your test.

First, you are not testing set intersection, but "create a couple of arrays, fill them with random numbers, and then perform set intersection". You should only time the portion of the code you're actually interested in. Even if you're going to want to do those things, they should not be benchmarked here. Measure one thing at a time, to reduce uncertainty. If you want your C++ implementation to perform better, you first need to know which part of it is slower than expected. Which means you have to separate setup code from intersection test.

Second, you should run the test a large number of times to take possible caching effects and other uncertainties into account. (And probably output one total time for, say, 1000 runs, rather than an individual time for each. That way you reduce the uncertainty from the timer which might have limited resolution and report inaccurate results when used in the 0-20ms range.

Further, as far as I can read from the docs, the input to set_intersection should be sorted, which set2 won't be. An there seems to be no reason to use unordered_map, when unordered_set would be a far better match for what you're doing.

About the setup code being needed, note that you probably don't need to populate vectors in order to run the intersection. Both your own implementation and set_intersection work on iterators already, so you can simply pass them a pair of iterators to the data structures your inputs are in already.

A few more specific comments on your code:

Use ++iterator instead of iterator++
rather than calling vector.end() at each loop iteration, call it once and cache the result
experiment with using sorted vectors vs std::set vs unordered_set (not unordered_map)

Edit:

I haven't tried your C# version, so I can't compare the numbers properly, but here's my modified test. Each is run 1000 times, on a Core 2 Quad 2.5GHz with 4GB RAM:

std::set_intersection on std::set: 2606ms
std::set_intersection on tr1::unordered_set: 1014ms
std::set_intersection on sorted vectors: 171ms
std::set_intersection on unsorted vectors: 10140ms

The last one is a bit unfair, because it has to both copy and sort the vectors. Ideally, only the sort should be part of the benchmark. I tried creating a version that used an array of 1000 unsorted vectors (so I woudln't have to copy the unsorted data in each iteration), but the performance was about the same, or a bit worse, because this would cause constant cache misses, so I reverted back to this version

And my code:

#define _SECURE_SCL 0

#include <ctime>
#include <vector>
#include <set>
#include <iostream>
#include <algorithm>
#include <unordered_set>
#include <windows.h>

template <typename T, typename OutIter>
void stl_intersect(const T& set1, const T& set2, OutIter out){
    std::set_intersection(set1.begin(), set1.end(), set2.begin(), set2.end(), out);
}

template <typename T, typename OutIter>
void sort_stl_intersect(T& set1, T& set2, OutIter out){
    std::sort(set1.begin(), set1.end());
    std::sort(set2.begin(), set2.end());
    std::set_intersection(set1.begin(), set1.end(), set2.begin(), set2.end(), out);
}


template <typename T>
void init_sorted_vec(T first, T last){
    for ( T cur = first; cur != last; ++cur)
    {
        int i = cur - first;
        int value = 1000000000 + i;
        *cur = value;
    }
}

template <typename T>
void init_unsorted_vec(T first, T last){
    for ( T cur = first; cur != last; ++cur)
    {
        int i = rand() % 200000 + 1;
        i *= 10;

        int value = 1000000000 + i;
        *cur = value;
    }
}

struct resize_and_shuffle {
    resize_and_shuffle(int size) : size(size) {}

    void operator()(std::vector<int>& vec){
        vec.resize(size);

    }
    int size;
};

int main()
{
    srand ( time(NULL) );
    std::vector<int> out(100000);

    std::vector<int> sortedvec1(100000);
    std::vector<int> sortedvec2(1000);

    init_sorted_vec(sortedvec1.begin(), sortedvec1.end());
    init_unsorted_vec(sortedvec2.begin(), sortedvec2.end());
    std::sort(sortedvec2.begin(), sortedvec2.end());

    std::vector<int> unsortedvec1(sortedvec1.begin(), sortedvec1.end());
    std::vector<int> unsortedvec2(sortedvec2.begin(), sortedvec2.end());

    std::random_shuffle(unsortedvec1.begin(), unsortedvec1.end());
    std::random_shuffle(unsortedvec2.begin(), unsortedvec2.end());

    std::vector<int> vecs1[1000];
    std::vector<int> vecs2[1000];

    std::fill(vecs1, vecs1 + 1000, unsortedvec1);
    std::fill(vecs2, vecs2 + 1000, unsortedvec2);

    std::set<int> set1(sortedvec1.begin(), sortedvec1.end());
    std::set<int> set2(sortedvec2.begin(), sortedvec2.end());

    std::tr1::unordered_set<int> uset1(sortedvec1.begin(), sortedvec1.end());
    std::tr1::unordered_set<int> uset2(sortedvec2.begin(), sortedvec2.end());

    DWORD start, stop;
    DWORD delta[4];

    start = GetTickCount();
    for (int i = 0; i < 1000; ++i){
        stl_intersect(set1, set2, out.begin());
    }
    stop = GetTickCount();
    delta[0] = stop - start;

    start = GetTickCount();
    for (int i = 0; i < 1000; ++i){
        stl_intersect(uset1, uset2, out.begin());
    }
    stop = GetTickCount();
    delta[1] = stop - start;

    start = GetTickCount();
    for (int i = 0; i < 1000; ++i){
        stl_intersect(sortedvec1, sortedvec2, out.begin());
    }
    stop = GetTickCount();
    delta[2] = stop - start;

    start = GetTickCount();
    for (int i = 0; i < 1000; ++i){
        sort_stl_intersect(vecs1[i], vecs1[i], out.begin());
    }
    stop = GetTickCount();
    delta[3] = stop - start;

    std::cout << "std::set_intersection on std::set: " << delta[0] << "ms\n";
    std::cout << "std::set_intersection on tr1::unordered_set: " << delta[1] << "ms\n";
    std::cout << "std::set_intersection on sorted vectors: " << delta[2] << "ms\n";
    std::cout << "std::set_intersection on unsorted vectors: " << delta[3] << "ms\n";


    return 0;
}

There's no reason why C++ should always be faster than C#. C# has a few key advantages that require a lot of care to compete with in C++. The primary one I can think of is that dynamic allocations are ridiculously cheap in .NET-land. Every time a C++ vector, set or unordered_set (or any other container) has to resize or expand, it is a very costly malloc operation. In .NET, a heap allocation is little more than adding an offset to a pointer.

So if you want the C++ version to compete, you'll probably have to solve that, allowing your containers to resize without having to perform actual heap allocations, probably by using custom allocators for the containers (perhaps boost::pool might be a good bet, or you can try rolling your own)

Another issue is that set_difference only works on sorted input, and in order to reproduce tests results that involve a sort, we have to make a fresh copy of the unsorted data in each iteration, which is costly (although again, using custom allocators will help a lot). I don't know what form your input takes, but it is possible that you can sort your input directly, without copying it, and then run set_difference directly on that. (That would be easy to do if your input is an array or a STL container at least.)

One of the key advantages of the STL is that it is so flexible, it can work on pretty much any input sequence. In C#, you pretty much have to copy the input to a List or Dictionary or something, but in C++, you might be able to get away with running std::sort and set_intersection on the raw input.

Finally, of course, try running the code through a profiler and see exactly where the time is being spent. You might also want to try running the code through GCC instead. It's my impression that STL performance in MSVC is sometimes a bit quirky. It might be worth testing under another compiler just to see if you get similar timings there.

Finally, you might find these blog posts relevant for performance of C++ vs C#: http://blogs.msdn.com/ricom/archive/2005/05/10/416151.aspx

The morale of those is essentially that yes, you can get better performance in C++, but it is a surprising amount of work.

Kassie answered 29/6, 2009 at 21:23 Comment(25)

Agreed, the test is not strictly set intersection, it also includes populating any data structures needed for the test. I've updated the test to run 1000 times. – Lutherlutheran 29/6, 2009 at 22:11

Populating the data structures is not part of this test though. Measure that separately. You are introducing a huge amount of uncertainty which essentially invalidates your results. – Contour 29/6, 2009 at 22:13

I modified the code and re-ran the benchmarks with the population of the data structures done before the timed tests. – Lutherlutheran 29/6, 2009 at 22:24

Jalf: it looks to me like std::set is a sorted data structure. sgi.com/tech/stl/set.html – Lutherlutheran 29/6, 2009 at 22:28

oh right, I thought you were running that one on vectors too. (Why aren't you?) – Contour 29/6, 2009 at 22:28

because the first example I found of how to use set_intersection showed it working on sets. If you think switching to vectors will speed things up I will give it a try. – Lutherlutheran 29/6, 2009 at 22:30

I don't know if it'll speed it up, but set_intersection works on any input iterator, as long as it is a sorted sequence (so if you use vectors, you'll have to first sort it) – Contour 29/6, 2009 at 22:33

@Jalf: I modified the code to pass sorted vectors into set_intersection. If I sorted them before the test, then the intersection was done in 352ms (1000 times!) so very fast. But if I do the sort inside the test, then it takes 2401ms. – Lutherlutheran 29/6, 2009 at 22:48

2401 is still twice as good as your C# code though, isn't it? – Contour 29/6, 2009 at 22:51

yes, 2401ms twice as fast as C# and very good.. I like it. I'm just wondering if its a valid test given the data in set1 is already sorted (my real data is not). I'm going to look at updating the test to work on two random (not sorted) tests. – Lutherlutheran 29/6, 2009 at 22:53

hey Jalf, thats awesome, very fast.. However, would it be easy to adjust so that the vectors are re-sorted each time? My data isn't sorted, so I am going to have to sort it. In real life this won't be run 1000 times on pre-sorted vectors. – Lutherlutheran 29/6, 2009 at 23:12

Done. I test with both pre-sorted and unsorted vectors now. It looks like both set and unordered_set are a lot faster than in your own tests though. Can you reproduce the same results if you run my code? – Contour 29/6, 2009 at 23:20

Nice. One more question: set1 is already sorted... I'm trying to find a good way to randomize it while still ensuring it has values from 0 to 100,000 (excluding the offset). – Lutherlutheran 29/6, 2009 at 23:22

looks like std::random_shuffle can do that :) – Contour 29/6, 2009 at 23:23

I posted an updated benchmark and code below, I'm not seeing the crazy fast results you are Jalf. (it was fast, until I shuffled set1 and sorted it each time) – Lutherlutheran 29/6, 2009 at 23:34

You're now getting the same times I'm getting, about 10s to use set_intersection on unsorted lists. This is a 2x improvement on where we started, but still 2x slower than C#. – Lutherlutheran 29/6, 2009 at 23:47

Does that one matter though? If set or unsorted_set is still 4x faster than C# – Contour 29/6, 2009 at 23:59

Set: the sorting is done outside the benchmark... so doesn't seem fair. Unordered_set: does this work? You're passing what looks like an unsorted set to set_intersection... – Lutherlutheran 30/6, 2009 at 0:3

Set is an ordered data structure, remember? So whether the input it is built from is sorted or not doesn't matter. It sorts its input on insertion anyway. About unordered_set, I think you're right, that isn't valid, since it obviously isn't sorted. Hadn't really thought of that. :) – Contour 30/6, 2009 at 0:9

Set: yes, so essentially it 'sorts' as you insert.. Insertions are slower into sets than into vectors. So if I started with my unsorted data, it will take me longer to get it into a set than a vector. – Lutherlutheran 30/6, 2009 at 0:12

I adjusted the C# code now, to shuffle set1, it didn't make any diffence. The C# method doesn't care if the sets are sorted or not. – Lutherlutheran 30/6, 2009 at 0:21

yeah, I realized the same thing. I implemented the same algorithm again in C++, and so far, getting around 600ms on it, but I think I have a bug or two, so it'll be a few mins before I post it. – Contour 30/6, 2009 at 0:39

cool - one thing to do is to output the count of items in the intersection, as a sanity check (it should be ~500) – Lutherlutheran 30/6, 2009 at 0:48

yep, it was a bug, was testing intersect(set2, set2) :) Well, it's bedtime for me. Going to add a bit to my answer, and then head off. :) – Contour 30/6, 2009 at 0:52

Ok, posted my final update for tonight. No new code or benchmarks, but a few hints on what you should probably try next. I might give it a shot again tomorrow. :) – Contour 30/6, 2009 at 1:11

One problem I see right away is that you're passing the sets in C++ by value and not by const reference. So you're copying them every time you pass them around!

Also, I would not use a set for the target of set_intersection. I would use something like

int runSetIntersection(const set<int>& set1, const set<int>& set2)
{   
    vector<int> intersection;
    intersection.reserve(10000) // or whatever the max is

    set_intersection(set1.begin(),set1.end(), set2.begin(), set2.end(), back_inserter(intersection));

    return intersection.size(); 
}

This code, however, still allocates inside the function. Even faster would be

int runSetIntersection(const set<int>& set1, const set<int>& set2, vector<int>& scratch)
{   
    scratch.reserve(10000) // or whatever the max is

    set_intersection(set1.begin(),set1.end(), set2.begin(), set2.end(), back_inserter(scratch));

    return scratch.size(); 
}

And then allocate scratch before you start the timer.

Though, if you're just looking for the size, a hand-written for loop, combined with set::find might give even better results.

Garbo answered 29/6, 2009 at 22:49 Comment(1)

Good spotting! That ought to count for some of the slowness. – Ulmaceous 29/6, 2009 at 22:52

Use this...

vector<int> set1(10000);
vector<int> set2(1000);

... to get vectors of non-zero initial size. Then don't use push_back, but just update the values directly.

Anticatalyst answered 29/6, 2009 at 21:32 Comment(7)

Or call reserve(1000), but continue using push_back – Avowed 29/6, 2009 at 21:41

or just don't include this in your timing in the first place. Your setup code should not be part of the benchmark. – Contour 29/6, 2009 at 21:49

I want to include the setup time, because my data is not in one of these data structures at the start... so if the approach requires me to put it into a data structure then I need to include that tme. – Lutherlutheran 29/6, 2009 at 22:0

@jalf. Fair point. I was assuming he actually wanted to measure that but as well... – Anticatalyst 29/6, 2009 at 22:1

I agree that there is a vast difference between initializing something to 10000 bytes for the C# and doing multiple re-allocs on the C++ side. – Vendace 29/6, 2009 at 22:2

nevertheless, it is not part of this test. You'll get much more useful benchmarking data by treating them as separate problems. First worry about the intersection performance, and then do a separate test for the "build initial data structures" part. – Contour 29/6, 2009 at 22:2

I modified the C++ implementation that uses ordered_set and vector to call reserve() on the vectors. Not much change. – Lutherlutheran 29/6, 2009 at 22:18

I would change the C++ "runIntersectionTest" to take const references to the containers rather than having them copy-constructed on each call. (The C# code will be using refs.)

Alcantara answered 29/6, 2009 at 22:55 Comment(0)

It may also be worthwhile looking at the boost Disjoint Set container, which is specially optimized for certain kinds of large set operations.

It works by treating a group of sets as the unions of several disjoint sets, making it possible to build other sets, such as intersections or unions very cheaply, once the initial set of disjoint sets is constructed. If you expect to be doing a lot of set operations on sets that don't change much, you can probably expect this to be very fast. If, on the other hand, you will use each set once and throw it away, it's probably not going to do too much.

Anyway, you'd be doing yourself a favor to at least experiment with this to see if it gives you any bump in your specific case.

Hillery answered 29/6, 2009 at 23:11 Comment(1)

This structure has just given me a chill, for it contains the first practical use of the Ackermann function, or rather its inverse. Amazing! – Hillery 29/6, 2009 at 23:14

By the way, if you have large sorted sets std::set_intersection is not the fastest algorithm. std::set_intersection takes up to 2*(m+n)-1 comparisons but algorithms like the one from Baeza-Yates can be faster. For small m, Baeza-Yates is O(m * log(n)), while for n = alpha * m it is O(n). The basic idea is to do a kind of 2 way binary search.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.7899&rep=rep1&type=pdf

Experimental Analysis of a Fast Intersection Algorithm for Sorted Sequences Ricardo Baeza-Yates and Alejandro Salinger

R. Baeza-Yates. A Fast Set Intersection Algorithm for Sorted Sequences. In Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching (CPM 2004), Springer LNCS 3109, pp 400-408, Istanbul, Turkey, July 2004.

Below is an explanation and an implementation by Erik Frey where he shows significantly faster results than std::set_intersection with a binary probe. I have not tried his code yet.
http://fawx.com/

Pick the median element, A, in the smaller set.
Search for its insertion-position element, B, in the larger set.
If A and B are equal, append the element to the result.
Repeat steps 1-4 on non-empty subsets on either side of elements A and B.

;



/*
* baeza_intersect
*/
template< template  class Probe,
  class RandomAccessIterator, class OutputIterator>
void baeza_intersect(RandomAccessIterator begin1, RandomAccessIterator end1,
                     RandomAccessIterator begin2, RandomAccessIterator end2,
                     OutputIterator out)
{
  RandomAccessIterator probe1, probe2;

if ( (end1 - begin1) < ( end2 - begin2 ) )
  {
    if ( begin1 == end1 )
      return;
    probe1 = begin1 + ( ( end1 - begin1 ) >> 1 );
    probe2 = lower_bound< Probe >( begin2, end2, *probe1 );
    baeza_intersect< Probe >(begin1, probe1, begin2, probe2, out); // intersect left
    if (! (probe2 == end2 || *probe1 < *probe2 ))
      *out++ = *probe2++;
    baeza_intersect< Probe >(++probe1, end1, probe2, end2, out); // intersect right
  }
  else
  {
    if ( begin2 == end2 )
      return;
    probe2 = begin2 + ( ( end2 - begin2 ) >> 1 );
    probe1 = lower_bound< Probe >( begin1, end1, *probe2 );
    baeza_intersect< Probe >(begin1, probe1, begin2, probe2, out); // intersect left
    if (! (probe1 == end1 || *probe2 < *probe1 ))
      *out++ = *probe1++;
    baeza_intersect< Probe >(probe1, end1, ++probe2, end2, out); // intersect right
  }
}

/*
* with a comparator
*/
 template< template  class Probe,
      class RandomAccessIterator, class OutputIterator, class Comparator >
    void baeza_intersect(RandomAccessIterator begin1, RandomAccessIterator end1,
                         RandomAccessIterator begin2, RandomAccessIterator end2,
                         OutputIterator out, Comparator cmp)
    {
      RandomAccessIterator probe1, probe2;

  if ( (end1 - begin1) < ( end2 - begin2 ) )
  {
    if ( begin1 == end1 )
      return;
    probe1 = begin1 + ( ( end1 - begin1 ) >> 1 );
    probe2 = lower_bound< Probe >( begin2, end2, *probe1, cmp );
    baeza_intersect< Probe >(begin1, probe1, begin2, probe2, out, cmp); // intersect left
    if (! (probe2 == end2 || cmp( *probe1, *probe2 ) ))
      *out++ = *probe2++;
    baeza_intersect< Probe >(++probe1, end1, probe2, end2, out, cmp); // intersect right
  }
  else
  {
    if ( begin2 == end2 )
      return;
    probe2 = begin2 + ( ( end2 - begin2 ) >> 1 );
    probe1 = lower_bound< Probe >( begin1, end1, *probe2, cmp );
    baeza_intersect< Probe >(begin1, probe1, begin2, probe2, out, cmp); // intersect left
    if (! (probe1 == end1 || cmp( *probe2, *probe1 ) ))
      *out++ = *probe1++;
    baeza_intersect< Probe >(probe1, end1, ++probe2, end2, out, cmp); // intersect right
  }
}


// probe.hpp

/**
* binary probe: pick the next element by choosing the halfway point between low and high
*/
template< class RandomAccessIterator, class T >
struct binary_probe
{
  RandomAccessIterator operator()(RandomAccessIterator begin, RandomAccessIterator end, const T & value)
  {
    return begin + ( (end - begin) >> 1);
  }
};

/**
* lower_bound: like stl's lower_bound but with different kinds of probing
* note the appearance of the rare template parameter template!
*/
template< template  class Probe, class RandomAccessIterator, class T >
RandomAccessIterator lower_bound(RandomAccessIterator begin, RandomAccessIterator end, const T & value)
{
  RandomAccessIterator pit;
  Probe< RandomAccessIterator, T > pfunc; // probe-functor (wants to get func'd up)

while ( begin < end )
  {
    pit = pfunc(begin, end, value);
    if ( *pit < value )
      begin = pit + 1;
    else
      end = pit;
  }
  return begin;
}

/*
* this time with a comparator!
*/
template< template  class Probe, class RandomAccessIterator, class T, class Comparator >
RandomAccessIterator lower_bound(RandomAccessIterator begin, RandomAccessIterator end, const T & value, Comparator cmp)
{
  RandomAccessIterator pit;
  Probe< RandomAccessIterator, T > pfunc;

while ( begin < end )
  {
    pit = pfunc(begin, end, value);
    if ( cmp(*pit, value) )
      begin = pit + 1;
    else
      end = pit;
  }
  return begin;
}

Chloropicrin answered 26/8, 2010 at 3:38 Comment(0)

Since you're using Visual Studio you should check whether you have _SECURE_SCL set to 1 (typically if you haven't explicitly set it it will be 1). If it's set all STL-code will be range-checked, even in release-builds. Typically slowing down code by a 10-15%.

It seems Microsoft wasn't aware that for instance std::vector already has an interface if you want the range-checking: std::vector::at()!

(Sorry, had to get it off my chest).

Anyway the main inefficiency is that you're copying the containers instead of passing them by value. Use references to (try to) compare apples and apples instead of apples and bananas.

Ulmaceous answered 29/6, 2009 at 22:57 Comment(1)

I'll adjust and repost without copying... I have set _SECURE_SCL to 0. (#define _SECURE_SCL 0) – Lutherlutheran 29/6, 2009 at 23:0

I know your solution is working fine, but have you tried using the STL implementations:

It might be optimized for your plataform already, so I'd give it a shot

China answered 29/6, 2009 at 21:48 Comment(1)

My C++ code includes an implementation using set_intersection. I will take a look at "includes". – Lutherlutheran 29/6, 2009 at 22:0

Are C++ optimization flags turned on?

Bb answered 29/6, 2009 at 22:31 Comment(4)

which ones? Optimization is set to "Maximize Speed (/O2)" – Lutherlutheran 29/6, 2009 at 22:32

I set "Favor Size or Speed" to "Favor Fast Code (/Ot)", no difference. – Lutherlutheran 29/6, 2009 at 22:34

Did you figure out what was taking most of the time? – Bb 29/6, 2009 at 22:50

no I haven't figured out what takes most of the time. In the set_intersection test there isn't much going on except calling set_intersection :) – Lutherlutheran 29/6, 2009 at 22:51

Ok, after much feedback I've updated the original question a number of times:

The tests are now each run 1,000 times
The C# code now uses a higher resolution timer
The data structures are now populated BEFORE the tests

The result of this so far is that C# is still ~5x faster than C++.

Thanks everyone for your ideas/suggestions.

Lutherlutheran answered 29/6, 2009 at 22:32 Comment(0)

Update:

I modified the set_intersection code to use vectors, and to sort them (instead of using the sorted set class), and its MUCH faster now:

Found the intersection of 319 values (using unordered_map) 1000 times, in 22187.5ms
Found the intersection of 315 values (using set_intersection) 1000 times, in 2401.62ms

Keep in mind: the larger set is created sorted, so sorting it might not take much time in this example.

C++ Code:

// MapPerformance.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <hash_map>
#include <vector>
#include <iostream>
#include <time.h>
#include <algorithm>
#include <set>

#include <boost\unordered\unordered_map.hpp>

#include "timer.h"

using namespace std;
using namespace stdext;
using namespace boost;

int runIntersectionTest(vector<int> set1, vector<int> set2)
{
    // hash_map<int,int> theMap;
    // map<int,int> theMap;
    unordered_map<int,int> theMap;

    // Now intersect the two sets by populating the map
    for ( vector<int>::iterator iterator = set1.begin(); iterator != set1.end(); iterator++ )
    {
        int value = *iterator;

        theMap[value] = 1;
    }

    int intersectionSize = 0;

    for ( vector<int>::iterator iterator = set2.begin(); iterator != set2.end(); iterator++ )
    {
        int value = *iterator;

        unordered_map<int,int>::iterator foundValue = theMap.find(value);

        if ( foundValue != theMap.end() )
        {
            theMap[value] = 2;

            intersectionSize++;
        }
    }

    return intersectionSize;

}

int runSetIntersection(vector<int> set1, vector<int> set2)
{   
    sort(set1.begin(),set1.end());
    sort(set2.begin(),set2.end());

    set<int> intersection;

    set_intersection(set1.begin(),set1.end(), set2.begin(), set2.end(), inserter(intersection, intersection.end()));

    return intersection.size(); 
}



int _tmain(int argc, _TCHAR* argv[])
{
    srand ( time(NULL) );

    vector<int> set1;
    vector<int> set2;

    set1.reserve(10000);
    set2.reserve(1000);

    // Create 100,000 values for set1
    for ( int i = 0; i < 100000; i++ )
    {
        int value = 1000000000 + i;
        set1.push_back(value);
    }

    // Create 1,000 values for set2
    for ( int i = 0; i < 1000; i++ )
    {
        int random = rand() % 200000 + 1;
        random *= 10;

        int value = 1000000000 + random;
        set2.push_back(value);
    }

    int intersectionSize = 0;


    Timer timer;
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runIntersectionTest(set1, set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using unordered_map) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    timer.Reset();
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runSetIntersection(set1,set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using set_intersection) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    getchar();

    return 0;
}

Lutherlutheran answered 29/6, 2009 at 22:50 Comment(3)

Don't pass vectors by value!! const vector<int> &set1 is the C++ way. – Ulmaceous 29/6, 2009 at 22:53

good call, sloppy mistake on my part trying to rapidly change the code. – Lutherlutheran 29/6, 2009 at 22:57

You could write the intersection results to a vector instead of a set (should be a lot faster). Also I'd expect unordered_set to be a lot faster than plain sets – Contour 29/6, 2009 at 23:42

You are STILL passing the vectors by value. Which would be ok if you weren't copying them as well.

inserter was not puting the values at the end of the vector where is it quick. It only did that on the first insert after that it inserted the value at the beginning of the array (where end used to point).

you where looking up the value twice in the hash map version, when you updated the value. Why is this value event being updated?

run this code and post your timings.

// MapPerformance.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <hash_map>
#include <vector>
#include <iostream>
#include <time.h>
#include <algorithm>
#include <set>

#include <boost\unordered\unordered_set.hpp>

#include "timer.h"

using namespace std;
using namespace stdext;
using namespace boost;

int runIntersectionTest(const vector<int>& set1, const vector<int>& set2)
{
    // hash_map<int,int> theMap;
    // map<int,int> theMap;
    unordered_set<int> theSet;      

     theSet.insert( set1.begin(), set2.end() );

    int intersectionSize = 0;

    vector<int>::const_iterator set2_end = set2.end();

    for ( vector<int>::const_iterator iterator = set2.begin(); iterator != set2_end; ++iterator )
    {
        if ( theSet.find(*iterator) != theSet.end() )
        {
                intersectionSize++;
        }
    }

    return intersectionSize;
}

int runSetIntersection( vector<int> set1, vector<int> set2)
{   
    // Sort the data
    sort(set1.begin(),set1.end());
    sort(set2.begin(),set2.end());

    vector<int> intersection;
    intersection.reserve(1000);

    set_intersection(set1.begin(),set1.end(), set2.begin(), set2.end(), back_inserter(intersection));

    return intersection.size(); 
}

void createSets( vector<int>& set1, vector<int>& set2 )
{
    srand ( time(NULL) );

    set1.reserve(100000);
    set2.reserve(1000);

    // Create 100,000 values for set1
    for ( int i = 0; i < 100000; i++ )
    {
        int value = 1000000000 + i;
        set1.push_back(value);
    }

    // Try to get half of our values intersecting
    float ratio = 200000.0f / RAND_MAX;


    // Create 1,000 values for set2
    for ( int i = 0; i < 1000; i++ )
    {
        int random = rand() * ratio + 1;

        int value = 1000000000 + random;
        set2.push_back(value);
    }

    // Make sure set1 is in random order (not sorted)
    random_shuffle(set1.begin(),set1.end());
}

int _tmain(int argc, _TCHAR* argv[])
{
    int intersectionSize = 0;

    vector<int> set1, set2;     
    createSets( set1, set2 );

    Timer timer;
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runIntersectionTest(set1, set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using unordered_map) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    timer.Reset();
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runSetIntersection(set1,set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using set_intersection) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    getchar();

    return 0;
}

Ovation answered 30/6, 2009 at 0:35 Comment(1)

I made those changes, no noticeable difference. Your new runIntersectionTest is similar in performance to the unordered_map one (about 2x slower than set_intersection) – Lutherlutheran 30/6, 2009 at 1:43

Latest benchmark:

Found the intersection of 504 values (using unordered_map) 1000 times, in 28827.6ms
Found the intersection of 495 values (using set_intersection) 1000 times, in 9817.69ms
Found the intersection of 504 values (using unordered_set) 1000 times, in 24769.1ms

I think the 504 - 495 difference happens because there are a couple dupe values.

Code:

// MapPerformance.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <hash_map>
#include <vector>
#include <iostream>
#include <time.h>
#include <algorithm>
#include <set>
#include <unordered_set>

#include <boost\unordered\unordered_map.hpp>

#include "timer.h"

using namespace std;
using namespace stdext;
using namespace boost;
using namespace tr1;


int runIntersectionTest2(const vector<int>& set1, const vector<int>& set2)
{
    // hash_map<int,int> theMap;
    // map<int,int> theMap;
    unordered_set<int> theSet;      

     theSet.insert( set1.begin(), set1.end() );

    int intersectionSize = 0;

    vector<int>::const_iterator set2_end = set2.end();

    for ( vector<int>::const_iterator iterator = set2.begin(); iterator != set2_end; ++iterator )
    {
        if ( theSet.find(*iterator) != theSet.end() )
        {
                intersectionSize++;
        }
    }

    return intersectionSize;
}

int runIntersectionTest(const vector<int>& set1, const vector<int>& set2)
{
    // hash_map<int,int> theMap;
    // map<int,int> theMap;
    unordered_map<int,int> theMap;  

    vector<int>::const_iterator set1_end = set1.end();

    // Now intersect the two sets by populating the map
    for ( vector<int>::const_iterator iterator = set1.begin(); iterator != set1_end; ++iterator )
    {
        int value = *iterator;

        theMap[value] = 1;
    }

    int intersectionSize = 0;

    vector<int>::const_iterator set2_end = set2.end();

    for ( vector<int>::const_iterator iterator = set2.begin(); iterator != set2_end; ++iterator )
    {
        int value = *iterator;

        unordered_map<int,int>::iterator foundValue = theMap.find(value);

        if ( foundValue != theMap.end() )
        {
            theMap[value] = 2;

            intersectionSize++;
        }
    }

    return intersectionSize;

}

int runSetIntersection(const vector<int>& set1_unsorted, const vector<int>& set2_unsorted)
{   
    // Create two vectors
    std::vector<int> set1(set1_unsorted.size());
    std::vector<int> set2(set2_unsorted.size());

    // Copy the unsorted data into them
    std::copy(set1_unsorted.begin(), set1_unsorted.end(), set1.begin());
    std::copy(set2_unsorted.begin(), set2_unsorted.end(), set2.begin());

    // Sort the data
    sort(set1.begin(),set1.end());
    sort(set2.begin(),set2.end());

    vector<int> intersection;
    intersection.reserve(1000);

    set_intersection(set1.begin(),set1.end(), set2.begin(), set2.end(), back_inserter(intersection));

    return intersection.size(); 
}

void createSets( vector<int>& set1, vector<int>& set2 )
{
    srand ( time(NULL) );

    set1.reserve(100000);
    set2.reserve(1000);

    // Create 100,000 values for set1
    for ( int i = 0; i < 100000; i++ )
    {
        int value = 1000000000 + i;
        set1.push_back(value);
    }

    // Try to get half of our values intersecting
    float ratio = 200000.0f / RAND_MAX;


    // Create 1,000 values for set2
    for ( int i = 0; i < 1000; i++ )
    {
        int random = rand() * ratio + 1;

        int value = 1000000000 + random;
        set2.push_back(value);
    }

    // Make sure set1 is in random order (not sorted)
    random_shuffle(set1.begin(),set1.end());
}

int _tmain(int argc, _TCHAR* argv[])
{
    int intersectionSize = 0;

    vector<int> set1, set2; 
    createSets( set1, set2 );

    Timer timer;
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runIntersectionTest(set1, set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using unordered_map) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    timer.Reset();
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runSetIntersection(set1,set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using set_intersection) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    timer.Reset();
    for ( int i = 0; i < 1000; i++ )
    {
        intersectionSize = runIntersectionTest2(set1,set2);
    }
    timer.Stop();

    cout << "Found the intersection of " << intersectionSize << " values (using unordered_set) 1000 times, in " << timer.GetMilliseconds() << "ms" << endl;

    getchar();

    return 0;
}

Lutherlutheran answered 30/6, 2009 at 0:55 Comment(2)

The code I posted deliberately didn't pass the vectors by reference, so the compiler would implicitly copy them. If you'd like to do it your way, replace to code before the sort with this: vector<int> set1( unsorted_set1 ); vector<int> set2( unsorted_set2 ); this way the vectors are efficiently copied. The code as it stands fills each vector with zeros. Then it copies the desired value over the top. The above method allocates enough space then copies the values from the parameter with no extra assignments. This will speed things up, though I don't know by how much – Ovation 30/6, 2009 at 2:0

I'm not sure I 100% get you. I made a change though, I made runSetIntersection take vector<int> not vector<int>& (so it gets a copy), and then removed the code inside that function that copies the vector. This didn't make a noticeable difference. I do agree in general the benchmark would be more fair if the vector was not copied each time. – Lutherlutheran 30/6, 2009 at 2:14

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags