std::vector faster than std::unordered_set?

Asked 8/4, 2013 at 13:13 Answered 9/7, 2018 at 20:39

Solved c++performance vector stl unordered-set

In my custom physics engine, the biggest bottleneck is a method that gets all bodies from the spatial partitioning (a 2D grid), and returns a collection containing only unique pointers to body.

template<typename T, typename V> bool contains(const T& mContainer, const V& mValue)
{
    return std::find(std::begin(mContainer), 
                     std::end(mContainer), mValue) != std::end(mContainer);
}

const vector<Body*>& GridInfo::getBodiesToCheck()
{
    bodiesToCheck.clear();
    for(auto& query : queries)
        for(auto& body : *query)
            if(!contains(bodiesToCheck, body)) bodiesToCheck.push_back(body);
    return bodiesToCheck;
}

Using a profiler shows that the bottleneck is in the "contains" method.

Obviously, an std::unordered_set would be the "ideal" solution here. However, it is a lot slower than the current solution. I've also tried google::dense_hash_set, which is faster than std::unordered_set, but still slower than the current solution.

const unordered_set<Body*>& GridInfo::getBodiesToCheck()
{
    bodiesToCheck.clear();
    for(auto& query : queries)
        for(auto& body : *query)
            /*if(!contains(bodiesToCheck, body))*/ bodiesToCheck.insert(body);
    return bodiesToCheck;
}

Why are the "correct" containers slower than a std::vector?

Is there any way I can speed up this method further?

Rickierickman answered 8/4, 2013 at 13:13 Comment(7)

The profiling results are just for contains? Remember searching set might be faster but inserting is a slower than vector. – Ima 8/4, 2013 at 13:16

I assume you didn't make such a mistake, but just to be really really sure, you didn't use std::find when you tried the std::unordered_map, did you? – Wildermuth 8/4, 2013 at 13:18

@stardust_ Profiler shows "getBodiesToCheck()" method as the bottleneck. If I use the std::vector version, the bottleneck inside getBodiesToCheck() (bottleneck of the bottleneck :P) is the call to "contains" – Rickierickman 8/4, 2013 at 13:21

@ChristianRau Nope, I removed the "contains" part. – Rickierickman 8/4, 2013 at 13:21

@Vee yes that is what i am saying on the vector version the find might be the bottleneck. however in the set version insertion might be the bottleneck. – Ima 8/4, 2013 at 13:25

@stardust_ I understand. Do you think there is any way to speed up this method? Maybe using a custom stack allocator could improve the performance? – Rickierickman 8/4, 2013 at 13:33

@Vee While my question (from the above comment) might have been a bit stupid at first, it seems many answerers are confused about some possible iteration over std::unordered_set. So it might help to clarifiy things if you just included your std::unordered_set solution alongside the existing std::vector code. – Wildermuth 8/4, 2013 at 13:47

There are two possibilities I can think of:

You have a small enough number of data elements that a linear search is faster than a hash-plus compare lookup.
You're using the same contains function to find an element in the unordered_set, instead of using the member function find.

Torto answered 8/4, 2013 at 13:16 Comment(1)

As I only care about returning a collection of unique Body*, I didn't use "contains" or "find" on unordered_set. I just used insert expecting it to only be filled with unique elements. – Rickierickman 8/4, 2013 at 13:22

If the number of duplicate bodies isn't that high compared to the others, one option might be to just push all your bodies into the vector and remove the duplicates afterwards. But this will require a std::sort followed by an erase(std::unique, end).

But it may be worth a try, considering that your vector seems to outplay a std::unordered_set anyway, which doesn't have the same memory locality and trivial access like a std::vector.

Wildermuth answered 8/4, 2013 at 13:31 Comment(1)

I tried it, but the performance is slower than what I currently have. – Rickierickman 8/4, 2013 at 13:32

I am not sure that I understand the problem correctly, but it seems that the lookup will be slower on std::vector/std::find, but iteration might be faster than with std::unordered_set. If that is the case, and you are not limited by memory constraints, you can mix both approaches:

Maintain both a std::unordered_set and a std::vector with the elements. Lookup inside the std::unordered_set to determine whether the element is already there, if it is not, add it to both containers. At the end iterate over the std::vector.

Note that you can provide hints to both containers regarding the 'expected' number of elements that they will contain, and that will reduce the number of memory allocations/rehashing.

Tomkin answered 8/4, 2013 at 13:40 Comment(2)

I guess that std::unique_set should be a std::unordered_set? Other than that, I don't think there is any need for him to iterate over the std::unordered_set, at least not in the code snippet in question (and the one he profiled and wants to speed up). It's just std::vector+std::find vs std::unordered_set::insert, so in your case he would have the same overhead like his existing std::unordered_set solution and the overhead of a vector insert. – Wildermuth 8/4, 2013 at 13:45

@ChristianRau: Yes, unordered (need to get caffeine infused immediately!) – Demott 8/4, 2013 at 13:47

I run into a similar problem where a linear search is faster than a hash-plus compare lookup.(A support to Mark's first answer).

I try to use BFS to improve the CPU voxelization of a mesh. std::unordered_set is used to mark visited voxels. However, unordered_set is 100% slower than linearly iterating the space. With a profile comparison, I find that a linear search is faster if the ratio of active voxels over all visited voxels is higher than 3%. Otherwise, BFS with unordered_set is better.

Depose answered 9/7, 2018 at 20:39 Comment(0)

-2

Here is what you find in the std documentation:

"unordered_set containers are faster than set containers to access individual elements by their key, although they are generally less efficient for range iteration through a subset of their elements."

Well since the find method will eventually loop through a considerable number of elements thats probably the reason...

Maybe if you've used a costum hash function you should improve it to make it faster... Only thing I can think of...

Picador answered 8/4, 2013 at 13:22 Comment(3)

But then again when using unordered_map there is absolutely no need for std::find anyway (and the OP confirmed to not have done such stupid mistake). – Wildermuth 8/4, 2013 at 13:27

"If you really need better performance, the only data container I can think of would be some kind of hash table" - Uh...you mean...like a std::unordered_set? – Wildermuth 8/4, 2013 at 13:34

Yeah, you're right... Unordered set is indeed an Hash table... My bad. – Picador 8/4, 2013 at 13:44

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags