TBB Concurrent Hash map - McMap

About

TBB Concurrent Hash map

Asked 16/9, 2011 at 19:20 Answered 16/9, 2011 at 21:53

Solved c++hashmap tbb concurrenthashmap

W

1

9

I am implementing tbb's concurrent hash map to compare the performance of it against a suite of other concurrent hash tables.

However the performance I get out of it is horrendous, I just can't believe it is that slow compared to other concurrent hash tables

Here is my implementation of it:

class TBB: public TestDs{
    typedef tbb::concurrent_hash_map<int,int, HashCompare<int> > hash_t;
private:
        hash_t _ds;
public:
        TBB(const Configuration& config) : _ds(config.initial_count) {
        }

    bool containsKey(int key) {
        hash_t::accessor a;

        if(_ds.find(a,key)){
            return true;
        }
        else 
            return false;
    }

    int get(int key) {
        hash_t::accessor a;

        if(_ds.find(a,key)){
             return (int)(a->second);
        }
        else 
            return 0;
    }

    int put(int key, int value) {
        return _ds.insert( std::make_pair(key, value) );
    }

    int remove(int key) {
        return _ds.erase(key);
    }

    int size() {
        return _ds.size();
    }
    const char* name() {
        return "TBB";
    }
    void print() {}
    void shutdown() {}

};

Does anyone see any issue with my implementation or know of any reason why it may perform slow? It takes over 30 mins for it to insert 200,000 elements in a single thread environment. To put that in perspective, nearly all the other tables perform this test in less than 5 mins.

Here is my build code:

-w  -DNDEBUG -g -msse2 -m32  -DINTEL -D_REENTRANT -lrt -pthread -fno-strict-aliasing -l cds -l tbb -lllalloc

UPDATE: I have adjusted my testing code to prefill the hash tables to 1000, instead of 100,000. When running it again, tbb performs 92 op/sec while, another implementation performs 89431 op/sec. (64 thread environment)...Just saying something doesn't seem right....

Additional Info: Computer is a HP Z600 Workstation with 6gb of ram and 6 cores.

Notice Cross positing at : http://software.intel.com/en-us/forums/showthread.php?t=86119

Window answered 16/9, 2011 at 19:20 Comment(10)

You're compiling without optimizations enabled? – Mcpeak 16/9, 2011 at 20:39

And what is the "other implementation"? – Mcpeak 16/9, 2011 at 20:41

Yes with out optimizations, but that is for a different reason. The other hashtables are libcds's split list and michael hash map, as well as my own hashtable – Window 16/9, 2011 at 20:59

It doesn't matter what the reason is, comparing performance with optimizations disabled makes no sense. You might as well compare the speed of two cars when you've taken the wheels off them. – Mcpeak 16/9, 2011 at 21:4

@gman, github.com/MrStevenFeldman/Concurrent-Non-Blocking-HashTable/… – Window 16/9, 2011 at 21:5

Yes but when optimizing concurrent data structures The reording of operations can be very dangerous to your correctness – Window 16/9, 2011 at 21:7

@Steven: That has nothing to do with debugging. If turning optimizations on breaks your code, it's your code that was broken in the first place. – Knifeedged 16/9, 2011 at 21:11

I added the -O3 Flag, it had very little improvement in the TBB test run. Is there any other recommended changes to be made? or configuration of tbb itself? – Window 16/9, 2011 at 21:22

@Steven: I don't have time to look too deep at it, hopefully someone else can, but look at it like this: the TBB data structures are not just thrown-together hobbyist code, they were programmed by some of the best, and tested thereon out. If you're having performance problems, either the structures you're comparing it to are broken (for example, don't properly handle thread-safety), or you're test code doesn't reflect how the data structure is suppose to be used in practice. I really highly suspect it's not TBB's fault. Try looking to see if you can simplify the tests and identify problems. – Knifeedged 16/9, 2011 at 21:57

I never thought it was tbb's fault, its that I had spent a week trying to isolate why it was so slow, and I overlooked the hash function hashing to a constant – Window 16/9, 2011 at 22:28

C

12

You HashCompare::hash() returns sizeof(int), which, I guess, means every entry maps into the same bucket. It seems like you are not using it as a hash table, more of a linked list.

You could try using Boost's hash:

#include <boost/functional/hash.hpp>

template<typename K> 
struct HashCompare { 
    static size_t hash( const K& key )                  { return boost::hash_value(key); } 
    static bool   equal( const K& key1, const K& key2 ) { return ( key1 == key2 ); } 
};

Clockwise answered 16/9, 2011 at 21:53 Comment(9)

Where do you see the implementation of HashCompare? – Mcpeak 16/9, 2011 at 21:58

In github.com/MrStevenFeldman/Concurrent-Non-Blocking-HashTable/… – Clockwise 16/9, 2011 at 21:59

OMG THANKS! THat makes so much sense! – Window 16/9, 2011 at 22:7

You are welcome. I hope it works. I'd be curious to know how it actually compares to other implementations. – Clockwise 16/9, 2011 at 22:8

The hash compare function was a copy and paste from a testing program I had found else where, It never occurred to me check the hashing! – Window 16/9, 2011 at 22:14

If you want to try other hash functions, Google recently released CityHash ( code.google.com/p/cityhash ). – Heptavalent 16/9, 2011 at 22:18

@Steven: I'm sure, we all will appreciate updated performance results. At least if now TBB shows better results than other implementation – Bloodroot 18/9, 2011 at 19:18

Yes, TBB now performs better in cases where table modifications (insert/update/remove) are less than 70% of total operations. However anything higher, my table performs much better. (At 90%, we perform twice the ops per second), Currently I am rewriting the the algorithm to remove a few state variables, and other improvements. – Window 24/9, 2011 at 14:58

As a side note for those who are looking over the code, the probing version of the algorithm will no longer be further developed, because of a significant lose in performance due to extra overhead, and no significant memory was saved by probing. (Though it was a much more interesting and challenging wait-free algorithm) – Window 24/9, 2011 at 15:0

Recommended topics

#Godot #Unity #Godot 4.X #Mongodb

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

© 2022 - 2024 — McMap. All rights reserved.