tbb:concurrent_hash_map<K,V>: sample code for Intel Threading Building Blocks (TBB)
Asked Answered
C

1

4

Looking for sample code to use tbb::concurrent_hash_map<K,V> from Intel Threading Building Blocks (TBB).

I can insert, but I cannot seem to read the values back.

The official Intel documentation appears to be somewhat lacking on the sample code side.

Update

The best docs are in "Pro TBB: C++ Parallel Programming with Threading Building Blocks" by Voss. Download this book for free (it's public domain).

Ignore the Intel docs. They are essentially a collection of function signatures.

Commence answered 8/3, 2020 at 9:30 Comment(1)
My opinion: if Intel's official docs were not so opaque, they would have a higher adoption rate for their API, but I'm guessing Intel still wants to sell their TBB consulting services to enterprise customers. The free book is pretty much perfect as far as docs go.Commence
C
7

Intel TBB is open source, and on GitHub:

https://github.com/intel/tbb

To install TBB, I used vcpkg which is compatible with Linux, Windows and Mac. Yes, vcpkg is from Microsoft, but it is 100% cross-platform, open source, and very popular.

Linux:

./vcpkg search tbb              # Find the package.
./vcpkg install tbb:x64-linux   # Install the package.

Windows:

vcpkg search tbb                # Find the package.
vcpkg install tbb:x64-windows   # Install the package.

Compile:

  • Compatible with any modern compiler including MSVC, GCC, LLVM, Intel Compiler (ICC), etc. I used CMake for gcc.

Can also download the source and extract the headers and libraries into the source tree, this works just as well.

Code.

#include "tbb/concurrent_hash_map.h" // For concurrent hash map.

tbb::concurrent_hash_map<int, string> dict;
typedef tbb::concurrent_hash_map<int, string>::accessor dictAccessor; // See notes on accessor below.   

print("  - Insert key, method 1:\n");   
dict.insert({1,"k1"});
print("    - 1: k1\n");

print("  - Insert key, method 2:\n");
dict.emplace(2,"k2");
print("    - 2: k2\n");

string result;

{
    print("  - Read an existing key:\n");   
    dictAccessor accessor;
    const auto isFound = dict.find(accessor, 2);
    // The accessor functions as:
    // (a) a fine-grained per-key lock (released when it goes out of scope).
    // (b) a method to read the value.
    // (c) a method to insert or update the value.
    if (isFound == true) {
        print("    - {}: {}\n", accessor->first, accessor->second);
    }
}

{
    print("  - Atomically insert or update a key:\n");  
    dictAccessor accessor;
    const auto itemIsNew = dict.insert(accessor, 4);
    // The accessor functions as:
    // (a) a fine-grained per-key lock (released when it goes out of scope).
    // (b) a method to read the value.
    // (c) a method to insert or update the value.
    if (itemIsNew == true) {
        print("    - Insert.\n");
        accessor->second = "k4";
    }
    else {
        print("    - Update.\n");
        accessor->second = accessor->second + "+update";
    }
    print("    - {}: {}\n", accessor->first, accessor->second);     
}

{
    print("  - Atomically insert or update a key:\n");          
    dictAccessor accessor;
    const auto itemIsNew = dict.insert(accessor, 4);
    // The accessor functions as:
    // (a) a fine-grained per-key lock which is released when it goes out of scope.
    // (b) a method to read the value.
    // (c) a method to insert or update the value.
    if (itemIsNew == true) {
        print("    - Insert.\n");
        accessor->second = "k4";
    }
    else {
        print("    - Update.\n");
        accessor->second = accessor->second + "+update";
    }
    print("    - {}: {}\n", accessor->first, accessor->second);     
}

{
    print("  - Read the final state of the key:\n");            
    dictAccessor accessor;
    const auto isFound = dict.find(accessor, 4);
    print("    - {}: {}\n", accessor->first, accessor->second);
}

Printing uses {fmtlib} for printing; can replace with cout <<.

Output:

- Insert key, method 1:
  - 1: k1
- Insert key, method 2:
  - 2: k2
- Read an existing key:
  - 2: k2
- Atomically insert or update a key:
  - Insert.
  - 4: k4
- Atomically insert or update a key:
  - Update.
  - 4: k4+update
- Read the final state of the key:
  - 4: k4+update

Other hash maps

  • See: https://tessil.github.io/2016/08/29/benchmark-hopscotch-map.html
  • See: std::unordered_map. This has a more standard API, and is thread safe in many situations, see: unordered_map thread safety. Suggest using this, if possible, as it has a simpler API.
  • There is also the concurrent_unordered_map from Intel TBB. It is essentially the same thing, a key/value map. However, it is much older, much much lower level, and more difficult to use. One has to supply a hasher, a equality operator, and an allocator. There is no sample code anywhere, even in the official Intel docs. I never got it working, despite months of occasional attempts. It may be obsolete, as it is not mentioned in said free book (it only covers concurrent_hash_map). Not recommended.

Update: Reader/Writer Locks

There are actually two accessors, one is a read lock, one is a write lock:

  • const_accessor
  • accessor

If using find, use const_accessor which is a read lock. If using insert or erase, use accessor which is a write lock (i.e. it will wait until any reads are done, and block further reads until it is done).

This is effectively equivalent to a reader/writer lock, but on a single dictionary key in the dictonary, rather than the entire dictionary.

Update

Final part of the learning curve: for key writes, nothing happens until the accessor goes out of scope. So any locks are held for no more than a few machine instructions, probably using CAS (Compare And Swap).

Comparing this to a database, the scope of the accessor is like a transaction. When the accessor goes out of scope, the entire transaction is committed to the hashmap.

Update

The free book mentioned above has fantastic performance tips in the chapter on concurrent_hash_map.

Conclusion

The API for this hash map is powerful but somewhat awkward. However, it supports fine-grained, per-key locks on insert/update. Any locks are only held for a handful of machine instructions, using CAS. This is something that few other hashmaps can offer, in any language. Recommend starting with std::unordered_map for simplicity; it is thread safe as long as the two threads do not write to the same key. If blazingly fast performance is required, there is an option to either refactor, or write a compatible wrapper on top with [] accessors and insert_or_update().

Commence answered 8/3, 2020 at 9:30 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.