What is the best way to use a HashMap in C++?
Asked Answered
J

5

274

I know that STL has a HashMap API, but I cannot find any good and thorough documentation with good examples regarding this.

Any good examples will be appreciated.

Jaunita answered 26/8, 2010 at 18:8 Comment(2)
Are you asking about the C++1x hash_map, or about the std::map ?Otherness
I want something like a java.util.HashMap in C++ and the standarized way to do it if there is one. Else the best non standard library. What do C++ developers commonly use when they need a HashMap?Jaunita
G
381

The standard library includes the ordered and the unordered map (std::map and std::unordered_map) containers. In an ordered map (std::map) the elements are sorted by the key, insert and access is in O(log n). Usually the standard library internally uses red black trees for ordered maps. But this is just an implementation detail. In an unordered map (std::unordered_map) insert and access is in O(1). It is just another name for a hashtable.

An example with (ordered) std::map:

#include <map>
#include <iostream>
#include <cassert>

int main(int argc, char **argv)
{
  std::map<std::string, int> m;
  m["hello"] = 23;
  // check if key is present
  if (m.find("world") != m.end())
    std::cout << "map contains key world!\n";
  // retrieve
  std::cout << m["hello"] << '\n';
  std::map<std::string, int>::iterator i = m.find("hello");
  assert(i != m.end());
  std::cout << "Key: " << i->first << " Value: " << i->second << '\n';
  return 0;
}

Output:

23
Key: hello Value: 23

If you need ordering in your container and are fine with the O(log n) runtime then just use std::map.

Otherwise, if you really need a hash-table (O(1) insert/access), check out std::unordered_map, which has a similar to std::map API (e.g. in the above example you just have to search and replace map with unordered_map).

The unordered_map container was introduced with the C++11 standard revision. Thus, depending on your compiler, you have to enable C++11 features (e.g. when using GCC 4.8 you have to add -std=c++11 to the CXXFLAGS).

Even before the C++11 release GCC supported unordered_map - in the namespace std::tr1. Thus, for old GCC compilers you can try to use it like this:

#include <tr1/unordered_map>

std::tr1::unordered_map<std::string, int> m;

It is also part of boost, i.e. you can use the corresponding boost-header for better portability.

Gynarchy answered 26/8, 2010 at 18:26 Comment(3)
While the standard library does not have a hash table-based container, almost all implementations include hash_map from the SGI STL in some form or another.Glade
@JamesMcNellis which is advised unordered_map or hash_map for HashMap implementationPolyphony
@ShameelMohamed, 2017, i.e. 6 years after C++11 it should be hard to find an STL that doesn't provide unordered_map. Thus, there is no reason to consider the non-standard hash_map.Gynarchy
B
43

A hash_map is an older, unstandardized version of what for standardization purposes is called an unordered_map (originally in TR1, and included in the standard since C++11). As the name implies, it's different from std::map primarily in being unordered -- if, for example, you iterate through a map from begin() to end(), you get items in order by key1, but if you iterate through an unordered_map from begin() to end(), you get items in a more or less arbitrary order.

An unordered_map is normally expected to have constant complexity. That is, an insertion, lookup, etc., typically takes essentially a fixed amount of time, regardless of how many items are in the table. An std::map has complexity that's logarithmic on the number of items being stored -- which means the time to insert or retrieve an item grows, but quite slowly, as the map grows larger. For example, if it takes 1 microsecond to lookup one of 1 million items, then you can expect it to take around 2 microseconds to lookup one of 2 million items, 3 microseconds for one of 4 million items, 4 microseconds for one of 8 million items, etc.

From a practical viewpoint, that's not really the whole story though. By nature, a simple hash table has a fixed size. Adapting it to the variable-size requirements for a general purpose container is somewhat non-trivial. As a result, operations that (potentially) grow the table (e.g., insertion) are potentially relatively slow (that is, most are fairly fast, but periodically one will be much slower). Lookups, which cannot change the size of the table, are generally much faster. As a result, most hash-based tables tend to be at their best when you do a lot of lookups compared to the number of insertions. For situations where you insert a lot of data, then iterate through the table once to retrieve results (e.g., counting the number of unique words in a file) chances are that an std::map will be just as fast, and quite possibly even faster (but, again, the computational complexity is different, so that can also depend on the number of unique words in the file).


1 Where the order is defined by the third template parameter when you create the map, std::less<T> by default.

Boloney answered 26/8, 2010 at 18:28 Comment(2)
I realize I'm coming 9 years after the answer was posted but... do you have a link to a doc that mentions the fact that an unordered map can shrink in size ? Usually, std collections only grow. Moreover, if you insert a lot of data but know in advance more or less how many keys you'll insert, you can specify the size of the map on creation, which basically nullifies the resize cost (cause there won't be any).Sarthe
@Zonko: Sorry, I didn't notice this when asked. As far as I know, an unordered_map doesn't shrink, except in response to calling rehash. When you call rehash, you specify a size for the table. That size will be used unless doing so would exceed the specified maximum load factor for the table (in which case, the size will be increased automatically to keep the load factor within limits).Boloney
P
23

Here's a more complete and flexible example that doesn't omit necessary includes to generate compilation errors:

#include <iostream>
#include <unordered_map>

class Hashtable {
    std::unordered_map<const void *, const void *> htmap;

public:
    void put(const void *key, const void *value) {
            htmap[key] = value;
    }

    const void *get(const void *key) {
            return htmap[key];
    }

};

int main() {
    Hashtable ht;
    ht.put("Bob", "Dylan");
    int one = 1;
    ht.put("one", &one);
    std::cout << (char *)ht.get("Bob") << "; " << *(int *)ht.get("one");
}

Still not particularly useful for keys, unless they are predefined as pointers, because a matching value won't do! (However, since I normally use strings for keys, substituting "string" for "const void *" in the declaration of the key should resolve this problem.)

Pentode answered 29/4, 2014 at 14:48 Comment(2)
I have to say, this example is a very bad practice in C++. You are using a strongly typed language and destroying it by using void*. For starters, there's no reason to wrap the unordered_map as it's part of the standard and reduces code maintainability. Next, if insist on wrapping it, use templates. That's exactly what they are for.Villareal
Strongly typed? You probably mean statically typed. The fact that he can go from const char ptr to void silently makes C++ statically, but not strongly, typed. There's types, but compiler won't say a thing unless you enable some obscure flag that most likely doesn't exist.Insecure
P
15

Evidence that std::unordered_map uses a hash map in GCC stdlibc++ 6.4

This was mentioned at: https://mcmap.net/q/23167/-what-is-the-best-way-to-use-a-hashmap-in-c but in the following answer: What data structure is inside std::map in C++? I have given further evidence of such for the GCC stdlibc++ 6.4 implementation by:

  • GDB step debugging into the class
  • performance characteristic analysis

Here is a preview of the performance characteristic graph described in that answer:

enter image description here

How to use a custom class and hash function with unordered_map

This answer nails it: C++ unordered_map using a custom class type as the key

Excerpt: equality:

struct Key
{
  std::string first;
  std::string second;
  int         third;

  bool operator==(const Key &other) const
  { return (first == other.first
            && second == other.second
            && third == other.third);
  }
};

Hash function:

namespace std {

  template <>
  struct hash<Key>
  {
    std::size_t operator()(const Key& k) const
    {
      using std::size_t;
      using std::hash;
      using std::string;

      // Compute individual hash values for first,
      // second and third and combine them using XOR
      // and bit shifting:

      return ((hash<string>()(k.first)
               ^ (hash<string>()(k.second) << 1)) >> 1)
               ^ (hash<int>()(k.third) << 1);
    }
  };

}
Plantaineater answered 19/12, 2017 at 22:31 Comment(0)
O
1

For those of us trying to figure out how to hash our own classes whilst still using the standard template, there is a simple solution:

  1. In your class you need to define an equality operator overload ==. If you don't know how to do this, GeeksforGeeks has a great tutorial https://www.geeksforgeeks.org/operator-overloading-c/

  2. Under the standard namespace, declare a template struct called hash with your classname as the type (see below). I found a great blogpost that also shows an example of calculating hashes using XOR and bitshifting, but that's outside the scope of this question, but it also includes detailed instructions on how to accomplish using hash functions as well https://prateekvjoshi.com/2014/06/05/using-hash-function-in-c-for-user-defined-classes/

namespace std {

  template<>
  struct hash<my_type> {
    size_t operator()(const my_type& k) {
      // Do your hash function here
      ...
    }
  };

}
  1. So then to implement a hashtable using your new hash function, you just have to create a std::map or std::unordered_map just like you would normally do and use my_type as the key, the standard library will automatically use the hash function you defined before (in step 2) to hash your keys.
#include <unordered_map>

int main() {
  std::unordered_map<my_type, other_type> my_map;
}
Observation answered 16/9, 2019 at 20:52 Comment(1)
Or check out this Stackoverflow question that is about hashing user defined classes ...Gynarchy

© 2022 - 2024 — McMap. All rights reserved.