How to specialize std::hash<Key>::operator() for user-defined type in unordered containers?
Asked Answered
I

3

124

To support user-defined key types in std::unordered_set<Key> and std::unordered_map<Key, Value> one has to provide operator==(Key, Key) and a hash functor:

struct X { int id; /* ... */ };
bool operator==(X a, X b) { return a.id == b.id; }

struct MyHash {
  size_t operator()(const X& x) const { return std::hash<int>()(x.id); }
};

std::unordered_set<X, MyHash> s;

It would be more convenient to write just std::unordered_set<X> with a default hash for type X, like for types coming along with the compiler and library. After consulting

  • C++ Standard Draft N3242 §20.8.12 [unord.hash] and §17.6.3.4 [hash.requirements],
  • Boost.Unordered
  • g++ include\c++\4.7.0\bits\functional_hash.h
  • VC10 include\xfunctional
  • various related questions in Stack Overflow

it seems possible to specialize std::hash<X>::operator():

namespace std { // argh!
  template <>
  inline size_t 
  hash<X>::operator()(const X& x) const { return hash<int>()(x.id); } // works for MS VC10, but not for g++
  // or
  // hash<X>::operator()(X x) const { return hash<int>()(x.id); }     // works for g++ 4.7, but not for VC10 
}                                                                             

Given compiler support for C++11 is yet experimental---I did not try Clang---, these are my questions:

  1. Is it legal to add such a specialization to namespace std? I have mixed feelings about that.

  2. Which of the std::hash<X>::operator() versions, if any, is compliant with C++11 standard?

  3. Is there a portable way to do it?

Iridescent answered 16/11, 2011 at 20:5 Comment(2)
With gcc 4.7.2, I had to provide a global operator==(const Key, const Key)Pasol
Note that specialization of std::hash (unlike other things in std namespace) are discouraged by Google style guide; take it with a grain of salt.Ouachita
K
156

You are expressly allowed and encouraged to add specializations to namespace std*. The correct (and basically only) way to add a hash function is this:

namespace std {
  template <> struct hash<Foo>
  {
    size_t operator()(const Foo & x) const
    {
      /* your code here, e.g. "return hash<int>()(x.value);" */
    }
  };
}

(Other popular specializations that you might consider supporting are std::less, std::equal_to and std::swap.)

*) as long as one of the involved types is user-defined, I suppose.

Kneel answered 16/11, 2011 at 20:7 Comment(12)
while this is possible, would you in general recommend doing it this way? I'd prefer instantiation unorder_map<eltype, hash, equality> instead, to avoid ruining someone's day with funny ADL business. (Edit Pete Becker's advice on this topic)Cabretta
@sehe: If you have a hash functor lying around that is default-constructible, perhaps, but why? (Equality is easier, since you'd just implement member-operator==.) My general philosophy is that if the function is natural and essentially the only "correct" one (like lexicographic pair comparison), then I add it to std. If it's something peculiar (like unordered pair comparison), then I make it specific to a container type.Kneel
Should be size_t operator()(const Foo & x) const.Bawdry
It works on both compilers when adding a const modifier for the method. Thanks!Escent
FWIW, I posted my lightweight take on this (which, incidentally, in no way requires a function object; even if you chose to use one, it didn't need to be default constructible, AFAICT)Cabretta
@KerrekSB (or anyone else) I don't understand the "template<> struct hash<Foo>". Could you explain what it means and why you're doing this please ? ThxDissuasion
@Virus721: That's the syntax for template specialization.Kneel
I am not disagreeing, but where in the standard are we allowed and encouraged to add specializations to std?Cambric
@razeh: all over the place, e.g. hash, but also all sorts of traits... allocator_traits, pointer_traits, iterator_traits...Kneel
@Kerrek, I agree, but I was hoping for a chapter and verse reference to a place in the standard. I found the wording allowing the injection at 17.6.4.2.1 where it says it is not allowed "unless otherwise specified", but I haven't been able to find the "otherwise specified" part amid the 4000+ page specification.Cambric
@Cambric here you can read "A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited.". So this solution is ok.Corneliuscornell
This answer seems to have been copied to https://coderedirect.com/.... No credits given.Faustinafaustine
C
7

My bet would be on the Hash template argument for the unordered_map/unorder_set/... classes:

#include <unordered_set>
#include <functional>

struct X 
{
    int x, y;
    std::size_t gethash() const { return (x*39)^y; }
};

typedef std::unordered_set<X, std::size_t(*)(const X&)> Xunset;
typedef std::unordered_set<X, std::function<std::size_t(const X&)> > Xunset2;

int main()
{
    auto hashX = [](const X&x) { return x.gethash(); };

    Xunset  my_set (0, hashX);
    Xunset2 my_set2(0, hashX); // if you prefer a more flexible set typedef
}

Of course

  • hashX could just as well be a global static function
  • in the second case, you could pass that
    • the oldfashioned functor object (struct Xhasher { size_t operator(const X&) const; };)
    • std::hash<X>()
    • any bind expression satisfying the signature -
Cabretta answered 16/11, 2011 at 20:36 Comment(5)
I appreciate that you can write something that doesn't have a default constructor, but I always find that requiring each map construction to remember the additional argument is a bit of a burden -- quite a bit too much of a burden for my taste. I'm OK with an explicit template argument, though specialising std::hash is still the nicest way out :-)Kneel
user-defined types to the rescue :-) More seriously, I hope we'd slap them on the wrists already at the stage where their class contains a char*!Kneel
Hmm... do you have an example of how a hash specialization interferes via ADL? I mean, it's entirely plausible, but I have a hard time coming up with a problem case.Kneel
let us continue this discussion in chatCabretta
It's all fun and games until you need a std::unordered_map<Whatever, Xunset> and it doesn't work because your Xunset hasher type isn't default constructible.Rhombus
B
4

@Kerrek SB has covered 1) and 3).

2) Even though g++ and VC10 declare std::hash<T>::operator() with different signatures, both library implementations are Standard compliant.

The Standard does not specify the members of std::hash<T>. It just says that each such specialization must satisfy the same "Hash" requirements needed for the second template argument of std::unordered_set and so on. Namely:

  • Hash type H is a function object, with at least one argument type Key.
  • H is copy constructible.
  • H is destructible.
  • If h is an expression of type H or const H, and k is an expression of a type convertible to (possibly const) Key, then h(k) is a valid expression with type size_t.
  • If h is an expression of type H or const H, and u is an lvalue of type Key, then h(u) is a valid expression with type size_t which does not modify u.
Bawdry answered 16/11, 2011 at 20:24 Comment(2)
No, neither implementation is standard compliant, since they attempt to specialize std::hash<X>::operator() rather than std::hash<X> as a whole, and the signature of std::hash<T>::operator() is implementation-defined.Aqualung
@ildjarn: Clarified - I was talking about the library implementations, not the attempted specializations. Not sure which exactly the OP meant to ask.Bawdry

© 2022 - 2024 — McMap. All rights reserved.