Why is std::hash not an overloaded function?
Asked Answered
N

2

11

I'm guessing std::hash is defined as a template struct in order to avoid implicit type conversions done during overloaded function resolution. Is it a correct thing to say?

I mean, I would prefer to write

std::string s;
size_t hash = std::hash(s);

instead of

std::string s;
size_t hash = std::hash<std::string>()(s);

But I'm guessing there's a reason for why the standards committee has chosen the second option.

Norry answered 7/7, 2013 at 21:34 Comment(1)
@RiaD Right, thanks. Suddenly, it feels even more awkward.Norry
W
14

It's impossible to partially specialize function template, and so for user-defined templated classes, the would be no way to specialize std::hash if it was a function. (you can only specialize templates from std namespace, but not overload, so users of your templated class couldn't create std::unordered_map<MyClass<whatever>, MyOtherClass>, they would be forced to choose std::unordered_map<MyClass<whatever>, MyOtherClass, ???>). So functor is solution here.

namespace std
{
    template<typename T>
    struct hash<MyVector<T>>
    {
        size_t operator()(const MyVector<T>& v)
        {
            //return hash from here
        }
    };
}

The alternative way for standard library would be using some SFINAE template trick to choose member .hash() as the default, and standard hash if the other case, but in most cases you can't retrofit interfaces (especially if using third-party code)

The other alternative would be like std::swap does (tricks with ADL):

//somewhere in std::unordered_map
using std::hash;
size_t h = hash(key);

In my experience, ADL is tricky and not everyone remembers about corner cases. Furthermore, the advantage of functors here is fact that you can use them as a template parameter, so you can just plug-in another functor for template (like std::unordered_map<A, B, specialized_hash<A>>) if you think that the default is wrongly suited for your case.

From comments:

But could you elaborate a bit more about std::swap? It's still there in C++11 and it has no problems with user-defined types, has it? Why keep a lot of different concepts in STL rather than making it more consistent?

There's a little difference between std::swap and std::hash:

In std::hash:

  • it's likely that, e.g. std::string hash defined by class author won't be enough for your case, that is, it's too generic, and you can guarantee in your hash map that you'll put strings of only one kind, so you can provide hash function that's faster or/and having less collisions.
  • there are many kind of hashes for different purposes so genericity is far less important here
  • in most cases it's possible for you to create a better hash

In std::swap:

  • it's unlikely that you'll want your own swap function, but you'll still probably want to use the one specific for this class, not the generic std::swap one that calls copy constructors.
  • in most cases it's not even possible for you to create a swap function, as it requires knowledge of the class internals (for example, std::vector can be implemented as dynamic array with pointer hidden as private field, so you won't be able to access them, not alone swap them, and even the fact that's implemented that way is not guaranteed)
  • there is (or should be) only one swap.
  • actually, there is a problem with std::swap: standard containers provide swap member function, std::swap can be specialized (but only for non-templated class), and swap can be defined as a free function that's found with ADL. How should you provide your own swap? IMO that is confusing, not the fact that std::swap is function and std::hash is a functor.

Why is STL inconsistent? I can only guess here, but main reason why STL is inconsistent is (a) bawkward compatibility and (b) C++ also is quite inconsistent.

Wilks answered 7/7, 2013 at 21:54 Comment(8)
@milleniumbug: Do you have a standard reference for where it says you can't overload functions defined in the std namespace? I would have thought ADL would find your hash function.Salzman
@user1131467 17.4.3.1/1 It is undefined for a C++ program to add declarations or definitions to namespace std or namespaces with namespace std unless otherwise specified. A program may add template specializations for any standard library template to namespace std. Such a specialization (complete or partial) of a standard library results in undefined behaviour unless the declaration depends on a user-defined name of external linkage and unless the template specialization meets the standard library requirements for the original template.(Copied from https://mcmap.net/q/16168/-how-to-overload-std-swap )Wilks
Sure, but if you define hash(MyType) in your namespace, and then std::unordered_map<T> called hash(x) from inside the std namespace, then std::unordered_map<MyType> would use your hash function due to ADL (this is one of the main points of it). The question was why does the standard library use a template functor rather than an overloaded function as just described.Salzman
@user1131467 I just have updated my answer - in short, functor has advantage that not only class creator can create its own hasher, but users of the class can override it if they find the default inappropriate (just like with allocators)Wilks
Sorry, try again: This proposal discusses how to avoid opening std namespace for hash specialization and how to use ADL: linkMulley
@Wilks I think I get the message that std::hash is supposed to be used as a template argument and not as a function. But could you elaborate a bit more about std::swap? It's still there in C++11 and it has no problems with user-defined types, has it? Why keep a lot of different concepts in STL rather than making it more consistent?Norry
@android Since the comment field was too short, I edited my answer and put the response there.Wilks
@JohanLundberg The code listing is about imaginary another-universe C++ standard library which would have hash as a function, just like OP suggests. The rest of the answer is about why this approach would be worse than what we have.Wilks
C
4

One possible reason is that in this way it's easier to use it in templates as default by changeable option

template <typrname T, typename = std::hash<T>...>
class unordered_set;

BTW you may create function that works this way

template<typename T, typename... Args>
auto hasher(Args&&... args) -> whatever { 
    return std::hash<T>(std::forward<Args>(args)...)( //maybe some &'s  skipped
}

or (to allow detect type)

template<typename T, typename... Args> 
auto hasher(T t) -> whatever {
    return std::hash<T>()(t);
}
Concurrent answered 7/7, 2013 at 21:35 Comment(5)
struct std::hasher<T> { size_t operator()(T &&x) { return std::hash(std::forward<T>(x)) } }; or something to that effect? Equality is done in a similar fashion, operator== is overloaded and std::equal_to<T> does (by default) delegate to operator==.Minestrone
It's possible, but I don't think it's so important since hash is not so usable feature(Usually data structures like sets need it(especially in generic way) only in. Ad if it isn't so important why we need 1 more level of indirection. As of == (in C): it was introduced before std::equal_to and equal_to was added to already existent featureConcurrent
As OP showed, it makes calling hash much simpler when you don't care about allowing a replacement. As for indirections: Have you looked at the standard library? ;-)Minestrone
It's possible they didn't think about direct using at all:)Concurrent
@delnan, it may be inverted btw:)Concurrent

© 2022 - 2024 — McMap. All rights reserved.