Why isn't std::hash<T> specialized for char*?

Why doesn't the C++ standard specify that std::hash<T> is specialized for char*, const char*, unsigned char*, const unsigned char*, etc? That is, such a specialization would hash the contents of the C string up to the terminating null.

Any harm in injecting my own specializations into the std namespace for my own code?

Irv answered 16/4, 2013 at 18:34 Comment(5)
I'd say char* and unsigned char* are unsafe in the context of hashing; the others, with the const qualifier, are fine. Do not inject into std::, just make specializations in your namespace. - Outworn
What is the lifetime of the const char* that your std::hash stores? (std::hash stores its key values as well as hashing them). How do you distinguish between null terminated const char* and pointers to individual characters? - Anima
std::hash is just a functor object. It hashes its argument. It doesn't store it. - Irv
Perfectly valid question and the answers are supported by facts and references. Voting to reopen. - Maeve
Related: std::hash value on char* value and not on memory address? - i.e. relevant if you want to know how to actually hash a C-string then ... - Burschenschaft

Why doesn't the C++ standard specify that std::hash<T> is specialized for char*, const char*, unsigned char*, const unsigned char*, etc?

It looks like it originated from proposal N1456. (emphasis mine)

Some earlier hash table implementations gave char* special treatment: it specialized the default hash function to look at character array being pointed to, rather than the pointer itself. This proposal removes that special treatment. Special treatment makes it slightly easier to use hash tables for C string, but at the cost of removing uniformity and making it harder to write generic code. Since naive users would generally be expected to use std::basic_string instead of C strings, the cost of special treatment outweighs the benefit.

If I'm interpreting this correctly, the reasoning is that supporting C style strings would break code that generically acts on hashes of pointers.

Any harm in injecting my own specializations into the std namespace for my own code?

There is potential harm, yes.

  • In the future, anything you add to the std namespace could collide with a name that a later version of the standard library introduces.
  • In the present, anything you add to the std namespace could be a "better match" for other components of the standard library, silently breaking behavior.
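
One way to avoid touching namespace std at all is to pass a custom hasher and equality predicate to the container. The following is a minimal sketch under stated assumptions: the namespace `my`, the names `CStrHash`, `CStrEqual` and `CStrMap`, and the choice of 64-bit FNV-1a are all illustrative and do not come from the standard or from the proposal.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>

namespace my {  // hypothetical user namespace; nothing is added to std

// Hashes the characters of a null-terminated string with 64-bit FNV-1a.
// FNV-1a is an illustrative choice, not what any standard library promises.
struct CStrHash {
    std::size_t operator()(const char* s) const noexcept {
        std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
        for (; *s != '\0'; ++s) {
            h ^= static_cast<unsigned char>(*s);
            h *= 1099511628211ULL;  // FNV prime
        }
        return static_cast<std::size_t>(h);
    }
};

// Compares contents, so equal strings at different addresses match.
struct CStrEqual {
    bool operator()(const char* a, const char* b) const noexcept {
        return std::strcmp(a, b) == 0;
    }
};

// The map stores only the pointers; the caller must keep the strings alive.
template <class V>
using CStrMap = std::unordered_map<const char*, V, CStrHash, CStrEqual>;

}  // namespace my
```

With this, a `my::CStrMap<int>` finds a key by its characters even when the lookup pointer differs from the stored one. Both functors must be supplied together: hashing by content while comparing by address would make lookups fail.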
Kearse answered 16/4, 2013 at 19:2 Comment(0)

char* (and its ilk) doesn't always mean a string. It can just as well point to a plain byte array, a binary file dump, or any number of other things. If you mean a string in C++, you generally use the std::string class.

As for creating your own specializations for char*, given the above it's a bad idea. For user-defined types, though, it is acceptable to create specializations of the std:: functions in the std:: namespace.

Mccarley answered 16/4, 2013 at 18:40 Comment(5)
About the only time char* is treated as a pointer-to-null-terminated-buffer in the C++ standard library is in inherited-from-C code, std::string and in some std::ostream code. I cannot think of anywhere else. - Anima
I know that char* doesn't always mean string. What I want to know is if your answer is what the C++ standard committee was thinking. - Irv
"it is acceptable to create specializations of the std:: functions in the std:: namespace." Is it? - Liverish
It's hard to know what they were thinking, but I would assume that since their code has to be all things to all people, it would certainly have been a consideration. I believe specializing const char* is explicitly not allowed, but I can't find the reference at the moment. - Mccarley
Citing from the C++0x N3000 draft, 17.6.3.2.1.1 (but the standards should contain a similar section): "[...] A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type of external linkage and the specialization meets the standard library requirements for the original template and is not explicitly prohibited." - Mccarley
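
As an illustration of the rule quoted above, here is a minimal sketch of a specialization that does depend on a user-defined type. The namespace `app` and the type `Name` are hypothetical and exist only for this example. A specialization for `const char*` would not qualify, because no user-defined type is involved.

```cpp
#include <cstddef>
#include <functional>
#include <string>

namespace app {  // hypothetical user namespace
struct Name {    // hypothetical user-defined type
    std::string value;
    bool operator==(const Name& other) const { return value == other.value; }
};
}  // namespace app

namespace std {
// Depends on the user-defined type app::Name, so the quoted rule allows it.
template <>
struct hash<app::Name> {
    size_t operator()(const app::Name& n) const noexcept {
        // Delegate to the standard string hash for the wrapped value.
        return hash<string>{}(n.value);
    }
};
}  // namespace std
```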

There is a standard specialization for pointer types:

template< class T > struct hash<T*>;

So it covers char* too, but it hashes the pointer value itself (the address), not the characters it points to, so a char* is not treated as a C-style string.
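
A small sketch of what the pointer specialization means for char* keys in practice: a container that uses the default std::hash<const char*> and std::equal_to<const char*> works with addresses, so two buffers holding the same characters count as two different keys. The function name `count_distinct_keys` is made up for this example.

```cpp
#include <cstddef>
#include <unordered_set>

// Inserts pointers to two distinct buffers with identical contents into a
// set that uses the default hash and equality for const char*.
// Returns how many elements the set ends up holding.
std::size_t count_distinct_keys() {
    static const char a[] = "key";
    static const char b[] = "key";  // same characters, different address
    const char* pa = a;
    const char* pb = b;
    std::unordered_set<const char*> keys;
    keys.insert(pa);
    keys.insert(pb);
    return keys.size();  // 2: the keys are addresses, not string contents
}
```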

If you mean a specialization for C-style strings, there is no technical obstacle to implementing one. But since C++ already has a specialization for std::string, a separate one for C-style strings was not considered worth having.

As for the second part of your question: you can inject anything you like into the std namespace, but what do you gain? It defeats the purpose of namespaces. Keep your code in your own namespace.

Soares answered 16/4, 2013 at 19:4 Comment(1)
It's worth the effort sometimes: if char* is used as the key and we hash it through the std::string hash, we pay for an extra memory allocation and copy on every hash. - Teleology
