Does C++11 provide hashing functions for std::type_info?
Asked Answered
U

2

8

I'm still working on a good solution to my One-Of-A-Type Container Problem -- and upon reflection I think it would be nice to be able to just use something like a std::map<std::type_info, boost::any>. Unfortunately, std::type_info does not define an operator<, and I think it'd be unreasonable for it to define one.

However, it does seem reasonable to define a hash function for it, because you could simply use the singleton address of the std::type_info object as a reasonable "hash". Therefore, you'd be able to put a std::type_info into a std::unordered_map as the key.

Does C++11 provide such a hash function? Would using the memory address of the std::type_info singleton be a bad hash strategy?

Unicorn answered 23/8, 2010 at 22:16 Comment(4)
It's not a singleton, by the way, but a statically allocated object.Raskin
@GMan: What's the difference?Unicorn
If it was a singleton, there would be exactly one type_info object. Since there are multiple types in a program, there must be more than one type_info object in the program.Antigua
@James: Ah-- I meant one singleton per type. My point was that I thought there would be at most one type_info per type.Unicorn
A
9

The fact that type_info is not less-than comparable isn't as much a problem for using it as a map key as the fact that type_info is non-copyable. :-)

In C++03, type_info has a before() member function that provides an ordering of type_info objects.

In C++11, type_info has a hash_code() member function (C++11 §18.7.1/7):

size_t hash_code() const throw();

Returns: an unspecified value, except that within a single execution of the program, it shall return the same value for any two type_info objects which compare equal.

Remark: an implementation should return different values for two type_info objects which do not compare equal.

type_info objects resulting from the typeid operator exist until the end of the program, so it is safe to use a type_info* as a map key. However, to the best of my knowledge, there is no guarantee that if you apply typeid to two objects of the same type you will get two references to the same type_info object.

If you do use type_info* as a map key, I'd use a custom comparator that dereferences the pointers and compares the type_info objects themselves (using the aforementioned before() or hash_code() for ordering).

Antigua answered 23/8, 2010 at 22:32 Comment(19)
@James: D'oh! Perhaps I can use std::type_info * (because there's only once instance of a particular type_info class?) ?Unicorn
@James: That could be difficult because std::type_info objects don't provide an efficient way to compare themselves. Looks like it's back to the home brew solution for me :/Unicorn
@Billy: Wait, what? They have operator==, and although two objects of the same type may not return the same instance of a type_info object, they must compare equal (and generated the same hash).Raskin
@GMan: Yes, but std::map requires operator<, not operator==. I suppose I can use std::unordered_map but I'm trying to leave as much in C++03 as reasonably possible. (That said, if there was an off the shelf way to solve this in C++0x I'd consider that, hence the question :) )Unicorn
@Billy: You can, however, write a bool compare(const type_info* x, const type_info* y) { return x->before(*y); }, or compare the hash codes returned by hash_code() using operator<.Antigua
However, upon re-reading the text from the C++0x FCD, I don't think I'd rely on hash_code() returning a unique value for two different type_info objects... that "should" word makes me very, very nervous. I'd stick with before().Antigua
@James: Ah, that 'should' is dubious. Makes sense though; in the same way name() can return 0;, so can hash_code(). I think before() is the best option here.Raskin
@James + @GMan: Then the real question is why the standard does not define std::type_info::operator< ...Unicorn
@Billy: Likely because, semantically, it shouldn't. What does it mean to say one type is less than another?Mehta
@Billy: I'd guess it's because while the ordering is guaranteed to be consistent within a single execution of a program, the ordering is not guaranteed to be consistent across separate executions. I don't think it's typical for objects to compare differently in different executions of a given program. However, that's kind of lame and I just made that up. I could be completely wrong. :-)Antigua
@Dennis: I don't know but apparently they have already thought of one, because there is a before member. In this case I don't even care about whether the objects are really less than -- I just need equivalence tests to work.Unicorn
The hash function isn't required to return unique values because it's a hash function. A hashing container must deal with hash collisions. Ideally, the compiler would generate a perfect hash function for the typeinfo objects as they're all known at compile time. The standard isn't requiring it, possibly because it could be non-trivial.Hola
@caspin: the compiler could do so (perfect hash) for a single library, but how would you coordinate multiple libraries, knowing that template types are shared among them while other types will only appear in one ? It gets tricky...Chromatography
@Matthieu: Well, the type_info objects for different types must be distinct (they cannot compare equal to each other), so they must occupy different locations in memory and if you are on a platform where sizeof (size_t) == sizeof (uintptr_t), then you can simply cast the address of the type_info struct and use that as its hash code. This works for most (but certainly not all) systems.Antigua
@James McNellis: I am not that confident, would a basic_string<char, allocator<char> > have one instance per library, and thus possibly one typeinfo per library... and thus a variety of possible addresses (one per library) ? I really think templates could mess things up here...Chromatography
@Matthieu: Oops; you're right. I was neglecting what I said earlier, that applying typeid to two instances of a type might yield different type_info instances that compare equal.Antigua
Please see the answer below. I have no idea why this answer is the accepted one, but type_index is clearly the way to go (since the question is about C++11, for C++03 this answer can be useful)Archery
@Archery Well, type_index was added in C++11, and this answer was both written and accepted a full year before C++11 was finished.Antigua
Waring: this answer is partially incorrect! According to en.cppreference.com/w/cpp/types/type_info/hash_code: "No other guarantees are given: type_info objects referring to different types may have the same hash_code (although the standard recommends that implementations avoid this as much as possible), and hash_code for the same type can change between invocations of the same program."Poly
A
13

You could also use type_index, it safely holds a pointer to a type_info, it's copyable, comparable and a hash function is provided for standard containers.

Academia answered 19/11, 2012 at 15:32 Comment(0)
A
9

The fact that type_info is not less-than comparable isn't as much a problem for using it as a map key as the fact that type_info is non-copyable. :-)

In C++03, type_info has a before() member function that provides an ordering of type_info objects.

In C++11, type_info has a hash_code() member function (C++11 §18.7.1/7):

size_t hash_code() const throw();

Returns: an unspecified value, except that within a single execution of the program, it shall return the same value for any two type_info objects which compare equal.

Remark: an implementation should return different values for two type_info objects which do not compare equal.

type_info objects resulting from the typeid operator exist until the end of the program, so it is safe to use a type_info* as a map key. However, to the best of my knowledge, there is no guarantee that if you apply typeid to two objects of the same type you will get two references to the same type_info object.

If you do use type_info* as a map key, I'd use a custom comparator that dereferences the pointers and compares the type_info objects themselves (using the aforementioned before() or hash_code() for ordering).

Antigua answered 23/8, 2010 at 22:32 Comment(19)
@James: D'oh! Perhaps I can use std::type_info * (because there's only once instance of a particular type_info class?) ?Unicorn
@James: That could be difficult because std::type_info objects don't provide an efficient way to compare themselves. Looks like it's back to the home brew solution for me :/Unicorn
@Billy: Wait, what? They have operator==, and although two objects of the same type may not return the same instance of a type_info object, they must compare equal (and generated the same hash).Raskin
@GMan: Yes, but std::map requires operator<, not operator==. I suppose I can use std::unordered_map but I'm trying to leave as much in C++03 as reasonably possible. (That said, if there was an off the shelf way to solve this in C++0x I'd consider that, hence the question :) )Unicorn
@Billy: You can, however, write a bool compare(const type_info* x, const type_info* y) { return x->before(*y); }, or compare the hash codes returned by hash_code() using operator<.Antigua
However, upon re-reading the text from the C++0x FCD, I don't think I'd rely on hash_code() returning a unique value for two different type_info objects... that "should" word makes me very, very nervous. I'd stick with before().Antigua
@James: Ah, that 'should' is dubious. Makes sense though; in the same way name() can return 0;, so can hash_code(). I think before() is the best option here.Raskin
@James + @GMan: Then the real question is why the standard does not define std::type_info::operator< ...Unicorn
@Billy: Likely because, semantically, it shouldn't. What does it mean to say one type is less than another?Mehta
@Billy: I'd guess it's because while the ordering is guaranteed to be consistent within a single execution of a program, the ordering is not guaranteed to be consistent across separate executions. I don't think it's typical for objects to compare differently in different executions of a given program. However, that's kind of lame and I just made that up. I could be completely wrong. :-)Antigua
@Dennis: I don't know but apparently they have already thought of one, because there is a before member. In this case I don't even care about whether the objects are really less than -- I just need equivalence tests to work.Unicorn
The hash function isn't required to return unique values because it's a hash function. A hashing container must deal with hash collisions. Ideally, the compiler would generate a perfect hash function for the typeinfo objects as they're all known at compile time. The standard isn't requiring it, possibly because it could be non-trivial.Hola
@caspin: the compiler could do so (perfect hash) for a single library, but how would you coordinate multiple libraries, knowing that template types are shared among them while other types will only appear in one ? It gets tricky...Chromatography
@Matthieu: Well, the type_info objects for different types must be distinct (they cannot compare equal to each other), so they must occupy different locations in memory and if you are on a platform where sizeof (size_t) == sizeof (uintptr_t), then you can simply cast the address of the type_info struct and use that as its hash code. This works for most (but certainly not all) systems.Antigua
@James McNellis: I am not that confident, would a basic_string<char, allocator<char> > have one instance per library, and thus possibly one typeinfo per library... and thus a variety of possible addresses (one per library) ? I really think templates could mess things up here...Chromatography
@Matthieu: Oops; you're right. I was neglecting what I said earlier, that applying typeid to two instances of a type might yield different type_info instances that compare equal.Antigua
Please see the answer below. I have no idea why this answer is the accepted one, but type_index is clearly the way to go (since the question is about C++11, for C++03 this answer can be useful)Archery
@Archery Well, type_index was added in C++11, and this answer was both written and accepted a full year before C++11 was finished.Antigua
Waring: this answer is partially incorrect! According to en.cppreference.com/w/cpp/types/type_info/hash_code: "No other guarantees are given: type_info objects referring to different types may have the same hash_code (although the standard recommends that implementations avoid this as much as possible), and hash_code for the same type can change between invocations of the same program."Poly

© 2022 - 2024 — McMap. All rights reserved.