std::any without RTTI, how does it work?

3

47

std::any can be used with RTTI switched off: the following example compiles and runs as expected with gcc even when -fno-rtti is given.

#include <any>
#include <iostream>

int main()
{
    std::any x;
    x = 9.9;  // stores a double
    std::cout << std::any_cast<double>(x) << std::endl;
}

But how does std::any store the type information? As far as I can see, calling std::any_cast with the "wrong" type throws a std::bad_any_cast exception, as expected.
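The behaviour described above can be checked with a small helper. This is only a sketch; `cast_throws` is a hypothetical name introduced for illustration and is not part of the standard library.

```cpp
#include <any>

// Hypothetical helper: returns true if casting `value` to T throws std::bad_any_cast.
template <typename T>
bool cast_throws(const std::any& value) {
    try {
        (void)std::any_cast<T>(value);  // by-value cast; throws on a type mismatch
        return false;
    } catch (const std::bad_any_cast&) {
        return true;
    }
}
```

For a std::any holding 9.9, `cast_throws<double>` should report false and `cast_throws<int>` true. The check is on the exact stored type, not on convertibility.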

How is that implemented, or is this maybe only a gcc feature?

I found that boost::any does not need RTTI either, but I could not find out how that is solved there. Does boost::any need RTTI?.

Digging into the STL header itself gave me no answer. That code is nearly unreadable to me.

Misprint answered 16/7, 2018 at 12:13 Comment(4)
Boost has its own typeinfo that replaces RTTI, that's why boost::any does not need it. Generally I do not see any other possibility than implementing one's own typeinfo that does not depend on RTTIMelissiamelita
any has method type() that returns a type_info, does it really run without rtti?Boar
@bipll: No, exactly that function is switched off if RTTI is off. So under the hood, there is something which can generate typeid-like information. But it seems to be the dark side of the implementation ;)Misprint
boost type_info source here: github.com/boostorg/core/blob/develop/include/boost/core/… enjoy :)Italy
R
61

TL;DR; std::any holds a pointer to a static member function of a templated class. This function can perform many operations and is specific to a given type since the actual instance of the function depends on the template arguments of the class.


The implementation of std::any in libstdc++ is not that complex, you can have a look at it:

https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/any

Basically, std::any holds two things:

  • A pointer to a (dynamically) allocated storage;
  • A pointer to a "storage manager function":
void (*_M_manager)(_Op, const any*, _Arg*);

When you construct or assign a new std::any with an object of type T, _M_manager points to a function specific to the type T (which is actually a static member function of a class specific to T):

template <typename _ValueType, 
          typename _Tp = _Decay<_ValueType>,
          typename _Mgr = _Manager<_Tp>, // <-- Class specific to T.
          __any_constructible_t<_Tp, _ValueType&&> = true,
          enable_if_t<!__is_in_place_type<_Tp>::value, bool> = true>
any(_ValueType&& __value)
  : _M_manager(&_Mgr::_S_manage) { /* ... */ }

Since this function is specific to a given type, you don't need RTTI to perform the operations required by std::any.

Furthermore, it is easy to check that you are casting to the right type within std::any_cast. Here is the core of the gcc implementation of std::any_cast:

template<typename _Tp>
void* __any_caster(const any* __any) {
    if constexpr (is_copy_constructible_v<decay_t<_Tp>>) {
        if (__any->_M_manager == &any::_Manager<decay_t<_Tp>>::_S_manage) {
            any::_Arg __arg;
            __any->_M_manager(any::_Op_access, __any, &__arg);
            return __arg._M_obj;
        }
    }
    return nullptr;
}

You can see that it is simply an equality check between the function pointer stored inside the object you are trying to cast (__any->_M_manager) and the manager function of the type you want to cast to (&any::_Manager<decay_t<_Tp>>::_S_manage).


The class _Manager<_Tp> is actually an alias to either _Manager_internal<_Tp> or _Manager_external<_Tp>, depending on _Tp. This class is also used for the allocation and construction of the stored object.
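To make the idea concrete, here is a heavily simplified sketch of the same technique. It is not the libstdc++ code: `mini_any`, `manager`, `manage` and `get` are names made up for this illustration, the class always heap-allocates (no small-object optimization), and it is move-less and copy-less to keep it short.

```cpp
#include <utility>

// Simplified sketch, not the libstdc++ code: always heap-allocates, not copyable.
class mini_any {
    enum class op { destroy };
    using manager_fn = void (*)(op, mini_any*);

    // One manager class per stored type T; the address of manage() acts as the type tag.
    template <typename T>
    struct manager {
        static void manage(op what, mini_any* self) {
            if (what == op::destroy) {
                delete static_cast<T*>(self->storage_);
            }
        }
    };

    void* storage_ = nullptr;
    manager_fn manager_ = nullptr;

public:
    mini_any() = default;

    template <typename T>
    explicit mini_any(T value)
        : storage_(new T(std::move(value))), manager_(&manager<T>::manage) {}

    mini_any(const mini_any&) = delete;
    mini_any& operator=(const mini_any&) = delete;

    ~mini_any() {
        if (manager_) manager_(op::destroy, this);
    }

    // Returns a pointer to the stored T, or nullptr if another type (or nothing) is stored.
    template <typename T>
    T* get() {
        if (manager_ == &manager<T>::manage) {
            return static_cast<T*>(storage_);
        }
        return nullptr;
    }
};
```

With `mini_any a(9.9);`, `a.get<double>()` should return a pointer to the stored value and `a.get<int>()` should return nullptr. The sketch relies on the toolchain keeping `manager<T>::manage` at a distinct address for each T, which is the same assumption the real implementation makes.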

Rabkin answered 16/7, 2018 at 13:10 Comment(15)
In short: they store a pointer to a static instance of a templated function, which is unique because the instance of that template depends on the given type. Am I right?Misprint
@Misprint In short, yes ;)Rabkin
Note that this construct (a pointer to static function template instance being unique for each template type) breaks with some compiler optimizations, such as MSVC's /Gy combined with its linker's /OPT:ICF (see the note on this page).Kalie
@Kalie yes, a compiler with such optimisations would have to implement any differently. I don't know if MSVC lets you turn off RTTIPotheen
Note that this depends on the linker and loader coalescing multiple instantiations of the function and so doesn't work on MinGW across DLL boundaries (#45290796)Clinic
While this is a useful answer, you did not address the issue of how types are compared for equality upon any_cast, so that bad_any_cast can be thrown if necessary. Both points were explicitly mentioned by the OP. (The easiest approach it seems would be to equality-compare the stored function pointer to the one expected from the instantiation of any_cast.)Transhumance
Couldn't it be done even simpler, without a function pointer? Since when you cast you explicitly say you want it as a double, couldn't it just return the storage/memory where it constructed the double as type double? That is static, no need for a function pointer then.Scud
@Scud You need the function pointer for two reasons. First, to do the comparison with the originally stored function pointer, to be able to check for the cast. Second, which is more an optimization, you don't really know where the data is actually stored since you can optimize for small object (to avoid heap allocation) and that's the manager (the function pointer) that knows how to "fetch" the data.Rabkin
You said: “Since this function is specific to a given type, you don't need RTTI to perform the operations required by std::any.“ This isn’t true. If this function was, for example, void foo<T>::bar() {}, the compiler would be allowed to create one function for all types T. What in the standard makes the function unique to the type?Templeton
@Templeton That's not that simple, see e.g. #26534240 (TL;DR; this is not clear in the standard). This is gcc implementation, and according to this answer, both gcc and llvm will not do such optimization (since those are not considered standard by their developers, as far as I understand it). If MSVC allows for std::any without RTTI, then it probably uses something different.Rabkin
@Rabkin The reason for my comment is that, on Windows, at least clang and msvc do this optimization for a void foo<T>::bar() {} function when compiled with optimizations. When compiled without optimizations, they don't. I skipped testing std::any and wrote my own code and this was an issue. I'm trying to reproduce on godbolt...Templeton
@Rabkin Also, I think that fundamentally didn't answer the question: does the standard require function definitions to have different addresses? If the answer is "yes," then you can still have &foo<double>::bar not be the same address as &foo<int>::bar, but somehow still have foo<double>::bar() and foo<int>::bar() run the same code at the same address. I don't know how a compiler could make those two things work at the same time, without inventing a new mechanism, but they're free to do that as far as I can tell. OTOH, if the answer is "no," then all bets are off.Templeton
@Templeton The question I linked is about your exact question, so I can’t give you more than what is in the answers. The first answer says that this is not clear in the standard, and you can read it. There are quotes from both gcc and clang developers saying that such optimization is an abuse of the language. I can’t say more than they do. My answer targets, and this question targets, std::any from libstdc++; if you are not satisfied with the answers to the linked question, you should comment on it or ask a new one.Rabkin
@Templeton To be a bit more clear - The standard answer to the defect (from the link) is [...] implementations are free to optimize within the constraints of the “as-if” rule., and what is not clear (according to gcc/clang developers) is the as-if rule in this case. I let you read the answer that contains quotes and comments about this. Regarding my answer, this is about std::any in gcc libstdc++ and the _S_manage function. Since this function is from libstdc++, gcc is right to do what it wants as long as the observable behavior matches the standard for std::any.Rabkin
I mean, I disagree a bit. If you want to answer how GCC does things, you need a big asterisk on the answer that says, “this is specific to GCC, and may not work for other compilers with RTTI disabled.” If you want to answer how it works generally, you need to point to the standard, and maybe have an asterisk that major compilers don’t adhere to the standard. As it stands the question is about the standard (not specific to GCC), and it’s a big hole that your answer doesn’t work on Windows with 2 major compilers; the answer should be complete enough to be of practical use, even on Windows.Templeton
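The folding concern raised in these comments is about functions with identical bodies. One possible way around it, sketched below, is to compare the addresses of per-type static objects instead of function addresses, since distinct objects must have distinct addresses. `type_tag` and `type_key` are hypothetical names; the sketch assumes the linker does not merge writable data, and it does not solve the DLL-boundary problem mentioned above.

```cpp
#include <type_traits>

// Hypothetical helper: one non-const static object per type T.
template <typename T>
struct type_tag {
    static char value;  // distinct objects must have distinct addresses
};

template <typename T>
char type_tag<T>::value = 0;

// Opaque type identity: references and cv-qualifiers are stripped first,
// so int, const int and const int& all map to the same key.
template <typename T>
const void* type_key() {
    return &type_tag<std::remove_cv_t<std::remove_reference_t<T>>>::value;
}
```

An any-like class could store the result of `type_key<T>()` next to its data and compare it against `type_key<U>()` in its cast function, leaving the manager function pointer free to be folded.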
G
7

Manual implementation of a limited RTTI is not that hard. You're gonna need static generic functions. That much I can say without providing a complete implementation. Here is one possibility:

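#include <atomic>
#include <cstddef>
#include <type_traits>

class meta{
    static std::size_t id(){
        static std::atomic<std::size_t> nextid{};
        return ++nextid;//globally unique, starts at 1 (0 means "no type")
    }
    template<typename T>
    static std::size_t type_id(){
        static const std::size_t tid{id()};//one id per distinct T
        return tid;
    }
    std::size_t mid=0;//per instance type id
public:
    //T is decayed so that int, int& and const int& share one id;
    //meta itself is excluded so that copies use the copy constructor
    template<typename T,
             typename=std::enable_if_t<!std::is_same<std::decay_t<T>,meta>::value>>
    meta(T&&):mid{type_id<std::decay_t<T>>()}{}
    meta(meta const&)=default;
    meta(meta&&)=default;
    meta()=default;
    template<typename T>
    bool is_a(T&&) const{return mid==type_id<std::decay_t<T>>();}
};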

This is my first observation; it is far from ideal and misses many details. One may use an instance of meta as a non-static data member of one's own implementation of std::any.

Grania answered 16/7, 2018 at 12:46 Comment(5)
The question was: "How is that realized or is this maybe only a gcc feature?" That it can be done is not an answer and that it can be done is quite clear as the provided code compiles and work!Misprint
I did mention how: static generic functions. but details kill. lots of metadata can be embeded, and a wide range of implementations are possible. really hard to pick one of many possibilities for the answer. can The OP withstand that much?Grania
Curious why this has -5 when another answer saying the same thing in seemingly more vague and less thread-safe terms has +3...?Paletot
@Paletot thanks for your attention. originally missed the snippet, for I was reluctant to add. It was -3 then and should've stopped there. but that is the social avalanche effect, people just follow each other. I have already pointed out the better answer in a comment: the accepted one. and don't really care about votes. plz read holt's.Grania
I voted up. This is a good technical background. This answer has been recklessly down voted for not good reasons.Liven
M
4

One possible solution is to generate a unique id for every type that may be stored in any (I assume You know more or less how any works internally). The code that does this may look something like this:

struct id_gen{
    static int &i(){
        static int i = 0;
        return i;
    }

    template<class T>
    struct gen{
        static int id() {
            static int id = i()++;
            return id;
        }
    };    
};

With this implemented You can use the id of the type instead of RTTI typeinfo to quickly check the type.

Notice the usage of static variables inside functions and static functions. This is done to avoid the problem of undefined order of static variable initialization.
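As an illustration of how such ids could replace a typeinfo comparison, here is a small sketch. `tagged_ptr` is a hypothetical class made up for this example; like the generator itself, it is not thread-safe, because the shared counter is a plain int.

```cpp
// Same generator as above, repeated so the example is self-contained.
struct id_gen {
    static int& i() { static int i = 0; return i; }
    template <class T>
    struct gen {
        static int id() { static int id = i()++; return id; }
    };
};

// Hypothetical type-tagged pointer: remembers the id of the type it was built from.
class tagged_ptr {
    void* ptr_;
    int type_id_;
public:
    template <class T>
    explicit tagged_ptr(T* p) : ptr_(p), type_id_(id_gen::gen<T>::id()) {}

    // Returns the pointer as T* if the ids match, nullptr otherwise.
    template <class T>
    T* get() const {
        return type_id_ == id_gen::gen<T>::id() ? static_cast<T*>(ptr_) : nullptr;
    }
};
```

Given `double d = 9.9; tagged_ptr p(&d);`, `p.get<double>()` should return `&d` and `p.get<int>()` should return nullptr.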

Melissiamelita answered 16/7, 2018 at 12:55 Comment(4)
you'd need to put the initialisation of id_gen::i into its own compilation unit I thinkItaly
@RichardHodges I am really unsure about it, maybe You are right. Anyway, as this proof of concept shows any without RTTI is possible. Feel free to edit my answerMelissiamelita
See static initialization order fiasco.Clinic
@Clinic This is not exactly the static initialization order fiasco situation, but anyway You are right, the order of static initialization is undefined. I changed the answer.Melissiamelita
