Circumventing RTTI on legacy code

Asked 25/8, 2014 at 23:15 Answered 26/8, 2014 at 0:10

I have been looking for a way to get around the slowness of dynamic_cast type checking. Before you start saying I should redesign everything, let me inform you that the design was decided on five years ago. I can't fix all 400,000 lines of code that came after (I wish I could), but I can make some changes. I have run this little test on type identification:

#include <iostream>
#include <typeinfo>
#include <stdint.h>
#include <ctime>

using namespace std;

#define ADD_TYPE_ID \
    static intptr_t type() { return reinterpret_cast<intptr_t>(&type); }\
    virtual intptr_t getType() { return type(); }

struct Base
{
    ADD_TYPE_ID;
};

template <typename T>
struct Derived : public Base
{
    ADD_TYPE_ID;
};

int main()
{
    Base* b = new Derived<int>();
    cout << "Correct Type: " << (b->getType() == Derived<int>::type()) << endl; // true
    cout << "Template Type: " << (b->getType() == Derived<float>::type()) << endl; // false
    cout << "Base Type: " << (b->getType() == Base::type()) << endl; // false

    clock_t begin = clock();
    {
        for (size_t i = 0; i < 100000000; i++)
        {
            if (b->getType() == Derived<int>::type())
                Derived <int>* d = static_cast<Derived<int>*> (b);
        }
    }
    clock_t end = clock();
    double elapsed = double(end - begin) / CLOCKS_PER_SEC;

    cout << "Type elapsed: " << elapsed << endl;

    begin = clock();
    {
        for (size_t i = 0; i < 100000000; i++)
        {
            Derived<int>* d = dynamic_cast<Derived<int>*>(b);
            if (d);
        }
    }
    end = clock();
    elapsed = double(end - begin) / CLOCKS_PER_SEC;

    cout << "Type elapsed: " << elapsed << endl;

    begin = clock();
    {
        for (size_t i = 0; i < 100000000; i++)
        {
            Derived<int>* d = dynamic_cast<Derived<int>*>(b);
            if ( typeid(d) == typeid(Derived<int>*) )
                static_cast<Derived<int>*> (b);
        }
    }
    end = clock();
    elapsed = double(end - begin) / CLOCKS_PER_SEC;

    cout << "Type elapsed: " << elapsed << endl;

    return 0;
}

It seems that using the class ID (the first timed solution above) would be the fastest way to do type checking at runtime. Will this cause any problems with threading? Is there a better way to check for types at runtime (without much refactoring)?

Edit: Might I also add that this needs to work with the TI compilers, which currently only support up to C++03.

Foss asked 25/8, 2014 at 23:15 Comment(11)

Get an account on Career Overflow? – Amati 25/8, 2014 at 23:17

I have no idea what if ( typeid(d) == typeid(Derived<int>*) ) is intended to do. Besides, none of your tests seem to have side effects, so they could all be dropped by the optimizer. – Amish 25/8, 2014 at 23:19

@KerrekSB You mean the OP should quit the job, before starting to struggle with that legacy code? – Centro 25/8, 2014 at 23:20

@πάνταῥεῖ: Well, if I were told I had to make 400k loc of slow legacy code fast, I would like to know my options if this fails... – Amati 25/8, 2014 at 23:23

I don't think you need to invent your own type IDs. You should be able to just use typeid(Base) etc. Those are static already. (That's basically how boost::any does it.) – Amati 25/8, 2014 at 23:24

@KerrekSB "Well, if I were told I had to make 400k loc ..." I'd try to develop some tools to support refactoring (estimate efforts and success chances respectively) ;) ... – Centro 25/8, 2014 at 23:24

As I thought, the second and third test seem to be dropped entirely by the optimizer (clang++, g++). The first one isn't, probably due to the virtual function call. – Amish 25/8, 2014 at 23:26

Agreed with @dyp, the third variant seems pointless. You've already done the dynamic cast. – Amati 25/8, 2014 at 23:37

@dyp: By adding volatiles in suitable places, you can get the loops to execute. – Amati 25/8, 2014 at 23:37

I am not very familiar with the optimizer. Are there any good resources that you would recommend for brushing up on how it works? The system was designed to have a Python "glue", but that eventually became C++, so there are generic types being passed between blocks. The blocks each cast them to more specific types. It is a mess of casting with too many "we just know"s in it. That Career Overflow is sounding better every day. – Foss 25/8, 2014 at 23:52

It's hard to tell if it would suit your use case, but it's possible to homebrew RTTI by relying on the one definition rule. – Sphincter 26/8, 2014 at 0:39

First off, note that there's a big difference between dynamic_cast and RTTI: The cast tells you whether you can treat a base object as some further derived, but not necessarily most-derived object. RTTI tells you the precise most-derived type. Naturally the former is more powerful and more expensive.
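To make the distinction concrete, here is a minimal sketch with a hypothetical three-level hierarchy (Base, Derived and MoreDerived are illustration names, not taken from the question). It assumes the pointer passed in is non-null, since typeid on a dereferenced null pointer throws std::bad_typeid.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical hierarchy, for illustration only.
struct Base { virtual ~Base() {} };
struct Derived : Base {};
struct MoreDerived : Derived {};

// True if *p can be treated as a Derived: Derived itself or anything below it.
bool is_a_derived(Base* p)
{
    return dynamic_cast<Derived*>(p) != 0;
}

// True only if the most-derived type of *p is exactly Derived.
// Assumes p is non-null.
bool is_exactly_derived(Base* p)
{
    return typeid(*p) == typeid(Derived);
}

// Self-check showing where the two tests disagree.
void check_difference()
{
    Derived d;
    MoreDerived m;
    assert(is_a_derived(&d) && is_exactly_derived(&d));
    assert(is_a_derived(&m) && !is_exactly_derived(&m));
}
```

The two checks agree for a plain Derived and disagree for MoreDerived, which is why they are not interchangeable in a deeper hierarchy.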

So then, there are two natural ways you can select on types if you have a polymorphic hierarchy. They're different; use the one that actually applies.

void method1(Base * p)
{
    if (Derived * q = dynamic_cast<Derived *>(p))
    {
        // use q
    }
}

void method2(Base * p)
{
    if (typeid(*p) == typeid(Derived))
    {
        auto * q = static_cast<Derived *>(p);

        // use q
    }
}

Note also that method 2 is not generally available if the base class is a virtual base. Neither method applies if your classes are not polymorphic.

In a quick test I found method 2 to be significantly faster than your manual ID-based solution, which in turn is faster than the dynamic cast solution (method 1).
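A test of this kind could be sketched roughly as follows. This is not the exact code behind the numbers above; the helper names (count_with_dynamic_cast, count_with_typeid, seconds_for) are made up for illustration. The pointer is read through a volatile, as suggested in the comments on the question, and the hit count is returned so the optimizer cannot simply drop the loops.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <typeinfo>

// Hypothetical polymorphic pair, for illustration only.
struct Base { virtual ~Base() {} };
struct Derived : Base {};

// Method 1: count successful dynamic_casts over n iterations.
std::size_t count_with_dynamic_cast(Base* p, std::size_t n)
{
    Base* volatile vp = p;   // re-read on every iteration
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (dynamic_cast<Derived*>(vp)) ++hits;
    return hits;
}

// Method 2: count exact typeid matches over n iterations (p must be non-null).
std::size_t count_with_typeid(Base* p, std::size_t n)
{
    Base* volatile vp = p;
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i) {
        Base* q = vp;
        if (typeid(*q) == typeid(Derived)) ++hits;
    }
    return hits;
}

typedef std::size_t (*Counter)(Base*, std::size_t);

// Time one counter with clock(); the hit count is passed back through `hits`.
double seconds_for(Counter fn, Base* p, std::size_t n, std::size_t& hits)
{
    std::clock_t begin = std::clock();
    hits = fn(p, n);
    std::clock_t end = std::clock();
    return double(end - begin) / CLOCKS_PER_SEC;
}
```

Calling seconds_for with each counter and the same n gives comparable timings; the absolute values depend heavily on compiler, flags and platform, so they are worth measuring on the actual target.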

Amati answered 26/8, 2014 at 0:10 Comment(6)

I mostly said "Circumventing RTTI" because on the TI compiler and a few others, just having RTTI on slows everything down a lot. It would be nice to turn it off altogether. I think I might implement a solution that is easily swapped out so I can test the whole system's speed with the different solutions. – Foss 26/8, 2014 at 15:57

Oh, never mind. I copied the example here and changed the stuff with Base and Derived forgetting your comment about the need for virtual. – Foss 4/9, 2014 at 18:48

Everything seems to be working. With this and many many other changes, I have been able to cut the runtime almost in half! Thanks for your help. – Foss 4/9, 2014 at 19:20

@CoryB: Glad it was useful :-) – Amati 4/9, 2014 at 19:26

"The cast tells you whether you can treat a base object as some further derived": didn't you mean an (unknown) object pointed to by a base*? – Cibis 12/10, 2014 at 14:57

@Walter: What's the difference? An object pointed to by a base * is necessarily an object of type base. – Amati 12/10, 2014 at 17:28

How about comparing the classes' virtual function tables?

Quick and dirty proof of concept:

void* instance_vtbl(void* c)
{
    return *(void**)c;
}

template<typename C>
void* class_vtbl()
{
    static C c;
    return instance_vtbl(&c);
}

// ...

begin = clock();
{
    for (size_t i = 0; i < 100000000; i++)
    {
        if (instance_vtbl(b) == class_vtbl<Derived<int>>())
            Derived <int>* d = static_cast<Derived<int>*> (b);
    }
}
end = clock();
elapsed = double(end - begin) / CLOCKS_PER_SEC;

cout << "Type elapsed: " << elapsed << endl;

With Visual C++'s /Ox switch, this appears 3x faster than the type/getType trick.

Slate answered 25/8, 2014 at 23:26 Comment(2)

The reason why this is faster is that it doesn't use a virtual function call, I suppose. The tests still have no side effects and cannot be used to compare the different approaches. This approach is not a portable solution (strictly speaking, probably undefined behaviour due to aliasing). I guess it indirectly relies on RTTI, since otherwise the vtable of two classes could be identical (=> is it reliable?) – Amish 25/8, 2014 at 23:30

This has problems with any types that aren't default constructible, as well as being UB. (It also fails for types without vtables, but meh.) – Originative 26/8, 2014 at 0:30

Given this type of code

class A {
public:
    virtual ~A() {}   // dynamic_cast needs a polymorphic base
};

class B : public A {
};

A * a;
B * b = dynamic_cast<B*> (a);
if( b != 0 ) // do something B specific

The polymorphic (right?) way to fix it is something like this

class A {
public:
    virtual void specific() { /* do nothing */ }
};

class B : public A {
public:
    virtual void specific() { /* do something B specific */ }
};

A * a;
if( a != 0 ) a->specific();
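A complete version of the same idea might look like the sketch below. The `calls` counter and the process() and check_process() helpers are additions for illustration only, so that the effect of the virtual call can be observed.

```cpp
#include <cassert>

class A {
public:
    virtual ~A() {}
    virtual void specific() { /* do nothing */ }
};

class B : public A {
public:
    B() : calls(0) {}
    virtual void specific() { ++calls; }   // stands in for the B-specific work
    int calls;                             // illustration only: counts invocations
};

// The caller no longer needs to know which concrete type it holds.
void process(A* a)
{
    if (a != 0) a->specific();
}

// Self-check: only the B object records a call; a null pointer is ignored.
void check_process()
{
    A a;
    B b;
    process(&a);
    process(&b);
    process(0);
    assert(b.calls == 1);
}
```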

Beaker answered 25/8, 2014 at 23:34 Comment(0)

When MSVC 2005 first came out, dynamic_cast<> for 64-bit code was much slower than for 32-bit code. We wanted a quick and easy fix. This is what our code looks like. It probably violates all kinds of good design rules, but the conversion to remove dynamic_cast<> can be automated with a script.

class dbbResultEph;   // forward declaration, needed by the return types below

class dbbMsgEph {
public:
    virtual dbbResultEph *              CastResultEph() { return 0; }
    virtual const dbbResultEph *        CastResultEph() const { return 0; }
};

class dbbResultEph : public dbbMsgEph {
public:
    virtual dbbResultEph *              CastResultEph() { return this; }
    virtual const dbbResultEph *        CastResultEph() const { return this; }
    static dbbResultEph *               Cast( dbbMsgEph * );
    static const dbbResultEph *         Cast( const dbbMsgEph * );
};

dbbResultEph *
dbbResultEph::Cast( dbbMsgEph * arg )
{
    if( arg == 0 ) return 0;
    return arg->CastResultEph();
}

const dbbResultEph *
dbbResultEph::Cast( const dbbMsgEph * arg )
{
    if( arg == 0 ) return 0;
    return arg->CastResultEph();
}

Where we used to have

dbbMsgEph * pMsg;
dbbResultEph * pResult = dynamic_cast<dbbResultEph *> (pMsg);

we changed it to

dbbResultEph * pResult = dbbResultEph::Cast (pMsg);

using a simple sed(1) script. And virtual function calls are pretty efficient.
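Reduced to its essentials, the pattern behaves like this sketch. Msg, Result and Other are hypothetical stand-ins for the real classes, and only the non-const overload is shown. Note the trade-off: the base class has to declare one such virtual for every derived type that should be castable.

```cpp
#include <cassert>

class Result;   // forward declaration so Msg can mention Result*

class Msg {
public:
    virtual ~Msg() {}
    virtual Result* CastResult() { return 0; }     // default: not a Result
};

class Result : public Msg {
public:
    virtual Result* CastResult() { return this; }  // override: yes, it is
    static Result* Cast(Msg* arg) { return arg ? arg->CastResult() : 0; }
};

// A sibling class that does not override CastResult().
class Other : public Msg {};

// Self-check: Cast behaves like dynamic_cast<Result*> for these three cases.
void check_cast()
{
    Result r;
    Other o;
    assert(Result::Cast(&r) == &r);
    assert(Result::Cast(&o) == 0);
    assert(Result::Cast(0) == 0);
}
```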

Beaker answered 25/8, 2014 at 23:44 Comment(0)

In a release build (VS2008) this prints true:

cout << "Base Type: " << (b->getType() == Base::type()) << endl;

I guess it's because of the optimization, so I changed the implementation of Derived::type():

template <typename T>
struct Derived : public Base
{
    static intptr_t type() 
    { 
        cout << "different type()" << endl;
        return reinterpret_cast<intptr_t>(&type); 
    }
    virtual intptr_t getType() { return type(); }
};

Then the IDs are different. So how should this be handled when using this method?

Beverlybevers answered 25/8, 2014 at 23:55 Comment(1)

Are you perhaps seeing the effects of identical COMDAT folding? – Gaudery 26/8, 2014 at 0:10
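If folding of identical functions is indeed the cause, one possible workaround is to take the address of a per-class writable static object instead of the address of the function itself. The sketch below is a variation on the question's macro, not something tested on VS2008 or the TI compilers. It assumes the linker only folds identical functions and read-only data, which should be verified for the toolchain in use. Because `tag` is zero-initialized statically, no runtime initialization or locking is involved.

```cpp
#include <cassert>
#include <stdint.h>

// Variation on the question's macro: the ID is the address of a writable
// function-local static, so each type() body refers to a different object.
#define ADD_TYPE_ID \
    static intptr_t type() { static char tag; return reinterpret_cast<intptr_t>(&tag); } \
    virtual intptr_t getType() { return type(); }

struct Base
{
    ADD_TYPE_ID
    virtual ~Base() {}
};

template <typename T>
struct Derived : public Base
{
    ADD_TYPE_ID
};

// Self-check: same comparisons as in the question's main().
void check_ids()
{
    Derived<int> d;
    Base* b = &d;
    assert(b->getType() == Derived<int>::type());
    assert(b->getType() != Derived<float>::type());
    assert(b->getType() != Base::type());
}
```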
