How to do the equivalent of memset(this, ...) without clobbering the vtbl?
S

7

7

I know that memset is frowned upon for class initialization. For example, something like the following:

class X { public: 
X() { memset( this, 0, sizeof(*this) ) ; }
...
} ;

will clobber the vtbl if there's a virtual function in the mix.

I'm working on a (humongous) legacy codebase that is C-ish but compiled in C++, so all the members in question are typically POD and require no traditional C++ constructors. C++ usage gradually creeps in (like virtual functions), and this bites the developers that don't realize that memset has these additional C++ teeth.

I'm wondering if there is a C++ safe way to do an initial catch-all zero initialization, that could be followed by specific by-member initialization where zero initialization isn't appropriate?

I found the similar questions "memset for initialization in C++" and "zeroing derived struct using memset". Both have "don't use memset()" answers, but offer no good alternatives (especially for large structures that may contain very many members).

Stagnate answered 15/10, 2012 at 19:28 Comment(9)
Generally you take the address of the first field and clear from there. No doubt this violates some rule, but it's always worked for me. - Traceetracer
The other option is to write your own alloc routine for the class, so you can ensure the space is cleared, but IIRC there are restrictions around that. - Traceetracer
Don't you mean memset(this, 0, sizeof *this)? (And yes, that would also clobber the vtbl.) - Adulterant
@HotLicks, it seems like the problem with that approach is knowing the correct size. sizeof(*this) likely won't be the same as sizeof(<the data parts>). - Overactive
You can take the address of the field and subtract this from that, then subtract the remainder from the object size. - Traceetracer
@Hot Licks What you're suggesting works fine if you're only on a single compiler and carefully use that compiler's memory layout rules. For example, if your project is only ever compiled with MSVC, this can work. But different compilers have different rules for vtables, etc. (they could theoretically even use something other than a vtable). So it could easily turn into a mess of scary #ifdefs if you need to use this sort of trick on multiple compilers. - Adigranth
What's wrong with writing a constructor that explicitly initializes the members? - Fibroma
This isn't just about the vtable; you could also have non-standard-layout members, which memset would mess up as well. Make a choice: write in C (and gut the C++ portions) or write in C++ (and use C++ idioms). Mixing the two is dangerous, and it's probably easier (and better) to take the time to fix these classes one at a time, giving them proper constructors without the C antics. - Consistence
@Adigranth -- Like I said, it no doubt violates several rules, but it's always worked for me. - Traceetracer
O
4

For each class where you find a memset call, add a memset member function which ignores the pointer and size arguments and does assignments to all the data members.

edit: Actually, it shouldn't ignore the pointer, it should compare it to this. On a match, do the right thing for the object, on a mismatch, reroute to the global function.
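A minimal sketch of that idea, under some assumptions: the class name Widget and its members are hypothetical, the member function treats any call on this as "zero everything", and memset is not defined as a macro on your platform (some fortified C libraries do define it that way, which would break this trick).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical legacy class that has since gained virtual functions.
class Widget {
public:
    Widget() {
        // Unqualified lookup finds Widget::memset before the global one,
        // so this legacy-style call no longer touches the vptr.
        memset(this, 0, sizeof(*this));
    }
    virtual ~Widget() {}
    virtual int id() const { return 42; }

    int    count;
    double ratio;
    char   name[8];

private:
    // Shadows the library memset inside the class scope.
    void* memset(void* p, int value, std::size_t n) {
        if (p != this) {
            // Not our own object: reroute to the real function.
            return std::memset(p, value, n);
        }
        // Our own object: assign each data member instead of writing raw bytes.
        count = 0;
        ratio = 0.0;
        std::memset(name, 0, sizeof(name));
        return p;
    }
};
```

The cost is that the member list inside the shadowing function has to be kept in sync by hand, but it lives in one place rather than in every constructor.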

Overactive answered 15/10, 2012 at 19:39 Comment(1)
Yeah, there's some merit to that. Easier to remember to keep it updated than to manage lists in 5 different constructors. - Traceetracer
L
2

Leverage the fact that a static instance is initialised to zero: https://ideone.com/GEFKG0

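#include <cstdio>

template <class T>
struct clearable
{
    void clear()
    {
        // Static storage is zero-initialized before T's constructor runs on it.
        static T _clear;
        *static_cast<T*>(this) = _clear;
    }
};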

class test : public clearable<test>
{
    public:
        int a;
};

int main()
{
    test _test;
    _test.a=3;
    _test.clear();

    printf("%d", _test.a);

    return 0;
}

However the above will cause the constructor (of the templatised class) to be called a second time.

For a solution that causes no ctor call this can be used instead: https://ideone.com/qTO6ka

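#include <cstdlib>

template <class T>
struct clearable
{
    void *cleared;  // zero-filled buffer the size of T (never freed here)
    clearable() : cleared(std::calloc(1, sizeof(T))) {}

    void clear()
    {
        // The assignment also copies the zeroed base, so save and restore
        // the buffer pointer; otherwise a second clear() dereferences null.
        void *keep = cleared;
        *static_cast<T*>(this) = *static_cast<T*>(cleared);
        cleared = keep;
    }
};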

...and if you're using C++11 onwards the following can be used: https://ideone.com/S1ae8G

template <class T>
struct clearable
{
    void clear()
    {
        *((T*)this) = {};
    };
};
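As a rough illustration of why assignment-based clearing sidesteps the vtbl problem, here is a sketch using a hypothetical polymorphic class Shape (the class and its members are assumptions, not part of the linked examples). It assigns from a value-initialized temporary, T(), which zeroes the data members of a class with no user-provided constructor; the assignment copies members only and leaves the vptr alone.

```cpp
#include <cassert>

template <class T>
struct clearable {
    // Assign from a value-initialized temporary instead of writing raw bytes.
    void clear() { *static_cast<T*>(this) = T(); }
};

// Hypothetical polymorphic class with plain data members and
// no user-provided constructor.
struct Shape : clearable<Shape> {
    virtual ~Shape() {}
    virtual int sides() const { return 4; }
    int width;
    int height;
};
```

If Shape had a user-provided default constructor, clear() would reset the object to whatever that constructor produces rather than to all zeros.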
Locomobile answered 29/6, 2016 at 14:57 Comment(0)
P
1

You could always add constructors to these embedded structures, so they clear themselves so to speak.

Phil answered 15/10, 2012 at 19:34 Comment(1)
The question is really about what to put in the constructor (that isn't, say, 100 assignment statements). - Stagnate
P
1

This is hideous, but you could overload operator new/delete for these objects (or in a common base class), and have the implementation provide zeroed-out buffers. Something like this:

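#include <cstdlib>
#include <new>

class HideousBaseClass
{
public:
    void* operator new( std::size_t nSize )
    {
        // calloc returns zero-filled memory; on failure, throw as operator new must.
        void* p = std::calloc( 1, nSize );
        if( !p )
            throw std::bad_alloc();
        return p;
    }
    void operator delete( void* p )
    {
        std::free( p );  // free(NULL) is a no-op, so no null check is needed
    }
};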

One could also override the global new/delete operators, but this could have negative perf implications.

Edit: I just realized that this approach won't work for stack allocated objects.

Pileus answered 15/10, 2012 at 19:38 Comment(0)
B
1

Try this:

template <class T>
void reset(T& t)
{
   t = T();
}

This value-initializes your object: a POD is zeroed, and a non-POD is reset to its default-constructed state.

But do not do this:

   A::A() { reset(*this); }

This will recurse infinitely, because T() inside reset calls A::A again.

Try this:

  struct AData { ... all A members };
  class  A { 
   public: 
      A() { reset(data); } 
   private: 
      AData data; 
   };
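A fuller sketch of the same pattern, with hypothetical names (LegacyFields, Node) standing in for the real members: the plain data lives in its own struct, the owning class value-initializes it in the member initializer list, and any members that should not be zero are overridden afterwards.

```cpp
#include <cassert>

// Hypothetical plain-data struct holding the legacy C-style members.
struct LegacyFields {
    int    count;
    double ratio;
    char   tag[16];
    void*  next;
};

class Node {
public:
    // fields() value-initializes the struct, which zeroes every member;
    // the constructor body then sets the members that must not be zero.
    Node() : fields() { fields.count = 1; }
    virtual ~Node() {}
    virtual const char* kind() const { return "node"; }

    LegacyFields fields;
};
```

Because only the data struct is zeroed, adding virtual functions to Node later cannot be affected by the initialization.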
Burglary answered 15/10, 2012 at 19:42 Comment(4)
Is this what boost's value_initialized (boost.org/doc/libs/1_38_0/libs/utility/value_init.htm) does? - Stagnate
@PeeterJoot More or less, yes, though it does it in a more complicated way. The Boost library says that, because of some compiler issues and because the approach in my answer is not always the most efficient for non-POD types, it uses a static constant variable for this purpose. However, if you are looking for a memset replacement, my "simplified" way should be enough. - Burglary
t = T(); won't zero the object; it will replace it with a default-constructed one. If the default constructor doesn't initialize a member, that member will still be left uninitialized. If it initializes something to non-zero, it will also be left at that value. Resetting to the default instead of all zeros may actually be a good thing, but the two are not the same, as you suggest they are. - Priory
@Priory - it will zero the object in the C++ sense: for POD types, all numbers become zero, bools false, and pointers null. For non-POD types (the case you described) the default-constructed object is used, so if someone writes the default constructor incorrectly, it will fail. But look at the example I provided: AData is defined as a simple aggregate, so if its members are POD, or non-POD with a valid default constructor, it will always work. - Burglary
D
0

You can use pointer arithmetic to find the range of bytes you want to zero out:

class Thing {
public:
    Thing() {
        memset(&data1, 0, (char*)&lastdata - (char*)&data1 + sizeof(lastdata));
    }
private:
    int data1;
    int data2;
    int data3;
    // ...
    int lastdata;
};

(Edit: I originally used offsetof() for this, but a comment pointed out that this is only supposed to work on PODs, and then I realised that you can just use the member addresses directly.)

Drug answered 15/10, 2012 at 20:5 Comment(0)
K
0

The best solution I could find is to create a separate struct holding the members that must be zeroed with memset. I'm not sure whether this design suits you.

This struct has no vtable and extends nothing. It is just a chunk of data, so memsetting it is safe.

I have made an example:

#include <iostream>
#include <cstring>

struct X_c_stuff {
    X_c_stuff() {
        memset(this,0,sizeof(*this));  // sizeof(*this): the struct, not the pointer
    }
    int cMember;
};
class X : private X_c_stuff{
public:
    X() 
    : normalMember(3)
    {
        std::cout << cMember << normalMember << std::endl;
    }
private:
    int normalMember;
};

int main() {
    X a;
    return 0;
}
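The condition this approach relies on (a struct with no vtable, just data) can also be checked at compile time. The helper below is a hypothetical sketch, not part of the example above: zero_pod and CStuff are made-up names, and it assumes a C++11 compiler that provides std::is_trivially_copyable.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper: zero an object with memset, but refuse to compile
// for types where raw byte writes are unsafe. A class with virtual
// functions is never trivially copyable, so it is rejected here.
template <class T>
void zero_pod(T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "zero_pod: type is not safe to memset");
    std::memset(&obj, 0, sizeof(T));
}

// Hypothetical plain-data struct, similar in spirit to X_c_stuff.
struct CStuff {
    int    cMember;
    double scale;
};
```

With a guard like this, a struct that later gains a virtual function stops compiling at the zero_pod call instead of silently losing its vptr.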
Kahlil answered 15/10, 2012 at 20:57 Comment(0)
