C++ object size with virtual methods

I have some questions about object size when virtual functions and virtual inheritance are involved.

1) virtual function

class A {
public:
    int a;
    virtual void v();
};

The size of class A is 8 bytes: one int (4 bytes) plus one virtual pointer (4 bytes). That's clear.

class B: public A {
public:
    int b;
    virtual void w();
};

What's the size of class B? I tested with sizeof(B), and it prints 12.

Does that mean there is only one vptr, even though both class B and class A have virtual functions? Why is there only one vptr?

class A {
public:
    int a;
    virtual void v();
};

class B {
public:
    int b;
    virtual void w();
};

class C :  public A, public B {
public:
    int c;
    virtual void x();
};

sizeof(C) is 20.

It seems that in this case there are two vptrs in the layout. How does this happen? I think one vptr is for class A and the other is for class B, so is there no vptr for the virtual function of class C?

My question is: what is the rule for the number of vptrs under inheritance?

2) virtual inheritance

    class A {
    public:
        int a;
        virtual void v();
    };

    class B: virtual public A{                  //virtual inheritance 
    public:
        int b;
        virtual void w();
    };

    class C :  public A {                      //non-virtual inheritance
    public:
        int c;
        virtual void x();
    };

    class D: public B, public C {
    public:
        int d;
        virtual void y();
    };

sizeof(A) is 8 bytes: 4 (int a) + 4 (vptr) = 8.

sizeof(B) is 16 bytes. Without virtual inheritance it should be 4 + 4 + 4 = 12. Why are there another 4 bytes here? What is the layout of class B?

sizeof(C) is 12 bytes: 4 + 4 + 4 = 12. That's clear.

sizeof(D) is 32 bytes. It should be 16 (class B) + 12 (class C) + 4 (int d) = 32. Is that right?

    class A {
    public:
        int a;
        virtual void v();
    };

    class B: virtual public A{                       //virtual inheritance here
    public:
        int b;
        virtual void w();
    };

    class C :  virtual public A {                    //virtual inheritance here
    public:
        int c;
        virtual void x();
    };

    class D: public B, public C {
    public:
        int d;
        virtual void y();
    };

sizeof A is 8

sizeof B is 16

sizeof C is 16

sizeof D is 28. Does that mean 28 = 16 (class B) + 16 (class C) - 8 (class A) + 4 (what is this?)

My question is: why is there extra space when virtual inheritance is applied?

What is the underlying rule for the object size in this case?

What is the difference between applying virtual to all of the base classes and to only some of them?

Schwerin answered 10/1, 2010 at 21:33 Comment(2)
Any answer here is pure speculation. Each compiler can and does do it slightly differently. Since the standard does not specify anything about how it is implemented it is pointless wondering about it (unless you plan to write a compiler, in which case this is the wrong place to ask the question).Szechwan
Martin, writing a compiler is inherently a programming topic, and so this is not the wrong place to ask. Even people who aren't writing compilers can be curious about how things work, in particular if they were hoping to manage the size of an object and were under the mistaken impression that the size would be proportional to the number of virtual methods. Any answer here is free to make reference to one or more specific compilers, in which case it wouldn't be speculation at all.Frey
T
24

This is all implementation defined. I'm using VC10 Beta2. The key to understanding this stuff (the implementation of virtual functions) is a secret switch in the Visual Studio compiler, /d1reportSingleClassLayoutXXX. I'll get to that in a second.

The basic rule is that the vtable pointer needs to be located at offset 0 for any pointer to an object. This implies multiple vtable pointers for multiple inheritance.

A couple of questions here; I'll start at the top:

Does that mean there is only one vptr, even though both class B and class A have virtual functions? Why is there only one vptr?

This is how virtual functions work: you want the base class and derived class to share the same vtable pointer (pointing to the implementation in the derived class).
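To see the shared pointer for yourself, here is a minimal sketch. The helper names `base_shares_address` and `bytes_added_by_B` are mine, and the function bodies are empty placeholders so the classes can be instantiated. It assumes a mainstream compiler (MSVC, GCC, Clang); the standard guarantees neither result.

```cpp
#include <cstddef>

struct A {
    int a = 0;
    virtual void v() {}
};

struct B : A {
    int b = 0;
    virtual void w() {}
};

// True when the A subobject starts at the same address as the whole B object,
// which is what lets both views of the object use one vptr at offset 0.
bool base_shares_address() {
    B obj;
    A* as_base = &obj;  // plain upcast, no adjustment expected
    return static_cast<void*>(as_base) == static_cast<void*>(&obj);
}

// Number of bytes B adds on top of A. If B carried a second vptr this would
// be at least sizeof(int) + sizeof(void*).
std::size_t bytes_added_by_B() {
    return sizeof(B) - sizeof(A);
}
```

Calling `base_shares_address()` from `main` should give `true` on those compilers, and `bytes_added_by_B()` should come out smaller than an int plus a pointer, i.e. there is no room for a second vptr.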

It seems that in this case there are two vptrs in the layout. How does this happen? I think one vptr is for class A and the other is for class B, so is there no vptr for the virtual function of class C?

This is the layout of class C, as reported by /d1reportSingleClassLayoutC:

class C size(20):
        +---
        | +--- (base class A)
 0      | | {vfptr}
 4      | | a
        | +---
        | +--- (base class B)
 8      | | {vfptr}
12      | | b
        | +---
16      | c
        +---

You are correct, there are two vtable pointers, one for each base class. This is how it works in multiple inheritance: if the C* is cast to a B*, the pointer value gets adjusted by 8 bytes. A vtable pointer still needs to be at offset 0 for virtual function calls to work.

The vtable in the above layout for class A is treated as class C's vtable (when called through a C*).
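The pointer adjustment can be observed directly. This is a minimal sketch; the helper `offset_in_C` is a name I made up, and the function bodies are empty placeholders. The exact offsets depend on the compiler and on pointer size.

```cpp
#include <cstddef>

struct A { int a = 0; virtual void v() {} };
struct B { int b = 0; virtual void w() {} };
struct C : A, B { int c = 0; virtual void x() {} };

// Byte distance from the start of a C object to one of its base subobjects.
template <typename Base>
std::ptrdiff_t offset_in_C() {
    C obj;
    Base* as_base = &obj;  // implicit upcast; the compiler adjusts the address if needed
    return reinterpret_cast<char*>(as_base) - reinterpret_cast<char*>(&obj);
}
```

With the 32-bit layout dumped above I would expect `offset_in_C<A>()` to be 0 and `offset_in_C<B>()` to be 8; on a 64-bit build the second number is larger, but it is still nonzero.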

sizeof(B) is 16 bytes. Without virtual inheritance it should be 4 + 4 + 4 = 12. Why are there another 4 bytes here? What is the layout of class B?

This is the layout of class B in this example:

class B size(20):
        +---
 0      | {vfptr}
 4      | {vbptr}
 8      | b
        +---
        +--- (virtual base A)
12      | {vfptr}
16      | a
        +---

As you can see, there is an extra pointer, the {vbptr}, to handle virtual inheritance. Virtual inheritance is complicated.

sizeof(D) is 32 bytes. It should be 16 (class B) + 12 (class C) + 4 (int d) = 32. Is that right?

No, 36 bytes. The same virtual inheritance overhead applies. Layout of D in this example:

class D size(36):
        +---
        | +--- (base class B)
 0      | | {vfptr}
 4      | | {vbptr}
 8      | | b
        | +---
        | +--- (base class C)
        | | +--- (base class A)
12      | | | {vfptr}
16      | | | a
        | | +---
20      | | c
        | +---
24      | d
        +---
        +--- (virtual base A)
28      | {vfptr}
32      | a
        +---

My question is: why is there extra space when virtual inheritance is applied?

That is the virtual base class pointer, and it's complicated. Base classes are "combined" in virtual inheritance. Instead of having the base class embedded directly in a class, the class has a pointer to the base class object in its layout. If two base classes use virtual inheritance (the "diamond" class hierarchy), they both point to the same virtual base class in the object, instead of each having a separate copy of that base class.
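The "same virtual base" part is something the language itself guarantees, so it can be checked portably. A minimal sketch follows; the class names `VB`, `VC`, `VD`, `NB`, `NC`, `ND` and the two helper functions are made up for illustration.

```cpp
struct A { int a = 0; virtual void v() {} };

// Diamond with virtual inheritance: one shared A subobject.
struct VB : virtual A { int b = 0; virtual void w() {} };
struct VC : virtual A { int c = 0; virtual void x() {} };
struct VD : VB, VC { int d = 0; virtual void y() {} };

// Same shape without virtual inheritance: two separate A subobjects.
struct NB : A { int b = 0; virtual void w() {} };
struct NC : A { int c = 0; virtual void x() {} };
struct ND : NB, NC { int d = 0; virtual void y() {} };

// Reach A through each intermediate base and compare the addresses.
bool virtual_diamond_shares_A() {
    VD obj;
    A* via_b = static_cast<VB*>(&obj);
    A* via_c = static_cast<VC*>(&obj);
    return via_b == via_c;
}

bool plain_diamond_shares_A() {
    ND obj;
    A* via_b = static_cast<NB*>(&obj);
    A* via_c = static_cast<NC*>(&obj);
    return via_b == via_c;
}
```

`virtual_diamond_shares_A()` returns `true` and `plain_diamond_shares_A()` returns `false`. The extra pointer (or offset) in the layout is the price of finding that single shared A from either side.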

What is the underlying rule for the object size in this case?

Important point: there are no rules. The compiler can do whatever it needs to do.

And a final detail: to make all these class layout diagrams I am compiling with:

cl test.cpp /d1reportSingleClassLayoutXXX

Where XXX is a substring match of the structs/classes you want to see the layout of. Using this you can explore the effects of various inheritance schemes yourself, as well as why/where padding is added, etc.

Tasse answered 10/1, 2010 at 21:54 Comment(3)
I would like to point out that while this is a really excellent answer, these details are specific to a particular implementation. Implementations are allowed to differ wildly from this as long as they make the feature work. There are no rules or guarantees about how the memory of a class is laid out. This answer does represent a set of implementation techniques that are very similar on several different compilers and on several different platforms, but there can be and are major differences for some compilers on some platforms.Rodina
While writing objects to a file, I directly write the entire object to file using write(). So virtual adds 4 more bytes (or 8 more bytes depending on the architecture) while writing. Is there any way we can avoid this?Sumption
Oh my god, this comment totally changed my understanding of "virtual". So virtual functions and virtual inheritance are totally different in memory layout!Jest
B
3

A good way to think about it is to understand what has to be done to handle up-casts. I'll try to answer your questions by showing the memory layout of objects of the classes you describe.

Code sample #2

The memory layout is as follows:

vptr | A::a | B::b

Upcasting a pointer to B to type A will result in the same address, with the same vptr being used. This is why there's no need for additional vptr's here.

Code sample #3

vptr | A::a | vptr | B::b | C::c

As you can see, there are two vptr's here, just like you guessed. Why? Because it's true that if we upcast from C to A we don't need to modify the address, and thus can use the same vptr. But if we upcast from C to B we do need that modification, and correspondingly we need a vptr at the start of the resulting object.

So, any inherited class beyond the first will require an additional vptr (unless that inherited class has no virtual methods, in which case it has no vptr).
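A quick way to check the "unless it has no virtual methods" part is to compare two derived classes that differ only in whether the second base is polymorphic. This is a minimal sketch; `Plain`, `TwoPolymorphicBases`, `OnePolymorphicBase` and the helper function are names I made up, and the exact byte counts depend on the compiler.

```cpp
#include <cstddef>

struct A     { int a = 0; virtual void v() {} };
struct B     { int b = 0; virtual void w() {} };  // has a virtual method
struct Plain { int b = 0; };                      // no virtual methods

struct TwoPolymorphicBases : A, B     { int c = 0; };
struct OnePolymorphicBase  : A, Plain { int c = 0; };

// Bytes saved when the second base brings no vptr of its own.
std::size_t bytes_saved_without_second_vptr() {
    return sizeof(TwoPolymorphicBases) - sizeof(OnePolymorphicBase);
}
```

On the usual compilers the result is at least one pointer's worth of bytes, because `Plain` contributes only its int.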

Code sample #4 and beyond

When you derive virtually, you need a new pointer, called a base pointer, to point to the location of the virtual base class within the memory layout of the derived class. There can be more than one base pointer, of course.

So how does the memory layout look? That depends on the compiler. In your compiler it's probably something like

vptr | base pointer | B::b | vptr | A::a | C::c | vptr | A::a
          \-----------------------------------------^

But other compilers may incorporate base pointers in the virtual table (by using offsets - that deserves another question).

You need a base pointer because when you derive in a virtual fashion, the base class will appear only once in the memory layout (it may appear additional times if it's also derived from normally, as in your example), so all the classes that derive from it virtually must point to the exact same location.
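Here is a minimal sketch of why that pointer (or offset) has to be looked up at run time: the distance from a B to its virtual base A is not the same in a standalone B as in a B that lives inside a D. The helper names are mine, and the concrete distances are whatever your compiler picks.

```cpp
#include <cstddef>

struct A { int a = 0; virtual void v() {} };
struct B : virtual A { int b = 0; virtual void w() {} };
struct C : virtual A { int c = 0; virtual void x() {} };
struct D : B, C { int d = 0; virtual void y() {} };

// Distance in bytes from a B subobject to its virtual base A.
// The conversion B* -> A* goes through the base pointer / vtable offset.
std::ptrdiff_t distance_to_A(B& b) {
    A* base = &b;
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(&b);
}

std::ptrdiff_t distance_in_complete_B() {
    B obj;
    return distance_to_A(obj);
}

std::ptrdiff_t distance_in_D() {
    D obj;
    return distance_to_A(obj);  // same function, but the B is part of a D
}
```

On the compilers I know of, the two distances differ, because in D the shared A is placed after both B's and C's own members. Code that only sees a `B&` therefore cannot hard-code the offset.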

EDIT: clarification - it all really depends on the compiler, the memory layout I showed can be different in different compilers.

Bareback answered 10/1, 2010 at 22:0 Comment(4)
This is pure speculation. Each compiler can and does do it slightly differently.Szechwan
thanks, I don't understand your comments for Code sample #4. Could you explain a little bit why it is like this? vptr | base pointer | B::b | vptr | A::a | C::c | vptr | A::a \-----------------------------------------^Schwerin
@Martin: you're right, of course, but I think this structure is common across the popular compilers. I've edited to answer to clarify, though.Bareback
@skydoor: I tried to show an arrow from the "base pointer" to the 3rd vptr, I guess I failed to show what I meant. The point is that classes that inherit in a virtual way must have some sort of pointer (but not necessarily an actual pointer - see my note on offsets) to point to the virtual base class. Otherwise, imagine what would happen when casting B to A in the case the object's actual type is C, as opposed to the case the actual type is B. Compile-wise the cast is the same but the "distance" of A is different in both cases.Bareback
S
3

Quote> My question is, what's the rule about the number of vptrs in inheritance?

There are no rules; every compiler vendor is allowed to implement the semantics of inheritance the way they see fit.

class B: public A {}, size = 12. That's pretty normal, one vtable for B that has both virtual methods, vtable pointer + 2*int = 12

class C : public A, public B {}, size = 20. C can arbitrarily extend the vtable of either A or B. 2*vtable pointer + 3*int = 20

Virtual inheritance: that's where you really hit the edges of undocumented behavior. For example, in MSVC the #pragma vtordisp and /vd compile options become relevant. There's some background info in this article. I studied this a few times and decided the compile option acronym was representative for what could happen to my code if I ever used it.

Scarper answered 10/1, 2010 at 22:12 Comment(0)
R
2

All of this is completely implementation defined, you realize. You can't count on any of it. There is no 'rule'.

In the inheritance example, here is how the virtual table for classes A and B might look:

      class A
+-----------------+
| pointer to A::v |
+-----------------+

      class B
+-----------------+
| pointer to A::v |
+-----------------+
| pointer to B::w |
+-----------------+

As you can see, if you have a pointer to class B's virtual table, it is also perfectly valid as class A's virtual table.
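The two tables can be mimicked by hand with arrays of function pointers. This is only a sketch of the idea, not what any compiler literally emits; `Slot`, `vtable_A`, `vtable_B` and `call_v` are made-up names.

```cpp
#include <string>

using Slot = std::string (*)();

std::string A_v() { return "A::v"; }
std::string B_w() { return "B::w"; }

// Hand-written stand-ins for the two tables in the diagrams above.
const Slot vtable_A[] = { &A_v };
const Slot vtable_B[] = { &A_v, &B_w };  // starts with the same slot as A's table

// Code that only knows about class A always calls through slot 0,
// so it works with either table.
std::string call_v(const Slot* vtable) {
    return vtable[0]();
}
```

`call_v(vtable_A)` and `call_v(vtable_B)` both return "A::v", while `vtable_B[1]()` returns "B::w". Because B's table begins with A's slots, one pointer can serve both views of the object.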

In your class C example, if you think about it, there is no way to make a single virtual table that is valid as a table for class C, class A, and class B all at once. So the compiler makes two. One virtual table is valid for class A and C (most likely) and the other is valid for class B.

Rodina answered 10/1, 2010 at 21:52 Comment(1)
I wish someone could reference a script from the holy cpp standard. I am trying to access variables from a base class that will be defined in its child classes. Furthermore, if I am referencing cpp code from Assembly I need to know this.Simonton
M
1

This obviously depends on the compiler implementation. Anyway, I think I can sum up the following rules from the implementation described in the classic paper linked below, which give the number of bytes you get in your examples (except for class D, which would be 36 bytes and not 32!):

The size of an object of class T is:

  • The size of its fields PLUS the sum of the sizes of every object from which T inherits PLUS 4 bytes for every object from which T virtually inherits PLUS 4 bytes ONLY IF T needs ANOTHER v-table
  • Pay attention: if a class K is virtually inherited multiple times (at any level), you have to add the size of K only once

So we have to answer another question: When does a class need ANOTHER v-table?

  • A class that does not inherit from other classes needs a v-table only if it has one or more virtual methods
  • OTHERWISE, a class needs another v-table ONLY IF NONE of the classes from which it non-virtually inherits has a v-table

The End of the rules (which I think can be applied to match what Terry Mahaffey has explained in his answer) :)
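As a worked example, here are the rules above applied by hand to the classes from the question. It is plain arithmetic written as constants (the names are mine), assuming a 32-bit target with 4-byte int and 4-byte pointers. It says nothing about what your own compiler will do.

```cpp
// Assumed 32-bit target, as in the question: 4-byte int and 4-byte pointers.
constexpr int kInt = 4;
constexpr int kPtr = 4;

// class A { int a; virtual void v(); }        fields + own v-table pointer
constexpr int size_A = kInt + kPtr;
// class B : public A                          fields + A, no new v-table needed
constexpr int size_B_plain = kInt + size_A;
// class B with no base (multiple inheritance sample)
constexpr int size_B_standalone = kInt + kPtr;
// class C : public A, public B                fields + both bases
constexpr int size_C_multi = kInt + size_A + size_B_standalone;
// class B : virtual public A                  fields + A + 4 (virtual) + 4 (own v-table)
constexpr int size_B_virtual = kInt + size_A + kPtr + kPtr;
// class C : public A
constexpr int size_C_plain = kInt + size_A;
// class D : public B, public C                with the virtual B and the plain C
constexpr int size_D_mixed = kInt + size_B_virtual + size_C_plain;

static_assert(size_D_mixed == 36, "rules give 36 for D, not 32");
```

The results are 8, 12, 20, 20 and 36, which agree with the layouts dumped in Terry Mahaffey's answer for C (20), the virtual B (20) and D (36).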

Anyway my suggestion is to read the following paper by Bjarne Stroustrup (the creator of C++) which explains exactly these things: how many virtual tables are needed with virtual or non virtual inheritance... and why!

It's a really good read: http://www.hpc.unimelb.edu.au/nec/g1af05e/chap5.html

Maracanda answered 10/1, 2010 at 22:32 Comment(0)
I
0

I am not sure, but I think it is because of the pointer to the virtual method table.

Iz answered 10/1, 2010 at 21:47 Comment(0)
