Why does the virtual keyword increase the size of a derived class?

I have two classes - one base class and one derived from it :

#include <iostream>

class base {
    int i;
public:
    virtual ~base() { }
};

class derived : virtual public base { int j; };

int main() { std::cout << sizeof(derived); }

Here the answer is 16. But if I instead use non-virtual public inheritance, or make the base class non-polymorphic, I get 12, i.e. if I do:

#include <iostream>

class base {
    int i;
public:
    virtual ~base() { }
};

class derived : public base { int j; };

int main() { std::cout << sizeof(derived); }

OR

#include <iostream>

class base {
    int i;
public:
    ~base() { }
};

class derived : virtual public base { int j; };

int main() { std::cout << sizeof(derived); }

In both cases the answer is 12.

Can someone please explain why the size of the derived class differs between the first case and the other two?

(I work on Code::Blocks 10.05, in case anyone needs this.)

Saffier answered 5/6, 2012 at 19:24 Comment(13)
Please format your code.Neva
Put four spaces before each line of code to form a "code block". Add four (or two) more spaces for each indent so that your code is properly formatted. This will make your code much easier to read.Porpoise
You can also select the code with the mouse and then click the "{ }" icon in the formatting bar. This will form a code block for you.Porpoise
sorry for the flaws. I think I've corrected them now.Saffier
"Can someone please explain whats going on ?" Uhm, the size of one is 16 and the size of the other is 12. Can you be more specific in your question?Peerage
To be specific, I'm asking why the size of the derived class differs in these three cases.Saffier
The size is increased by the necessity of the virtual function table (pointer) that must be included in order to get runtime polymorphism.Plummer
@Chad: No, base already has a vptr, and the questioner clearly understands that, and wants to know why derived has another hidden pointer, on top of that, and only if it inherits virtually.Myelencephalon
@cirronimbo: Do you understand what virtual inheritance is for, and how it works? If not, go learn that first. If so, if you tell us which compiler you use, maybe we can explain exactly how it implements virtual inheritance.Myelencephalon
@abarnert: Exactly, this is what I'm asking. Only you seem to understand this here. I'm using the GNU GCC compiler on Code::Blocks 10.05.Saffier
Also, if I remove the data members i and j from the base and derived classes resp. i.e. let the sizes be determined solely by the VPTRs, then this "hidden pointer" no longer seems to exist and everything works quite as expected and output is 4 in all 3 cases.Saffier
@cirronimbo: If you read my answer and Timo's (better) answer below—or, even better, google for "Inside the C++ Object Model" by Stanley Lippman—you'll understand why most platforms don't need the extra hidden pointer if there are no data members.Myelencephalon
@cirronimbo: PS, just "GNU GCC" doesn't help. We need to know the gcc version, and the target platform—both CPU and OS (and, for Windows, whether it's native/MinGW or Cygwin).Myelencephalon
M
2

The point of virtual inheritance is to allow sharing of base classes. Here's the problem:

struct base { int member; virtual void method() {} };
struct derived0 : base { int d0; };
struct derived1 : base { int d1; };
struct join : derived0, derived1 {};
join j;
j.method();
j.member;
(base *)&j;
dynamic_cast<base *>(&j);

The last 4 lines are all ambiguous. You have to say explicitly whether you want the base inside the derived0, or the base inside the derived1.

If you change the second and third line as follows, the problem goes away:

struct derived0 : virtual base { int d0; };
struct derived1 : virtual base { int d1; };

Your j object now only has one copy of base, not two, so the last 4 lines stop being ambiguous.

But think about how that has to be implemented. Normally, in a derived0, the d0 comes right after the base's member, and in a derived1, the d1 comes right after the base's member. But with virtual inheritance, they both share the same base (and so the same member), so you can't have both d0 and d1 come right after it. So you're going to need some form of extra indirection. That's where the extra pointer comes from.
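
To make the sharing concrete, here is a minimal sketch that contrasts the two diamonds. The names dup0, dup1, join_dup, shared0, shared1, join_shared and the helper paths_reach_same_base are made up for this illustration; they are not from the question. It only checks whether walking up through the left parent and through the right parent ends at the same base subobject.

```cpp
#include <cassert>

struct base { int member = 0; virtual void method() {} virtual ~base() {} };

// Non-virtual diamond: join_dup contains two separate base subobjects.
struct dup0 : base { int d0 = 0; };
struct dup1 : base { int d1 = 0; };
struct join_dup : dup0, dup1 {};

// Virtual diamond: join_shared contains exactly one base subobject.
struct shared0 : virtual base { int d0 = 0; };
struct shared1 : virtual base { int d1 = 0; };
struct join_shared : shared0, shared1 {};

// True when both inheritance paths lead to the same base subobject.
template <class Join, class Left, class Right>
bool paths_reach_same_base(Join& j) {
    base* via_left  = static_cast<Left*>(&j);   // Join -> Left -> base
    base* via_right = static_cast<Right*>(&j);  // Join -> Right -> base
    return via_left == via_right;
}

// Exercises both hierarchies.
void check_diamonds() {
    join_dup a;
    join_shared b;
    assert(!(paths_reach_same_base<join_dup, dup0, dup1>(a)));
    assert((paths_reach_same_base<join_shared, shared0, shared1>(b)));
    base* only = &b;  // unambiguous here: there is a single base
    only->method();
}
```

With join_dup the conversion `base* p = &a;` would not even compile, which is the ambiguity described above. With join_shared it compiles, but the compiler has to find the shared base at run time, and that lookup is what the extra hidden pointer pays for.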

If you want to know exactly what the layout is, it depends on your target platform and compiler. Just "gcc" isn't enough. But for many modern non-Windows targets, the answer is defined by the Itanium C++ ABI, which is documented at http://mentorembedded.github.com/cxx-abi/abi.html#vtable.

Myelencephalon answered 5/6, 2012 at 21:29 Comment(1)
thanks abarnert. I think I need not tell ya my system specifications now :)Saffier
M
3

There are two separate things here that cause extra overhead.

Firstly, having virtual functions in the base class increases its size by a pointer size (4 bytes in this case), because it needs to store the pointer to the virtual method table:

normal inheritance with virtual functions:

0        4       8       12
|      base      |
| vfptr  |  i    |   j   |

Secondly, in virtual inheritance extra information is needed in derived to be able to locate base. In normal inheritance the offset between derived and base is a compile time constant (0 for single inheritance). In virtual inheritance the offset can depend on the runtime type and actual type hierarchy of the object. Implementations may vary, but for example Visual C++ does it something like this:

virtual inheritance with virtual functions:

0        4         8        12        16
                   |      base        |
|  xxx   |   j     |  vfptr |    i    |

Here xxx is a pointer to some type information record that makes it possible to determine the offset to base.

And of course it's possible to have virtual inheritance without virtual functions:

virtual inheritance without virtual functions:

0        4         8        12
                   |  base  |
|  xxx   |   j     |   i    |
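
The claim that the offset to a virtual base is not a compile-time constant can be observed directly. The following is a rough sketch, assuming a typical implementation (GCC, Clang or Visual C++); the hierarchy derived0/derived1/join and the helpers offset_to_base, offsets_differ and check_offsets are invented for the illustration, and the exact numbers are implementation-specific.

```cpp
#include <cassert>
#include <cstddef>

struct base { int i = 0; virtual ~base() {} };
struct derived0 : virtual base { int d0 = 0; };
struct derived1 : virtual base { int d1 = 0; };
struct join : derived0, derived1 { int k = 0; };

// Byte distance from a derived0 subobject to its virtual base.
// The derived0* -> base* conversion is resolved at run time,
// because the answer depends on what obj is really part of.
std::ptrdiff_t offset_to_base(derived0& obj) {
    base* b = &obj;
    return reinterpret_cast<char*>(b) - reinterpret_cast<char*>(&obj);
}

// Compares a stand-alone derived0 with the derived0 part of a join.
bool offsets_differ() {
    derived0 alone;
    join combined;
    return offset_to_base(alone) != offset_to_base(combined);
}

// Expected to hold on common ABIs; this is an assumption, not a
// guarantee made by the C++ standard.
void check_offsets() { assert(offsets_differ()); }
```

The same function, compiled once, returns different offsets for the two objects, so the information has to live somewhere reachable from the object itself. That is the role of xxx in the diagrams above.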
Madonia answered 5/6, 2012 at 21:17 Comment(4)
Since he specifically asked about gcc, it might be better to find out the target platform and draw the details about that platform instead of for MSVC, but otherwise this is a great answer—very simple and clear.Myelencephalon
In that case, if (in virtual inheritance with virtual functions) I remove the data members i and j from the base and derived classes respectively, then the size of the derived class should be 8 (xxx + vfptr), but it's coming out to be only 4?Saffier
Well, the details will depend on your compiler and platform. But notice that if you've removed the data members, there's no need to distinguish between the derived0 part and the derived1 part—in both cases, the "data" (of which there is none) comes right after the base vptr, so you don't need both pointers.Myelencephalon
@abaenert: I posted it before reading your post. Got the idea now .Saffier
C
3

If a class has any virtual function, objects of this class need a vptr: a pointer to the vtable, the table from which the address of the correct virtual function is looked up. Which function gets called depends on the dynamic type of the object, that is, the most derived class that the object is a base subobject of.

Because the derived class inherits virtually from a base class, the location of the base class relative to the derived class is not fixed; it depends on the dynamic type of the object too. With gcc, a class with virtual base classes needs a vptr to locate the base classes (even if there is no virtual function).

Also, the base class contains a data member, which is located just after the base class vptr. Base class memory layout is: { vptr, int }

If a base class needs vptr, a class derived from it will need a vptr too, but often the "first" vptr of a base class subobject is reused (this base class with the reused vptr is called the primary base). However this is not possible in this case, because the derived class needs a vptr not only to determine how to call the virtual function, but also where the virtual base is. The derived class cannot locate its virtual base class without using the vptr; if the virtual base class was used as a primary base, the derived class would need to locate its primary base to read the vptr, and would need to read the vptr to locate its primary base.

So the derived cannot have a primary base, and it introduces its own vptr.

The layout of a base class subobject of type derived is thus: { vptr, int } with the vptr pointing to a vtable for derived, containing not only the address of virtual functions, but also the relative location of all its virtual base classes (here just base), represented as an offset.

The layout of a complete object of type derived is: { base class subobject of type derived, base }

So the minimum possible size of derived is (2 int + 2 vptr) or 4 words on common ptr = int = word architectures, or 16 bytes in this case. (And Visual C++ makes bigger objects (when virtual base classes are involved), I believe a derived would have one more pointer.)
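
The arithmetic can be checked with a small sketch that puts the three variants from the question side by side. The namespaces case1, case2 and case3, the three constants and print_sizes are just labels invented here so that all variants fit in one file; the actual numbers depend on the compiler, pointer size and padding.

```cpp
#include <cstddef>
#include <iostream>

// Case 1: polymorphic base, virtual inheritance.
namespace case1 {
struct base { int i; virtual ~base() {} };
struct derived : virtual base { int j; };
}

// Case 2: polymorphic base, ordinary inheritance.
namespace case2 {
struct base { int i; virtual ~base() {} };
struct derived : base { int j; };
}

// Case 3: non-polymorphic base, virtual inheritance.
namespace case3 {
struct base { int i; ~base() {} };
struct derived : virtual base { int j; };
}

constexpr std::size_t size_case1 = sizeof(case1::derived);
constexpr std::size_t size_case2 = sizeof(case2::derived);
constexpr std::size_t size_case3 = sizeof(case3::derived);

// Prints the three sizes next to the pointer and int sizes.
void print_sizes() {
    std::cout << "pointer: " << sizeof(void*) << ", int: " << sizeof(int) << '\n'
              << "case 1 (virtual dtor + virtual base): " << size_case1 << '\n'
              << "case 2 (virtual dtor, plain base):    " << size_case2 << '\n'
              << "case 3 (plain dtor, virtual base):    " << size_case3 << '\n';
}
```

On the 32-bit target from the question these are reported as 16, 12 and 12. On a 64-bit target the numbers are larger, because pointers are 8 bytes and padding is added, but case 1 should still come out as the largest of the three.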

So yes, virtual functions have a cost, and virtual inheritance has a cost. The memory cost of virtual inheritance in this case is one more pointer per object.

In designs with many virtual base classes, the memory cost per object might be proportional to the number of virtual base classes, or not; we would need to discuss specific class hierarchies to estimate the cost.

In designs without multiple inheritance or virtual base classes (or even virtual functions), you might have to emulate many things automatically done by the compiler for you, with a bunch of pointers, possibly pointers to functions, possibly offsets... this could get confusing and error prone.

Circumgyration answered 5/8, 2012 at 7:36 Comment(2)
While writing objects to a file, I directly write the entire object to file using write(). So virtual adds 4 more bytes (or 8 more bytes depending on the architecture) while writing. Is there any way we can avoid this?Curator
Writing the bytes of the representation to files causes loads of issues. How do you want to read back the object?Circumgyration
M
2

What's going on is the extra overhead used to mark a class as having virtual members or involving virtual inheritance. How much extra depends on the compiler.

A word of caution: Making a class derive from a class for which the destructor is not virtual is usually asking for trouble. Big trouble.
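
As a small illustration of that caution, here is a sketch of the safe case. The counter derived_destroyed and the helpers destroy and check_destroy are hypothetical names used only here. Because ~base is virtual, deleting through a base pointer also runs the derived destructor; if ~base were not virtual, the same delete would be undefined behavior.

```cpp
#include <cassert>

int derived_destroyed = 0;  // counts calls to ~derived

struct base { int i = 0; virtual ~base() {} };

struct derived : base {
    int j = 0;
    ~derived() override { ++derived_destroyed; }
};

// Destroys an object through a pointer to its base class.
void destroy(base* p) { delete p; }

// Deleting a derived through base* must run ~derived exactly once.
void check_destroy() {
    int before = derived_destroyed;
    destroy(new derived);
    assert(derived_destroyed == before + 1);
}
```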

Misinterpret answered 5/6, 2012 at 19:34 Comment(0)
G
2

Possibly the extra 4 bytes are needed to record the class type at run time. For example:

#include <iostream>

class A {
public:
    virtual int f() { return 2; }
};

class B : virtual public A {
public:
    virtual int f() { return 3; }
};

int call_function(A *a) {
    // Here we don't know what a really points to (an A or a B),
    // so to call the correct method we need some run-time knowledge
    // of the type, and storage space to hold it (the extra 4 bytes).
    return a->f();
}

int main() {
    B b;
    A *a = &b;

    std::cout << call_function(a);
}
Goddamn answered 5/6, 2012 at 19:47 Comment(4)
My system shows the size of this VPTR, i.e. a void* pointer, as 4 bytes (assuming the 4 bytes you're referring to are the VPTR), and the size of int is also 4 bytes. So in total, in my 1st case, the size of the derived class should be only 12. But it's coming out to be 16. So where these 4 extra bytes are coming from is what I'm asking.Saffier
Seems like VPTR is added twice - for base and for derived.Goddamn
Exactly. That's where I'm stuck, because I haven't read anything like that before!Saffier
This does not quite explain why the size is 16!Circumgyration
D
0

The extra size is due to the vtable/vtable pointer that is "invisibly" added to your class in order to hold the member function pointer for a specific object of this class or its descendant/ancestor.

If that isn't clear, you'll need to do much more reading about virtual inheritance in C++.

Deoxyribose answered 5/6, 2012 at 20:17 Comment(1)
He clearly understands that there's a 4-byte VPTR. He's asking why B has an extra 4 bytes on top of the 4 that both have. And that's because of the way virtual inheritance works, which your answer doesn't address.Myelencephalon

© 2022 - 2024 — McMap. All rights reserved.