c++ virtual keyword vs overriding function

I am learning c++ and am learning about the virtual keyword. I have scoured the internet trying to understand it to no avail. I went into my editor and did the following experiment, expecting it to print out the base message twice (because I was under the impression that the virtual keyword is needed to override functions). However, it printed out two different messages. Can someone explain to me why we need the virtual keyword if we can simply override functions and still seemingly get polymorphic behavior? Perhaps someone can help me and other people in the future understand virtual vs. overriding. (The output I am getting is "I am the base" followed by "I am the derived").

#include <iostream>

using namespace std;
class Base{
public:
    void printMe(){
        cout << "I am the base" << endl;
    }
};
class Derived: public Base{
public:
    void printMe(){
        cout << "I am the derived" << endl;
    }
};
int main() {
    Base a;
    Derived b;
    a.printMe();
    b.printMe();
    return 0;
}
Grot answered 27/7, 2017 at 17:45 Comment(5)
Note: using namespace std; is a bad habit to get into and if you can stop now you might avoid a whole lot of headaches in the future. The std:: prefix is there for a reason: It avoids conflict with your own classes, structures and variables. – Seeing
Try Base *p = new Derived; p->printMe(); with and without the virtual. – Perice
To clarify - polymorphic behavior is achieved when accessing an object through a pointer or a reference to its base class. – Sensorium
Just a tiny nitpick: As a general rule, you should include the output of your test program in your question. – Holbein
@JesperJuhl made the suggested changes. thank you – Grot
G
15

Consider the following example. The important line to illustrate the need for virtual and override is c->printMe();. Note that the type of c is Base*, however due to polymorphism it is correctly able to call the overridden method from the derived class. The override keyword allows the compiler to enforce that a derived class method matches the signature of a base class's method that is marked virtual. If the override keyword is added to a derived class function, that function does not also need the virtual keyword in the derived class as the virtual is implied.

#include <iostream>

class Base{
public:
    virtual void printMe(){
        std::cout << "I am the base" << std::endl;
    }
};

class Derived: public Base{
public:
    void printMe() override {
        std::cout << "I am the derived" << std::endl;
    }
};

int main() {
    Base a;
    Derived b;
    a.printMe();
    b.printMe();
    Base* c = &b;
    c->printMe();
    return 0;
}

The output is

I am the base
I am the derived
I am the derived
Germaun answered 27/7, 2017 at 17:49 Comment(0)
M
6

With the code you have, if you do this

Derived derived;
Base* base_ptr = &derived;
base_ptr->printMe();

What do you think happens? It will not print "I am the derived", because the method is not virtual, so the call is resolved at compile time from the static type of the expression it is called through (here Base*, so Base::printMe runs). If you make the method virtual, the function that is called depends on the dynamic type of the object and not the static type.

Magel answered 27/7, 2017 at 17:49 Comment(0)
S
5

override is a specifier added in C++11 (strictly speaking an identifier with special meaning rather than a reserved keyword).

You should use it because:

  • the compiler will check if a base class contains a matching virtual method. This is important since some typo in the method name or in its list of arguments (overloads are allowed) can lead to the impression that something was overridden when it really was not.

  • if you use override on some methods of a class, some compilers can warn when another method overrides a base-class virtual without it (for example Clang's -Winconsistent-missing-override). This helps detect unintended overrides when names collide.

  • virtual doesn't mean "override". If a class doesn't use the override keyword, a derived-class method with the same name and signature as a base-class virtual method still overrides it even when you omit virtual; the override happens implicitly. Before C++11, developers wrote virtual in derived classes to signal their intent to override. Simply put, virtual means: this method can be overridden in subclasses.

Sixtyfour answered 27/7, 2017 at 18:1 Comment(2)
"virtual doesn't mean "override". If you omit it, overriding will still work."? I presume you mean if you omit it in the derived classes not if you omit in the baseSenter
yes this is what I meant. I've improved text to be more clear.Sixtyfour
Q
3

virtual means, "This is NOT REALLY a C function, i.e. a series of pushes of arguments onto the stack, followed by a jump to a SINGLE unchanging address of the function body."

Instead, it's this other beast that looks in a table at runtime for the address of the function body to execute. In typical implementations each class with virtual functions gets its own table, with one entry per virtual function, and each object carries a hidden pointer to its class's table. The table of function pointers is called a vtable. This is a RUNTIME mechanism for polymorphism that injects extra code to do this lookup and then dispatch to the appropriate specialized version of the function body.

Furthermore, to get this vtable dispatch you access your object through a POINTER or a REFERENCE to a base class, as opposed to direct access by value (a plain variable), i.e. Foo* foo{makeFoo()}; foo->someMethod() vs. Foo foo{}; foo.someMethod(). With a plain variable the compiler already knows the exact type, so there is nothing to look up. So another dereference right from the get-go is required to use this technique.

Here's the neat part: these pointers can point to any objects of derived classes as well, so if you have a class FooChild that inherits from FooParent, you can use a FooParent * to point to a FooParent OR a FooChild.

When the call is made to the method, instead of just doing the normal C thing of preparing the arguments on the stack, then jumping to the body of barMethod(), it does some runtime work first to look up one of SEVERAL DIFFERENT implementations of barMethod that are individualized per class. That lookup goes through the vtable. Each class in the hierarchy has its own table, whose barMethod entry says where the function body REALLY is for that particular class, since they can have different ones, EVEN IF we are using FooParent * to point to instances of any of them.

But here's why we would want to do that in the first place: suppose virtual does not exist. And you, the programmer, want to handle a bunch of objects that come from a class hierarchy. Well, you'd end up pretty much coding the same thing that the compiler injects for you by hand! In order to pass in your instances of these various classes into some function that you write to do stuff with them, you need a singularly sized type for the function call code to work. So, use pointers because pointers are always the same size on your machine (these days), no matter how differently sized the objects they point to are. Okay. So pointers it is. That's a sort of type erasure that is required to use virtual.

Then you need a switch statement or something to branch on the particular class it turns out to point to. But that'd be if you coded it by hand for each variation you wrote. That's silly. So quickly you'd realize you'd be better off with a table of pointers to your various versions of barMethod() to call. Then you could always just look up that same table from every variation, instead of rewriting handcoded switch statements and such. So you'd do that. You'd implement a table in which you have pointers to different barMethod()s for each of the classes in the hierarchy deriving from FooParent. They'd all have the SAME SIGNATURE (parameter list, return value, etc), but DIFFERENT BODIES, for each class.

You'd assign each class an integer i.d. or something like that and use that as the offset into the table. Maybe FooChildA and FooChildB are two different classes that both derive from FooParent for example, so you'd assign A to 0 and B to 1, or something like that. Then use those as offsets to jump into the table and get your pointer. That's how look up tables work in general. Once you got your pointer, you'd push all the arguments onto the stack, and then jump to that pointer. So virtual is just a keyword that instructs the compiler to inject all this crazy high-level code into your code for you so you don't have to manually do it.

The problem is, it's RUNTIME polymorphism, when often COMPILE time polymorphism can be used instead, via templates etc. It adds overhead to every virtual call in the hierarchy: an extra indirection through the table, and the compiler usually can't inline the call. That's actually just fine for non-hot loops. But for things that run all the time in your system (like every few milliseconds or more) that can be an unacceptable amount of bloat. In many cases, you could do the equivalent of all that table lookup stuff at compile time instead using metaprogramming so that runtime can be blazingly fast.

As for override, that confusing mess should have been in the language from the get-go and should be in the same textual position as the virtual keyword. Sadly, both of those "shoulds" were not done. So in the old days, you'd declare barMethod() in the most parent of the class hierarchy as virtual, and then also declare barMethod() in the derived classes as virtual. At some point this got to be super annoying due to weird bugs. The feature honestly isn't intuitive and is hard to teach or even remember after YEARS of knowing about it.

So we added override as a hint to the compiler so we can catch bugs. It just means: "not only is this function virtual, so do all that crazy vtable dispatching stuff, but in addition, this is a DERIVED re-definition of barMethod(), so check that I matched the parameters etc. perfectly with the version in the parent class." Without this check, if you accidentally failed to match the derived version's parameter list exactly with the parent's version, then instead of overriding the parent version, the compiler would just say, "Oh, a totally new member function with different parameters (and, if it is marked virtual, the root of a new virtual hierarchy). Must be a new overload."

I realize that's a super confusing statement. But basically, if you have barMethod() and barMethod(int) and barMethod(int, char*) and so forth, these are all DIFFERENT functions with no real relationship to each other. It's as if each had a different name. You can think of it that way in your head. It's essentially how the compiler itself thinks of it, with name mangling. So if you then made them virtual, you might think that declaring them in various classes in the hierarchy would put them into a single member function virtual hierarchy as well. But it doesn't. If you declare them with the override keyword instead, the compiler would notice that barMethod(int) override and barMethod(int, char*) override have no relationship to anything in FooParent, which only has barMethod() with no parameters. But they are supposedly overriding something. ¡COMPILER ERROR! And that's good. You want that compiler error, or else your code goes out to customers and looks like it's working but absolutely isn't.

The point of virtual is to allow you to use a SINGLE POINTER TYPE to represent any instances of an entire hierarchy of classes, but do different things for each of them, potentially. That only works if every derived redefinition really overrides the root's virtual function (a matching redefinition is virtual automatically, whether or not you repeat the keyword). And override makes sure they aren't accidentally creating new class hierarchy roots.

In modern C++, the usual guidance is that writing both virtual and override is redundant, and that it made it harder to visually grep which barMethod()s were the root version, and which ones were derived. And so the advice became, "you can drop the virtual keyword for the derived redefinitions and JUST use override." This is considered the only proper way to speak nowadays.


struct FooParent
{
    // The root has virtual
    virtual void barMethod(){ /* body */ } // or `=0` for "pure virtual"
};

// Original way of doing it. Just use virtual again, but this isn't the root now. This is a derived class.
struct FooChild_OldSchool : FooParent
{
    virtual void barMethod(); // Total trashmouth. Bug prone.
};

struct FooChild_OverrideDays : FooParent
{
   virtual void barMethod() override; // Naughty mouth. Using both.
};

struct FooChild_NonTrashyWay2020 : FooParent
{
  void barMethod() override; // Prim and proper mouth. Using only override in the derived class.
};

Bizarrely though, override sits in a different location syntactically, AFTER the parameter list, instead of before it. As far as I can tell this is really illogical. I really wish that we would fix this and allow override to go in the same place virtual does, at the beginning of the declaration, or better yet, let virtual go where override does, after the parameter list. As it is now, it's annoyingly inconsistent and confusing, imo. And I say all that because I believe these things make it unteachable if we don't admit they are warts. Because when you are learning a new language, you really need a more fluent speaker to say, "hey this is weird and warty. Don't worry about it. It's not because you're dumb. It's just because our language is evolved and wonky."

I wish it was like this...

// Wishful syntax only: neither of these compiles in real C++.
struct FooChild_HowIWishItWas : FooParent
{
  override void barMethod();
};

// OR EVEN BETTER! Allow us to change the location of virtual!
struct FooParent_HowIWishItWasEvenMore
{
   void barMethod() virtual;
};

But it isn't. That's maybe how you can think of it internally though, and then just remember to add this weird wonkiness syntactically when you're actually typing the code. Wonder whether a paper on this would survive 5 minutes. Hmm.

Quartan answered 7/7, 2020 at 4:23 Comment(1)
regarding vtable dispatch mechanism - "If a derived class is handled using pointer or reference to the base class, a call to an overridden virtual function would invoke the behavior defined in the derived class." [en.cppreference.com/w/cpp/language/virtual] – Subclass
S
2

You're not seeing the behaviour here because you've declared b to be of type Derived so the compiler knows what functions to use. In order to expose why virtual is necessary you need to mix things up:

int main() {
    Base a;
    Base *b = new Derived();

    a.printMe();
    b->printMe();

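    // Note: deleting a Derived through a Base* is only well-defined
    // if Base has a virtual destructor.
    delete b;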

    return 0;
}

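Now b is of type Base*, which means a call through it uses the functions declared on Base, unless they are virtual, in which case the virtual function table selects the version for the object's real type. With your non-virtual printMe, b->printMe() prints "I am the base" even though the object is a Derived. You can fix it by declaring printMe virtual in Base.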

Seeing answered 27/7, 2017 at 17:48 Comment(5)
Would you be able to explain your usage of the new keyword on line 3? I am a bit rusty on how new works in this context. I know you're creating a pointer of type Base; are you then assigning it enough memory to hold a Derived object? How does line 3 work? – Grot
That's just a straight-up C++ allocation using new. It's a lot like C's malloc but with more intelligence built-in, plus it calls the constructor automatically. A good C++ reference book should cover the basics of new and delete. If you don't have one, the one by the author of C++ is a good place to start. – Seeing
@tadman: As I commented on the other answer, I think we're better off not using naked new and delete in example code for newcomers, given that they're generally seen as bad practice in modern C++. A reference (or b = &a) would make the point just as well. – Fermat
@TristanBrindle That's a valid point, but not understanding what new does is a big blind-spot in understanding C++ as a whole. That discussion about "how objects are made" will have to come about eventually. – Seeing
@tadman: objects are made on the stack or with std::make_unique or std::make_shared (if they need to be on the heap). Using raw new/delete is for background understanding and legacy code. – Schear
P
0

I think your question is why someone would use a Base class pointer to call a derived class's function at all.

One such case is when you want a single common function that works for every derived class in your program. You don't want to write the same function again for each derived class type. See below

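#include <iostream>

class Base {
public:
    virtual void printfunc() { std::cout << "this is base class\n"; }
};

class Derived : public Base {
public:
    // override asks the compiler to check that this matches Base::printfunc
    void printfunc() override { std::cout << "this is derived class\n"; }
};

// One function serves every class derived from Base
void printthis(Base* ptr)
{
    ptr->printfunc();
}

int main()
{
    Derived func;
    printthis(&func);   // prints "this is derived class"
    return 0;
}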
Photomural answered 18/9, 2019 at 7:11 Comment(0)
