How can I determine if a compiler uses early or late binding on a virtual function?

Asked 30/9, 2011 at 13:34 Answered 30/9, 2011 at 14:34

Solved c++ compiler-construction late-binding early-binding

I have the following code:

#include <iostream>
#include <string>
using namespace std;

class Pet {
public:
  virtual string speak() const { return ""; }
};

class Dog : public Pet {
public:
  string speak() const { return "Bark!"; }
};

int main() {
  Dog ralph;
  Pet* p1 = &ralph;
  Pet& p2 = ralph;
  Pet p3;

  // Late binding for both:
  cout << "p1->speak() = " << p1->speak() << endl;
  cout << "p2.speak() = " << p2.speak() << endl;

  // Early binding (probably):
  cout << "p3.speak() = " << p3.speak() << endl;
}

I have been asked to determine whether the compiler uses early or late binding for the final function call. I have searched online but have found nothing to help me. Can someone tell me how I would carry out this task?

Pennate answered 30/9, 2011 at 13:34 Comment(5)

By the way, which exact definitions of late and early binding are you using? Probably not the ones from Wikipedia (en.wikipedia.org/wiki/Early_binding)? – Urbanity 30/9, 2011 at 13:47

@PlasmaHH: hmm. the link says early binding, but the page's title is late binding ;) – Pusey 30/9, 2011 at 13:59

@BlackBear: It's Wikipedia. Note the " (Redirected from Early binding)". The point is that the OP probably means dynamic vs. static dispatch. – Urbanity 30/9, 2011 at 14:0

I'm not sure what you mean. Basically, my understanding is that early binding is done at compile time and late binding at runtime. – Pennate 30/9, 2011 at 14:2

There's an intermediate case: speculative binding. Make a direct call to the predicted function, check the dynamic type, and if necessary (mispredicted) do a virtual call. – Farly 30/9, 2011 at 14:19
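The speculative binding described in the comment above is something a compiler would do internally, but the idea can be sketched by hand. This is only an illustration, assuming the Pet and Dog classes from the question; the names speak_speculatively and check_speculative_binding are made up for the sketch, and a real compiler would perform the type check at a lower level than typeid.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

class Pet {
public:
  virtual ~Pet() = default;
  virtual std::string speak() const { return ""; }
};

class Dog : public Pet {
public:
  std::string speak() const override { return "Bark!"; }
};

// Hypothetical helper: guess that p is a Dog, verify the guess, and fall
// back to an ordinary virtual call when the guess is wrong.
std::string speak_speculatively(const Pet& p) {
  if (typeid(p) == typeid(Dog)) {
    // Guess confirmed: the qualified name makes this a non-virtual call.
    return static_cast<const Dog&>(p).Dog::speak();
  }
  // Mispredicted: use normal virtual dispatch.
  return p.speak();
}

// Hypothetical self-check: both paths must return what a plain virtual
// call would return.
void check_speculative_binding() {
  Dog dog;
  Pet pet;
  assert(speak_speculatively(dog) == "Bark!");
  assert(speak_speculatively(pet) == "");
}
```

The observable result is the same as with a plain virtual call; only the route taken to reach the function differs.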

You can look at the disassembly, to see whether it appears to be redirecting through a vtable.

The clue is whether it calls directly to the address of the function (early binding) or calls a computed address (late binding). The other possibility is that the function is inlined, which you can consider to be early binding.

Of course the standard doesn't dictate the implementation details, so there may be other possibilities, but that covers "normal" implementations.
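To make the disassembly easier to read, it may help to isolate each kind of call in its own small function. The following is a minimal sketch, assuming the Pet and Dog classes from the question; the helper names speak_via_pointer and speak_on_object are made up here.

```cpp
#include <cassert>
#include <string>

class Pet {
public:
  virtual ~Pet() = default;
  virtual std::string speak() const { return ""; }
};

class Dog : public Pet {
public:
  std::string speak() const override { return "Bark!"; }
};

// Hypothetical helper: the dynamic type of *p is unknown inside this
// function, so a call through a computed address is the likely outcome.
std::string speak_via_pointer(const Pet* p) {
  assert(p != nullptr);
  return p->speak();
}

// Hypothetical helper: the object is exactly a Pet, so the compiler is
// free to emit a direct call (or to inline the function entirely).
std::string speak_on_object() {
  Pet p3;
  return p3.speak();
}
```

With GCC, for example, compiling with g++ -O0 -S writes the assembly to a .s file. In the two function bodies, a direct call names the mangled Pet::speak symbol, while a virtual call goes through an address loaded from the object. The exact instructions depend on the compiler, the target, and the optimization level.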

Mouser answered 30/9, 2011 at 13:42 Comment(3)

I have produced the assembly code for the program, but I'm not sure what the v-table creation looks like. Do vtables have a typical name to look out for? – Pennate 30/9, 2011 at 13:58

If you don't see a vtable, you might see a __vptr pointer instead. – Farly 30/9, 2011 at 14:1

@Bap: vtable creation has nothing to do with it. I'm saying look at the code for the call, and see whether it seems to be calling a constant-fixup address, or loading some value out of a location that depends somehow on the contents of the object (especially the contents of the first sizeof(void*) bytes of the object, which is where the virtual pointer usually lives). As MSalters says, you might catch sight of a name of some internal working of the compiler, perhaps __vptr, but that depends on the implementation and on the disassembler. – Mouser 30/9, 2011 at 14:22
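A rough way to see that the hidden virtual pointer mentioned above exists at all is to compare the size of a class with and without a virtual function. This is a sketch under the assumption of a typical vptr-based implementation; the standard does not promise any particular size difference, and the names Plain, Poly, hidden_bytes and check_hidden_pointer are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Same data member, no virtual functions.
struct Plain {
  unsigned x;
  void foo() const {}
};

// Same data member, one virtual function.
struct Poly {
  unsigned x;
  virtual void foo() const {}
};

static_assert(!std::is_polymorphic<Plain>::value, "Plain has no virtual functions");
static_assert(std::is_polymorphic<Poly>::value, "Poly has a virtual function");

// Hypothetical helper: extra bytes the implementation added to Poly,
// usually the vptr plus any padding.
std::size_t hidden_bytes() {
  return sizeof(Poly) - sizeof(Plain);
}

// Hypothetical self-check; holds on typical implementations, but it is
// not guaranteed by the standard.
void check_hidden_pointer() {
  assert(sizeof(Poly) > sizeof(Plain));
}
```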

You can always use a hack :D

//...
Pet p3;
memset(&p3, 0, sizeof(p3));
//...

If the compiler does use a vtbl pointer, guess what is going to happen :>

p3.speak()  // here

Oden answered 30/9, 2011 at 14:34 Comment(0)

Look at the generated code. E.g. in Visual Studio you can set a breakpoint, then right-click and select "Go To Disassembly".

Zymotic answered 30/9, 2011 at 13:39 Comment(1)

true for the general case of the question – Gleda 30/9, 2011 at 13:41

It uses early binding. You have an object, p3, of type Pet. While Pet is a base class with a virtual function definition, the type of p3 is concrete and known at compile time, so the compiler doesn't have to consider the virtual function mapping to derived classes.

This is much the same as if you called speak() in the Pet constructor: even when making derived objects, while the base class constructor is executing, the type of the object is that of the base, so the call resolves to the base type's version rather than the derived override.

Basically, early binding is compile time binding and late binding is run-time binding. Run time binding is only used in instances where the compiler doesn't have enough type information at compile time to resolve the call.
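The constructor comparison above can be checked directly, because the language guarantees which function is called there. A small sketch with made-up names (Animal, Cat, heard_in_ctor, check_ctor_dispatch), standing in for the Pet and Dog classes from the question:

```cpp
#include <cassert>
#include <string>

struct Animal {
  std::string heard_in_ctor;
  // While this constructor runs, the object is still only an Animal,
  // so this call resolves to Animal::speak even when building a Cat.
  Animal() { heard_in_ctor = speak(); }
  virtual ~Animal() = default;
  virtual std::string speak() const { return "generic"; }
};

struct Cat : Animal {
  std::string speak() const override { return "Meow"; }
};

// Hypothetical self-check of the behavior described above.
void check_ctor_dispatch() {
  Cat c;
  assert(c.heard_in_ctor == "generic");  // base version ran in the constructor
  const Animal& ref = c;
  assert(ref.speak() == "Meow");         // normal virtual dispatch afterwards
}
```

This shows which function is selected, not which call mechanism the compiler emitted for it.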

Upbow answered 30/9, 2011 at 13:39 Comment(16)

That is just a possible optimization; the compiler is not forced to do it that way. As such, the only way to be sure is to look at the assembly (it might even change with compiler settings). – Urbanity 30/9, 2011 at 13:41

I'm almost positive that in situations where the type is clearly the base type, the language/compiler will always execute the base type function and the v-table will not be considered. The book Effective C++ usually looks at all catch-22's and it didn't present anything when covering that at least :/ Where did you learn it's just an optimization for the compiler? – Upbow 30/9, 2011 at 13:43

This is true for the p3.speak() call, but it cannot be said in general for the other calls without inspecting the generated code, even though the virtual call is very likely to be optimized away in both cases. – Carson 30/9, 2011 at 13:44

I'd agree with Nicola's comment :) the other cases are definitely a lot less clear than p3.speak() in this regard. I wouldn't be sure what happened in them - though I can take a pretty good guess. – Upbow 30/9, 2011 at 13:45

@PlasmaHH: In the last case the compiler must perform static binding, because the call is performed on a class instance, rather than a pointer or a reference. – Carson 30/9, 2011 at 13:47

There's no difference in the "observable behavior" of the program whether it uses the virtual or non-virtual mechanism, so it's safely in the realm of "up to the implementation to decide how to get it done". Since in this case you expect the early binding to be more efficient, that makes it an optimization issue. You're certainly right that in the case of p3 there's no obvious reason to use the virtual mechanism. – Mouser 30/9, 2011 at 13:48

This is an excerpt from the book I'm using: The compiler knows the exact type and that it’s an object, so it can’t possibly be an object derived from Pet – it’s exactly a Pet. Thus, early binding is probably used. However, if the compiler doesn’t want to work so hard, it can still use late binding and the same behavior will occur. This would lead me to believe that late binding is more efficient. Is this wrong? – Pennate 30/9, 2011 at 13:48

@NicolaMusatti: Where does the standard state that? At the assembly level the compiler handles things via pointers anyway, since it at least has to pass this as a pointer somehow (often in a register). – Urbanity 30/9, 2011 at 13:49

@Nicola: be careful - logically, it's a non-virtual call since the dynamic type of p3 is certainly the same as its static type, whereas logically the calls through p1 and p2 are virtual calls since the dynamic type is different from the static type. However, there's a difference between which type of call it is and what call mechanism the implementation actually uses. It's permitted to optimise the virtual calls since data flow analysis can tell the dynamic type, and it's permitted to "pessimise" the non-virtual call since a compiler is permitted by the standard to waste time wherever it likes. – Mouser 30/9, 2011 at 13:52

OK, I'll rephrase: only a very silly implementation would use late binding for a non-virtual call. At the assembler level early binding translates to a direct load of the function address, while late binding requires calculating where the function address is stored and loading it from that location. – Carson 30/9, 2011 at 13:57

@BapJohnston Late binding is less efficient by a long shot. In late binding instead of having a direct call, the v-table has to be navigated to find the correct type's variation of the function to call. It's an extra level of indirection (at minimum). Also, the compiler may make interesting optimizations with early binding since it knows all the information ahead of time - it can't do much with late binding. – Upbow 30/9, 2011 at 13:58

No, late binding is less efficient than early binding, as I explained in my previous comment. – Carson 30/9, 2011 at 14:3

@SteveJessop: The "as if" rule is to be taken "cum grano salis" even if the standard doesn't explicitly state so. An implementation that worked as you suggest would have faded into oblivion a long, long time ago. – Carson 30/9, 2011 at 14:6

@Bap: you're saying that late binding is "more efficient", meaning that the compiler has to do less work. Whether it's actually true that the compiler has to do extra work in order to make a non-virtual call here is debatable, but even if it is true, when we talk about "efficiency" we generally mean the efficiency of the code emitted, that is, the run-time cost. The time it takes to compile the code is a separate thing, and in C++ you're usually happy for the compiler to take some time to produce faster or smaller emitted code. – Mouser 30/9, 2011 at 14:12

@Nicola: sure, but the question is how to find out what the compiler did, whereas w00te's answer and your comments are talking about what the compiler would do, if it's any good (and using the word "must" to describe that state of affairs). Predicting what the compiler will do is completely separate from confirming whether the prediction is correct. – Mouser 30/9, 2011 at 14:15

To me, the question sounded more like an interview question where they wanted a quick answer on what would happen, but I'll concede that if you go into the detailed state of things, it probably is a compiler decision. Honestly, I don't know whether the standard addresses this or not. – Upbow 30/9, 2011 at 14:28

In fact the compiler has no obligation to use either one particularly, just to make sure that the right function is called. In this case, your object is of the concrete type Pet, so as long as Pet::speak is called the compiler is "doing the right thing".

Now, given that the compiler can statically see the type of the object, I suspect that most compilers will optimize away the virtual call but there is no requirement that they do so.

If you want to know what your particular compiler is doing the only way is to consult its documentation, source code, or the generated disassembly.
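What can be verified from inside the language is which function ends up being called, as opposed to the mechanism used to reach it. A minimal sketch, assuming the Pet and Dog classes from the question (the helper names speak_virtual and speak_as_pet are made up): a call with a qualified name such as p->Pet::speak() is never dispatched virtually, so comparing it with the unqualified call shows when the dynamic type matters.

```cpp
#include <cassert>
#include <string>

class Pet {
public:
  virtual ~Pet() = default;
  virtual std::string speak() const { return ""; }
};

class Dog : public Pet {
public:
  std::string speak() const override { return "Bark!"; }
};

// Unqualified call: the function is chosen by the dynamic type of *p.
std::string speak_virtual(const Pet* p) {
  assert(p != nullptr);
  return p->speak();
}

// Qualified call: the class name suppresses virtual dispatch, so this
// always runs Pet::speak, whatever p really points to.
std::string speak_as_pet(const Pet* p) {
  assert(p != nullptr);
  return p->Pet::speak();
}
```

For a Dog, speak_virtual returns "Bark!" and speak_as_pet returns the empty string; for a plain Pet both agree, which is why the compiler is free to pick either mechanism for p3.speak().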

Munificent answered 30/9, 2011 at 13:57 Comment(1)

I have looked at the assembly code, but I just can't see anything resembling vtable creation. I think I need to look up the documentation to see what I should be looking for. – Pennate 30/9, 2011 at 14:9

I just thought of a way to tell at runtime, without guesswork. You can simply overwrite the vptr of your polymorphic objects with 0 and see whether the method is still called or you get a segmentation fault. This is what I get for my example:

Concrete: Base
Concrete: Derived
Pointer: Base
Pointer: Derived
DELETING VPTR!
Concrete: Base
Concrete: Derived
Segmentation fault

Where Concrete: T means that calling the virtual member function of T through a concrete type was successful. Analogously, Pointer: T says that calling the member function of T through a Base pointer was successful.

For reference, this is my test program:

#include <iostream>
#include <string.h>

struct Base {
  unsigned x;
  Base() : x(0xEFBEADDEu) {
  }
  virtual void foo() const {
    std::cout << "Base" << std::endl;
  }
};

struct Derived : Base {
  unsigned y;
  Derived() : Base(), y(0xEFCDAB89u) {
  }
  void foo() const {
    std::cout << "Derived" << std::endl;
  }
};

template <typename T>
void dump(T* p) {
  for (unsigned i = 0; i < sizeof(T); i++) {
    std::cout << std::hex << (unsigned)(reinterpret_cast<unsigned char*>(p)[i]);
  }
  std::cout << std::endl;
}

void callfoo(Base* b) {
  b->foo();
}

int main() {
  Base b;
  Derived d;
  dump(&b);
  dump(&d);
  std::cout << "Concrete: ";
  b.foo();
  std::cout << "Concrete: ";
  d.foo();
  std::cout << "Pointer: ";
  callfoo(&b);
  std::cout << "Pointer: ";
  callfoo(&d);
  std::cout << "DELETING VPTR!" << std::endl;
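  // Zero the first pointer-sized bytes, where the vptr usually lives
  // (implementation-specific; formally this is undefined behavior).
  memset(&b, 0, sizeof(void*));
  memset(&d, 0, sizeof(void*));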
  std::cout << "Concrete: ";
  b.foo();
  std::cout << "Concrete: ";
  d.foo();
  std::cout << "Pointer: ";
  callfoo(&b);
  std::cout << "Pointer: ";
  callfoo(&d);
  return 0;
}

Fowl answered 30/9, 2011 at 14:34 Comment(0)
