Why not have all the functions as virtual in C++?

I know that virtual functions have an overhead of indirection when calling a method. But I guess with the speed of modern architectures it is almost negligible.

  1. Is there any particular reason why all functions in C++ are not virtual as in Java?
  2. From my knowledge, declaring a function virtual in a base class is both sufficient and necessary. Now when I write a parent class, I might not know which methods will get overridden. So does that mean that someone writing a child class would have to edit the parent class? That sounds inconvenient and sometimes not possible.

Update:
Summarizing from Jon Skeet's answer below:

It's a trade-off: explicitly making someone realize that they are inheriting functionality (which carries risks of its own; see Jon's response) and gaining potentially small performance wins, at the cost of less flexibility, more code changes, and a steeper learning curve.

Other reasons from different answers:

Virtual functions generally cannot be inlined, because inlining happens at compile time while the target of a virtual call is only known at runtime. This has a performance impact when you expect your functions to benefit from inlining.

There might be other reasons, and I would love to know and summarize them.

Voroshilovgrad answered 7/7, 2011 at 6:24 Comment(3)
It is also possible to inline functions which are not virtual, which allows for lots of compiler optimizations that wouldn't be available in cases where the function is defined as virtual.Yocum
Hi Thoman, can you explain why it won't be possible to inline virtual functions? Is it a limitation of available compilers, or is there a theoretical blocker? How does the JVM optimize it?Voroshilovgrad
@Voroshilovgrad In virtual functions the decision about what method to call is made at runtime. With inline functions the method's body is compiled into the caller, a decision that has to be made at compile time.Petula
78

There are good reasons for controlling which methods are virtual beyond performance. While I don't actually make most of my methods final in Java, I probably should... unless a method is designed to be overridden, it probably shouldn't be virtual IMO.

Designing for inheritance can be tricky - in particular it means you need to document far more about what might call it and what it might call. Imagine if you have two virtual methods, and one calls the other - that must be documented, otherwise someone could override the "called" method with an implementation which calls the "calling" method, unwittingly creating a stack overflow (or infinite loop if there's tail call optimization). At that point you've then got less flexibility in your implementation - you can't switch it round at a later date.
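
A hypothetical sketch of that trap (all class and method names invented here): one virtual method delegates to another, and an override unknowingly closes the cycle.

```cpp
#include <iostream>
#include <string>

class Logger {
public:
    virtual ~Logger() = default;

    // Undocumented implementation detail: log() delegates to logLine().
    virtual void log(const std::string& msg) { logLine(msg + "\n"); }
    virtual void logLine(const std::string& msg) { std::cout << msg; }
};

class BadLogger : public Logger {
public:
    // The author doesn't know log() already calls logLine(), so this
    // override calls back into log(): unbounded mutual recursion, and
    // eventually a stack overflow.
    void logLine(const std::string& msg) override { log(msg); }
};

// BadLogger().log("boom");  // never returns
```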

Note that C# is a similar language to Java in various ways, but chose to make methods non-virtual by default. Some people aren't keen on this, but I certainly welcome it - and I'd actually prefer that classes were uninheritable by default too.

Basically, it comes down to this advice from Josh Bloch: design for inheritance or prohibit it.

Crowned answered 7/7, 2011 at 6:29 Comment(20)
I haven't had much experience with virtual functions, but couldn't you cause the same situation using only one virtual function and a different, non-virtual function in a subclass, as long as you call the non-virtual function on an object of the subclass itself, without any casting to the superclass? I suppose my knowledge of the specifics of virtualization is still pretty vague.Kerr
But then C++ lets you override virtual functions and have them called (sometimes, depending). It's that decision which seems problematic to me.Favoritism
@Winston: Which decision? Yes, you can override virtual functions - that's the whole point of them being virtual.Crowned
+1 the principle to follow for everything is: Always make the common case the default. The designers of Java (very reasonably) thought that virtual methods should be the default; however, that turned out to be a mistake (see "Effective Java" for more info). C# made the correct choice from a design standpoint; however, non-virtual-by-default methods make unit testing an extreme pain because classes can no longer be mocked, like in Java. C# needs facilities to make unit testing reasonable (mocking classes, access to private methods for testing classes, etc.)Dyke
@BlueRaja: I don't find unit testing to be an extreme pain, so long as there are appropriate interfaces set up for dependencies. It would be nice if there were a simpler way to express an interface, admittedly, to avoid the duplication between interface and implementation class.Crowned
@Jon: There's that; and the fact that testing requires every class and method to be tested to be public (making access modifiers pointless); and several other problems. See programmers.stackexchange.com/questions/14856/…Dyke
@BlueRaja: Well, public or internal, given InternalsVisibleTo. That's a significant difference. And only those methods to be tested directly. Some people believe it's better to only test the "naturally public" API; all private implementation details should feed that, after all. That sounds like a nice aim to me, but it's not always pragmatic.Crowned
@Jon Skeet, sorry I meant to refer to the ability to override nonvirtual functions.Favoritism
@Winston: Can you actually override them, or can you just hide them?Crowned
@Jon Skeet, I suppose it might be more correct to say that they are hidden. However, it still seems ill-advised to allow a subclass to "hide" a superclass's methods like that.Favoritism
@Winston: There are times when it can be appropriate... e.g. when declaring that you return a more specific value. That's the typical reason in C#, anyway.Crowned
@Jon Skeet, sure, but since that function's not virtual, whether or not it or the original version will be called depends on whether the compiler knows the type at compile time.Favoritism
@Winston: Yes. Sometimes that can be exactly the point: expose a slightly richer API in a subclass, because the superclass is a bit too abstract. Usually one method would end up delegating to the other, of course. This should be relatively rare, but it's not always a bad idea, is all I'm saying.Crowned
@Jon Skeet, I guess I was thinking primarily of functions which have the same signature (or close enough) that they could be virtual overloads. It's far too easy to try to overload a method which isn't virtual. But I see that there could be uses for it.Favoritism
Quick question: Don't all the arguments against marking all classes final apply here? You're explicitly preventing anyone from extending the class, which means that to add functionality they have to copy the class (bad) or modify the source code.Darice
@jon skeet .. the stack overflow problem would anyway arise if you do decide to mark 2 methods as virtual .. are you suggesting that methods are not virtual by default because 2 virtual methods basically pose a safety gap [or some other word here] ..and hence in general it's better to avoid this issue at all ..and let devs manage things related to virtual functions and know how they work internally etc ? .. also how does java go about avoiding this problem ..given that some big production systems do use java ?Voroshilovgrad
@codeObserver: I'm saying that when you've got virtual methods, you've got to document implementation details, as they will affect anyone creating a subclass. That reduces the flexibility later. In Java people tend to muddle through, but I just prefer the idea of only allowing inheritance when you've thought it through.Crowned
okay .. so it's a trade-off between explicitly making someone realize that they are inheriting functionality [which has potential risks in itself] [and potential small performance gains] with a trade-off for less flexibility/more code changes/steeper learning curve .. [not judging if it's good or bad ..but just calling out that that's the difference between the designs ?]Voroshilovgrad
@codeObserver: Yes - basically inheritance is a very powerful tool, which is easy to abuse, just like many powerful tools.Crowned
great thanks .. I am summarizing that in my question and accepting this as one of the valid answers for the design. Thanks again for helping me understand this !Voroshilovgrad
54
  1. One of the main C++ principles is: you only pay for what you use ("zero overhead principle"). If you don't need the dynamic dispatch mechanism, you shouldn't pay for its overhead.

  2. As the author of the base class, you should decide which methods should be allowed to be overridden. If you're writing both, go ahead and refactor what you need. But it works this way, because there has to be a way for the author of the base class to control its use.

Phenylalanine answered 7/7, 2011 at 6:27 Comment(2)
Correct me if I'm wrong, but I believe your point #2 is not correct. Making methods non-virtual in the base class does NOT prevent them from being overridden, so the author really can't decide which methods should be allowed to be overridden. It only prevents them from being called polymorphically (i.e. via base-class pointer). But they can still be overridden and can still be called when called directly from an instance of the derived class (or pointer to derived class).Williams
@DanielGoldfarb, non-virtual member functions can't be overridden, period. They can, however, be hidden - but that's a different thing. My point is that preventing overriding is yet another aspect of encapsulation. Hiding a member function will not change the behavior of the base class, and will not add dependencies that might limit the ability to change the base class in the future.Phenylalanine
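
A minimal sketch of the hide-versus-override distinction from this exchange (invented names): hiding affects only calls made through the derived type, while overriding changes behavior even through a base reference.

```cpp
#include <iostream>

struct Base {
    void greet()         { std::cout << "Base::greet\n"; }  // non-virtual
    virtual void speak() { std::cout << "Base::speak\n"; }  // virtual
    virtual ~Base() = default;
};

struct Derived : Base {
    void greet()          { std::cout << "Derived::greet\n"; }  // hides Base::greet
    void speak() override { std::cout << "Derived::speak\n"; }  // overrides Base::speak
};

int main() {
    Derived d;
    Base&   b = d;
    b.greet();  // "Base::greet"    - resolved statically; hiding is invisible here
    b.speak();  // "Derived::speak" - dispatched to the final overrider
    d.greet();  // "Derived::greet" - the hiding function, chosen statically
}
```
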
32

But I guess with the speed of modern architectures it is almost negligible.

This assumption is wrong, and, I guess, the main reason for this decision.

Consider the case of inlining. C++'s sort function performs much faster than C's otherwise similar qsort in some scenarios because it can inline calls to its comparator argument, while C cannot (due to the use of function pointers). In extreme cases, this can mean a performance difference of as much as 700% (Scott Meyers, Effective STL).

The same would be true for virtual functions. We’ve had similar discussions before; for instance, Is there any reason to use C++ instead of C, Perl, Python, etc?
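
A sketch of the contrast being described (illustrative only; the actual speedup depends on compiler, flags, and data):

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// qsort receives the comparator through a function pointer; since qsort
// itself is compiled separately, the comparison usually can't be inlined.
int compare_ints(const void* a, const void* b) {
    const int lhs = *static_cast<const int*>(a);
    const int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);
}

void sort_both_ways(std::vector<int>& v, std::vector<int>& w) {
    std::qsort(v.data(), v.size(), sizeof(int), compare_ints);

    // std::sort is a template: the comparator's type is baked into the
    // instantiation, so the compiler can inline each comparison.
    std::sort(w.begin(), w.end(),
              [](int lhs, int rhs) { return lhs < rhs; });
}
```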

Demon answered 7/7, 2011 at 6:35 Comment(8)
Yes, basically a virtual function cannot be inlined, nor can its argument passing be optimized.Eclogue
Even this statement is starting to be untrue... gcc, for instance, is capable of inlining through a function pointer. Presumably, this can be extended to a virtual method.Synod
gcc does inline virtual functions, provided that you're calling the function on an object of known dynamic type (because the virtual mechanism doesn't need to be used anyway). If you're sorting a container of Base then the dynamic type is known: if Base* then it isn't and you can start worrying about a performance hammering. With function pointers it's similar - if a call to qsort is inlined, then DFA might prove the value of the function pointer, in which case the call could be inlined, although I've never looked into how successful gcc is at doing that.Cairistiona
@Steve This is correct. I’m even surprised that the advantage of functors over function pointers still seems to hold, since this is apparently such an obvious optimisation. In fact, I suspect that e.g. calls to qsort are rarely inlined. sort, on the other hand is a template so for a given comparator its type is known inside sort even without inlining. I suspect that the same is true for the inlining of virtual function calls on known dynamic types.Demon
@Konrad: qsort is sometimes inhibited from inlining on account of being in a different TU, whereas sort is always available in the TU. It'd be slightly interesting to take implementations that do link-time optimization, and see whether they are any better or worse at inlining qsort and then inlining the comparator, than they are at inlining sort and its comparator (when sort is passed a function pointer rather than a functor object of user-defined type, to keep the comparison fair).Cairistiona
I can't vote this up this enough. For the people who need speed, virtual functions are a real problem, and have been measured to be so. For the other 98% of programmers, they probably don't need to use C++.Gaud
@Steve: yes, TUs are the key. For most people, the qsort source code won't be available, and link-time cross-object inlining is something I wouldn't expect to work as well as "normal" intra-TU compile-time inlining/optimisation.Hodgepodge
@Tony It should work just as well. But I agree that this is probably not (yet) reality.Demon
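
A compilable sketch of the known-dynamic-type point made in this thread (whether any particular compiler actually devirtualizes either call is implementation-dependent):

```cpp
#include <vector>

struct Base {
    virtual int f() const { return 1; }
    virtual ~Base() = default;
};

// Container of Base by value: every element's dynamic type is provably
// Base, so a compiler is free to devirtualize and inline f().
int sum_known(const std::vector<Base>& v) {
    int total = 0;
    for (const Base& b : v) total += b.f();
    return total;
}

// Container of Base*: the dynamic type is open, so the calls stay
// virtual unless whole-program analysis can prove otherwise.
int sum_unknown(const std::vector<Base*>& v) {
    int total = 0;
    for (const Base* p : v) total += p->f();
    return total;
}
```
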
14

Most answers deal with the overhead of virtual functions, but there are other reasons not to make every function in a class virtual, such as the fact that it changes the class from standard-layout to, well, non-standard-layout, and that can be a problem if you need to serialize binary data. That is solved differently in C#, for example, by making structs a different family of types than classes.
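
The standard-layout point can be checked directly (a compilable sketch):

```cpp
#include <type_traits>

struct Plain {          // no virtual functions: standard-layout, so its
    int    x;           // bytes can be mapped onto an external format
    double y;
};

struct WithVtable {     // one virtual function adds a vptr, and the
    virtual void f() {} // type is no longer standard-layout
    int    x;
    double y;
};

static_assert(std::is_standard_layout<Plain>::value,
              "Plain can be treated as raw binary data");
static_assert(!std::is_standard_layout<WithVtable>::value,
              "the vptr breaks standard layout");
```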

From the design point of view, every public function establishes a contract between your type and the users of the type, and every virtual function (public or not) establishes a different contract with the classes that extend your type. The greater the number of such contracts that you sign, the less room for change you have. As a matter of fact, there are quite a few people, including some well-known writers, who argue that the public interface should never contain virtual functions, as your commitment to your clients might be different from the commitments you require from your extensions. That is, the public interface shows what you do for your clients, while the virtual interface shows how others might help you do it.
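
That separation is often called the non-virtual interface (NVI) idiom; a minimal sketch with invented names:

```cpp
class Widget {
public:
    // Public and non-virtual: the contract with clients. Argument
    // checks and invariants live in exactly one place.
    void draw(int x, int y) {
        if (x < 0 || y < 0) return;  // invariant enforced for everyone
        doDraw(x, y);                // customization point delegated
    }
    virtual ~Widget() = default;

private:
    // Private and virtual: the contract with extenders. Derived classes
    // decide *how* to draw, never *when* drawing is legal.
    virtual void doDraw(int x, int y) = 0;
};
```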

Another effect of virtual functions is that they always get dispatched to the final overrider (unless you explicitly qualify the call), and that means that any function that is needed to maintain your invariants (think of the state of the private variables) should not be virtual: if a class extends it, it will have to either make an explicitly qualified call back to the parent or else it will break the invariants at your level.
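
The explicitly qualified call mentioned here looks like this (hypothetical names):

```cpp
#include <iostream>

struct Account {
    virtual void audit() { std::cout << "base invariant checks\n"; }

    void close() {
        // Qualified call: always runs Account::audit, even when a derived
        // class overrides audit(). An unqualified call here would dispatch
        // to the final overrider and could skip the base's checks.
        Account::audit();
    }

    virtual ~Account() = default;
};

struct SavingsAccount : Account {
    void audit() override { std::cout << "derived checks only\n"; }
};
```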

This is similar to the infinite loop/stack overflow example that @Jon Skeet mentioned, just in a different way: you have to document in each function whether it accesses any private attributes, so that extensions can ensure that the function is called at the right time. And that in turn means that you are breaking encapsulation and have a leaky abstraction: your internal details are now part of the interface (documentation + requirements on your extensions), and you cannot modify them as you wish.

Then there is performance... there will be an impact on performance, but in most cases it is overrated, and it could be argued that only in the few cases where performance is critical should you fall back and declare the functions non-virtual. Then again, that might not be simple in a shipped product, since the two interfaces (public + extensions) are already bound.

Crust answered 7/7, 2011 at 7:47 Comment(0)
8

You forget one thing. The overhead is also in memory: you add one virtual table per class and a pointer to that table in each object. Now if you have an object type with a significant number of expected instances, that is not negligible: for example, a million instances equals 4 megabytes of pointers alone (with 4-byte pointers). I agree that for a simple application this is not much, but for real-time devices such as routers it counts.
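
A sketch that makes the per-object cost visible (exact sizes are implementation-defined; a typical 64-bit compiler prints 4 and 16):

```cpp
#include <cstdio>

struct NoVirtual  { int a; };   // just the data

struct HasVirtual {             // data + hidden vptr
    int a;
    virtual ~HasVirtual() = default;
};

int main() {
    // The vptr typically adds a pointer's worth of size, plus alignment
    // padding, to every single instance.
    std::printf("%zu %zu\n", sizeof(NoVirtual), sizeof(HasVirtual));
}
```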

Eusebiaeusebio answered 7/7, 2011 at 11:37 Comment(1)
I'm working on embedded devices, some of which have something like 2 KB of RAM. On those you really want to avoid the per-object pointer overhead, and also the extra time cost of calling methods indirectly through an extra pointer. Good point!Hermann
6

I'm rather late to the party here, so I'll add one thing that I haven't noticed covered in other answers, and summarise quickly...

  • Usability in shared memory: a typical implementation of virtual dispatch has a pointer to a class-specific virtual dispatch table in each object. The addresses in these pointers are specific to the process creating them, which means multi-process systems accessing objects in shared memory can't dispatch using another process's object! That's an unacceptable limitation given shared memory's importance in high-performance multi-process systems.

  • Encapsulation: the ability of a class designer to control the members accessed by client code, ensuring class semantics and invariants are maintained. For example, if you derive from std::string (I may get a few comments for daring to suggest that ;-P) then you can use all the normal insert / erase / append operations and be sure that - provided you don't do anything that's always undefined behaviour for std::string like pass bad position values to functions - the std::string data will be sound. Someone checking or maintaining your code doesn't have to check if you've changed the meaning of those operations. For a class, encapsulation ensures freedom to later modify the implementation without breaking client code. Another perspective on the same statement: client code can use the class any way it likes without being sensitive to the implementation details. If any function can be changed in a derived class, that whole encapsulation mechanism is simply blown away.

    • Hidden dependencies: when you know neither what other functions are dependent on the one you're overriding, nor that the function was designed to be overridden, then you can't reason about the impact of your change. For example, you think "I've always wanted this", and change std::string::operator[]() and at() to consider negative values (after a type-cast to signed) to be offsets backwards from the end of the string. But, perhaps some other function was using at() as a kind of assertion that an index was valid - knowing it'll throw otherwise - before attempting an insertion or deletion... that code might go from throwing in a Standard-specified way to having undefined (but likely lethal) behaviour.
    • Documentation: by making a function virtual, you're documenting that it is an intended point of customisation, and part of the API for client code to use.

  • Inlining - code side & CPU usage: virtual dispatch complicates the compiler's job of working out when to inline function calls, and could therefore provide worse code in terms of both space/bloat and CPU usage.

  • Indirection during calls: even if an out-of-line call is being made either way, there's a small performance cost for virtual dispatch that may be significant when calling trivially simple functions repeatedly in performance critical systems. (You have to read the per-object pointer to the virtual dispatch table, then the virtual dispatch table entry itself - means the VDT pages are consuming cache too.)

  • Memory usage: the per-object pointers to virtual dispatch tables may represent significant wasted memory, especially for arrays of small objects. This means fewer objects fit in cache, and can have a significant performance impact.

  • Memory layout: it's essential for performance, and highly convenient for interoperability, that C++ can define classes with the exact memory layout of member data specified by network or data standards of various libraries and protocols. That data often comes from outside your C++ program, and may be generated in another language. Such communications and storage protocols won't have "gaps" for pointers to virtual dispatch tables, and as discussed earlier - even if they did, and the compiler somehow let you efficiently inject the correct pointers for your process over incoming data, that would frustrate multi-process access to the data. Crude-but-practical pointer/size based serialisation/deserialisation/comms code would also be made more complicated and potentially slower (see the sketch just after this list).
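
A small sketch of that last point, using a hypothetical wire format (a real protocol would also pin down endianness):

```cpp
#include <cstdint>
#include <cstring>

// Laid out exactly as the (hypothetical) protocol specifies. No virtual
// functions, so there is no vptr "gap" and the type stays trivially
// copyable: the incoming bytes can be copied straight in.
struct PacketHeader {
    std::uint16_t type;
    std::uint16_t length;
    std::uint32_t sequence;
};

PacketHeader parse_header(const unsigned char* wire_bytes) {
    PacketHeader h;
    std::memcpy(&h, wire_bytes, sizeof h);  // legal for trivially copyable types
    return h;
}
```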

Hodgepodge answered 8/7, 2011 at 1:53 Comment(0)
5

Pay per use (in Bjarne Stroustrup's words).

Dewie answered 7/7, 2011 at 6:32 Comment(0)
3

Seems like this question might have some answers: Virtual functions should not be used excessively - Why? In my opinion, the one thing that stands out is that it just adds more complexity in terms of knowing what can be done with inheritance.

Harmonicon answered 7/7, 2011 at 6:31 Comment(0)
2

Yes, it's because of performance overhead. Virtual methods are called using virtual tables and indirection.

In Java all methods are virtual and the overhead is also present. But, contrary to C++, the JIT compiler profiles the code at run-time and can inline those methods which don't make use of this property. So the JVM knows where virtual dispatch is really needed and where it is not, thus freeing you from making the decision on your own.
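
One way to picture that optimization is to write its result out by hand, expressed here in C++ for illustration (a sketch of the idea, not what any particular JVM emits):

```cpp
#include <typeinfo>
#include <vector>

struct Shape {
    virtual double area() const = 0;
    virtual ~Shape() = default;
};

struct Circle : Shape {
    double r;
    explicit Circle(double radius) : r(radius) {}
    double area() const override { return 3.14159265358979 * r * r; }
};

double total_area(const std::vector<const Shape*>& shapes) {
    double sum = 0.0;
    for (const Shape* s : shapes) {
        // Guarded devirtualization: a cheap exact-type check for the type
        // the profile says is hot, an inlined body on the fast path, and
        // an ordinary virtual call as the fallback.
        if (typeid(*s) == typeid(Circle)) {
            const Circle* c = static_cast<const Circle*>(s);
            sum += 3.14159265358979 * c->r * c->r;  // "inlined" Circle::area
        } else {
            sum += s->area();
        }
    }
    return sum;
}
```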

Emmerich answered 7/7, 2011 at 6:32 Comment(8)
+1: This is what I was going to say. The JVM can make decisions at runtime that a compiler cannot make statically.Paulita
In fact a JIT can do even better, it can inline a method that does use the property, but do a fast type-check for a type that very commonly occurs. It makes the virtual call if the type-check fails. So the code for obj.foo() ends up looking like a bit like if (obj.getClass() == Class.forName("BaseClass")) { /* inlined code from BaseClass.foo() */ } else { obj.foo(); };. Except that of course the call to getClass is inlined, to just grab a pointer out of the object, and the result of the call to forName is a pointer to a class object, and that value is inlined into the code too.Cairistiona
@Steve Jessop: I do recall an article (was it from an engineer at Azul, or someone directly responsible for the JVM? I can't remember) where the issue was discussed, but I never did see anything with such a detailed example. Could you post some reference where I could read more?Emmerich
@Rekin: sorry, I don't remember the source, that was just from my general knowledge of the kinds of things JITs can do. Whether any current version of any JIT actually does it is another matter...Cairistiona
@Steve Jessop: So, in theory a JIT can do that optimization. But the same check can in theory be inserted by a C++ compiler, especially with PGO. So I'd disagree with "a JIT can do even better".Jemmie
@MSalters: I meant even better than Rekin describes in this answer, not even better than a C++ compiler could in theory do. I was not attempting to compare a Java JIT with a C++ compiler. But since you ask, a JIT has access to runtime information that can tell it, for example, "99.9%+ of the time this virtual call is made from this call site, the type is Base. Therefore it would be good to inline the function from Base". A profile-guided C++ optimizer could have the same information. I used this example because I'm pretty sure JITs have used it, I just don't know if they still do.Cairistiona
As you imply, there's no reason in principle why a C++ compiler can't emit self-modifying code that plays every trick in the JIT book to do runtime optimization. Actually I don't know of any C++ compiler that does this, and it is useful because it could be that depending on input, the dynamic type is 99.9% Foo or 99.9% Bar, and the fact that (some) JITs continue optimizing after runtime starts is what lets them optimize this run of the program, happening now on the user's machine. Profile-guided C++ compilers in my experience only optimize some standard development run of the program.Cairistiona
@Steve: there must be some threshold where run-time profiling is typically more expensive than the potential gains... my impression is that C++ is already on the "good" side of that divide, so non-academic compiler writers rarely bother with that kind of instrumentation. But the effort may also be limited by financing. Java and MS's C# rip-off are well and truly on the other side, plus they've got huge corporate money behind them. And in select places, manual instrumentation is possible in C++ source: you can't self-modify, so you have an extra branch, but you can go from out-of-line to inline.Hodgepodge
1

The issue is that while Java compiles to code that runs on a virtual machine, that same guarantee can't be made for C++. It's common to use C++ as a more organized replacement for C, and C has a 1:1 translation to assembly.

If you consider that 9 out of 10 microprocessors in the world are not in a personal computer or a smartphone, you'll see the issue when you further consider that a lot of those processors need this low-level access.

C++ was designed to avoid that hidden dereferencing if you didn't need it, thus keeping that 1:1 nature. Some of the first C++ code actually had an intermediate step of being translated to C before running through a C-to-assembly compiler.

Prominent answered 7/7, 2011 at 6:32 Comment(4)
C has a 1:1 translation to assembly? That would be a surprise. Which CPU has switch(foo)? Hell, which CPU has for? Most use a compare-and-branch instruction.Jemmie
maybe 1:1 is a bad way to put it... but there is typically a direct translation between C constructs and the resultant assembly code, to the point where this was a main design feature of C at the time it was made. C++ was designed to maintain this relationship whenever possible.Prominent
Whether it's exceptions, templates or virtual functions, none of the points where C++ differs most noticeably from C are at all related to CPU instructions. Not to mention the Standard Library. No, what you observe is really the consequence of a VM-less language. There's just less hidden code as a result.Jemmie
The point I was trying to make was that if you don't use those bits of C++ (templates, inheritance, etc.), much of the resultant assembly turns out to be very similar to C's; the only difference ends up being an implicit structure being passed into member functions, whereas most complicated C code just passes the struct around explicitly. For developers who use C because it is easy to translate into various forms of assembly, C++ is an easy transition because of its "if you don't use it, you don't pay for it" nature. This was intentional, to keep it compatible with C.Prominent
-5

Java method calls are far more efficient than C++'s due to runtime optimization.

What we need is to compile C++ into bytecode and run it on JVM.

Mollymollycoddle answered 7/7, 2011 at 7:15 Comment(8)
Lol... is this for real? If you look at different languages' performance, in most cases good C++ performs better than good Java equivalents.Kroll
The 2nd line is a joke. After all, the JVM is written in C++. However, for most apps, with lots of OO abstractions, Java can do crazy optimizations at runtime to remove the overheads. Cake and eat it.Mollymollycoddle
Java can't do any optimisation that a C++ compiler can't. There's no magic. In fact, because C++'s type system is stronger it should be easier to write cunning whole-program optimisation because the compiler has more knowledge of the semantics.Biquadrate
@spraff: Java can't do any optimization that self-modifying binaries from C++ can't. But JVMs do perform optimizations that no C++ compiler actually does, because JVMs can and do optimize based on profile data from this exact run of the program, whereas no C++ implementation that I know of does that, PGO works off a run that some developer did back at HQ. JVMs use this advantage to somewhat compensate for the C++ compiler's advantages. It'd be interesting to see if a best-of-both C++ implementation were possible, but it certainly would not be as simple as compiling C++ to Java bytecode.Cairistiona
And in fact C++ does already get a little bit of the same kind of benefit, since some CPUs use profile data from this exact run of the program to do e.g. branch prediction. But this doesn't come through the efforts of the C++ compiler.Cairistiona
Let's just be clear that there might be a compiler advantage or a toolchain advantage but it's not a language advantage. Profiling optimisers for C++ do exist but people don't bother with them and/or they're commercial and closed. The cynic in me says that if Java does this more willingly it's because it has to compensate for being inherently slower in the first place. And let's not overlook the fact that a fast execution with added optimisation delay might be slower than a naive execution!Biquadrate
@spraff: sure, it depends whether you're talking about properties of the language specification, or properties of actual implementations that exist. It's not as if C++ optimizer-writers think to themselves "ah, this is fast enough, let's get down the pub", there's a real difference between PGO based on a profile of a single run prior to compile-time, and the code mutation that modern JITs do using profile data from this exact run. And agreed, sometimes optimization is counter-productive. It takes time to do it, and there may be pathological cases that make the "optimized" code slower.Cairistiona
@Mollymollycoddle I think you hit a nerve with the C++ programmers here. I for one found your answer humorous, though it should have been a comment.Plexiglas
