interface paradigm performance (dynamic binding vs. generic programming)
While at their core dynamic binding and templates are fundamentally different things, they can be used to implement the same functionality.

Code examples (for reference only)

A) dynamic binding

#include <iostream>

namespace DB {
  // interface
  class CustomCode {
    public:
      virtual void operator()(char) const = 0;
  };
  class Lib {
    public:
      void feature(CustomCode const& c) {
        c('d');
      }
  };

  // user code
  class MyCode1 : public CustomCode {
    public:
      void operator()(char i) const {
        std::cout << "1: " << i << std::endl;
      }
  };
  class MyCode2 : public CustomCode {
    public:
      void operator()(char i) const {
        std::cout << "2: " << i << std::endl;
      }
  };

  void use() {
    Lib lib;
    lib.feature(MyCode1());
    lib.feature(MyCode2());
  }
}

B) generic programming

#include <iostream>

namespace GP {
  //interface
  template <typename CustomCode> class Lib {
    public:
      void feature(CustomCode const& c) {
        c('g');
      }
  };

  // user code
  class MyCode1 {
    public:
      void operator()(char i) const {
        std::cout << "1: " << i << std::endl;
      }
  };
  class MyCode2 {
    public:
      void operator()(char i) const {
        std::cout << "2: " << i << std::endl;
      }
  };

  void use() {
    Lib<MyCode1> lib;
    lib.feature(MyCode1());
    //lib.feature(MyCode2());  // <-- ill-formed: Lib<MyCode1>::feature expects a MyCode1
  }
}

Question

Some thoughts

While these paradigms are not identical and each has its advantages and disadvantages (A is a bit more powerful, as MyCode2 shows, while B is more flexible for the user), both allow the same functionality to be implemented, subject to the limitations hinted at above.

Anyway, in theory (TM), A is a bit slower at runtime because of the indirection of the virtual function call, while B offers some great optimisation opportunities, since the methods can be inlined (and of course you don't have the indirection).
However, I often feel that A is a bit more self-documenting, because you have a clear interface you have to implement (which usually consists of more than one method), while B is a bit more anarchistic (which is also what makes it flexible).

Core questions

  1. Are there any general results / comparative studies of these paradigms?
  2. Is the speed-up significant?
  3. What about compilation time?
  4. What are the design implications of either for interfaces in larger systems (I mainly used A for my inter-module interfaces and I haven't done really really big projects so far)?

Edit

Note: Saying "dynamic binding is better because it is more powerful" is not at all an answer, because the precondition is that you have a case where both approaches are applicable (otherwise there is no freedom to choose -- at least not reasonably).

Emmettemmey answered 18/9/2011 at 22:32 (3 comments)
That's not metaprogramming, that's generic programming. Generic programming does make use of metaprogramming for some facilities. – Nealon
@Luc Danton: What's the difference? I've always used both expressions interchangeably (apparently wrongly). – Emmettemmey
Surprisingly it's hard to find a relevant discussion on SO about the differences. Here's one but it's not that great. Very succinctly, std::vector is generic because it can work (at runtime) for any type that satisfies its requirements, while std::is_pointer is a metafunction that evaluates whether its argument type is a pointer or not. A good hint that your code is not MP is that it doesn't do anything useful at compile-time. – Nealon

Are there any general results / comparative studies of these paradigms?

from what i have seen, comparisons and measurements can be found in various articles and publications. your favorite c++ books should provide several demonstrations; if you have no such resource, you may want to read Modern C++ Design: Generic Programming and Design Patterns Applied by A. Alexandrescu. no specific resource comes to mind that directly answers your question, though, and the results will vary by implementation and compiler - even compiler settings can greatly affect the outcome of such a test. (i will respond to each of your questions anyway, although this paragraph does not qualify as an answer to this specific one.)

Is the speed-up significant?

short answer: it depends.

in your example, the compiler could in fact use static dispatch or even inline the virtual function calls (enough information is visible to the compiler). i am now going to move the responses away from a trivial example (specifically, the OP's) and towards larger, more complex programs.
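
as a rough sketch of that point (mine, not from the question; assuming a c++11 compiler, and whether the call is actually devirtualized is up to the optimizer): once the dynamic type is visible at the call site, a compiler may replace the virtual dispatch with a direct, inlinable call, and final makes that easier.

#include <iostream>

class CustomCode {
  public:
    virtual void operator()(char) const = 0;
    virtual ~CustomCode() {}
};

// 'final' guarantees no further overrides exist, which helps the
// optimizer devirtualize calls once the dynamic type is known
class MyCode1 final : public CustomCode {
  public:
    void operator()(char i) const override { std::cout << "1: " << i << "\n"; }
};

void use() {
  MyCode1 c;                // the dynamic type is statically known here
  CustomCode const& r = c;
  r('d');                   // an optimizer may emit a direct, inlined call
}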

expanding on 'it depends': yes, the speed-up can range from unmeasurable to huge. you have to realize (and likely already do) that an incredible amount of information can be provided to the compiler at compilation time via generics. it can then use this information to optimize your program much more aggressively. one good example of this is std::array vs std::vector. the vector adds flexibility at runtime, but the cost can be quite significant: the vector needs extra machinery for resizing, and the dynamic allocations it requires can be costly. there are other differences: the backing allocation of the array will not change (++optimization), the element count is fixed (++optimization), and again - in many cases there is no need to call through new.

you may now be thinking this example has deviated significantly from the original question. in many ways, it's really not so different: the compiler knows more and more about your program as its complexity expands. this information can remove several portions of your program (dead code), and with std::array as an example, the information the type provides is enough for a compiler to easily say "oh, i see that this array's size is seven elements, i will unroll the loop accordingly" - you get fewer instructions and eliminate branch mispredictions. there's a lot more to it, but in the array/vector case i have seen the executable size of optimized programs shrink to 20% of the original when converting from vector to an array-like interface. as well, the code can perform several times faster, and some expressions can in fact be calculated entirely at compilation.
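
to make the array/vector point concrete, here is a minimal sketch (mine, not from any benchmark cited in this answer) of why the fixed size helps the optimizer:

#include <array>
#include <numeric>
#include <vector>

// the element count is part of the type: the compiler sees exactly 7
// elements, can unroll the loop, keep the data on the stack, and never
// calls operator new
int sum_fixed(std::array<int, 7> const& a) {
  return std::accumulate(a.begin(), a.end(), 0);
}

// the element count is a runtime value: the loop bound is unknown at
// compile time and the elements live in a dynamically allocated buffer
int sum_dynamic(std::vector<int> const& v) {
  return std::accumulate(v.begin(), v.end(), 0);
}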

dynamic dispatch still has its virtues, and using it correctly can also improve your program's speed - what you really need to learn comes down to deciding when to favor one over the other. similar to how a huge function with many variables cannot be optimized very effectively (the result of all that template expansion in a real program), a virtual function call can in fact be the faster, cleaner approach in many situations. as such, they are two separate features; you will need some practice to determine which is right where (and many programmers don't take the time to learn this well enough).

in conclusion, they should be regarded as separate features, applicable to different scenarios. these should (imho) have much less practical overlap than they actually do in the real world.

What about compilation time?

with templates, the compilation and link times during development can be quite high. each time a header/template changes, you need to recompile all of its dependents -- this is often a significant argument in favor of dynamic dispatch. you can of course reduce this if you plan ahead and structure your build appropriately - understanding how is a much more difficult subject to master with templates. with templates, you not only increase the frequency of large builds, you often increase the time and complexity of large builds. (more notes follow below)
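
a hedged sketch of the usual way dynamic dispatch keeps rebuilds small (file names and types are illustrative, not from the question): clients only include the abstract header, so the implementation file can change freely without triggering client recompiles.

// widget.h -- the only header clients include; it rarely changes
class Widget {
  public:
    virtual void draw() const = 0;
    virtual ~Widget() {}
};
Widget* make_widget();  // factory keeps the concrete type out of the header

// widget.cpp -- edit and recompile this without touching any client
// #include "widget.h"
#include <iostream>
class WidgetImpl : public Widget {
  public:
    void draw() const override { std::cout << "drawing\n"; }
};
Widget* make_widget() { return new WidgetImpl; }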

What are the design implications of either for interfaces in larger systems (I mainly used A for my inter-module interfaces and I haven't done really really big projects so far)?

it really depends on your program's expectations. i write virtual less every year (as do many others). among other approaches, templates are becoming more and more common. honestly, i don't understand how B is 'anarchistic'. to me, A is a bit anachronistic, as there are plenty of suitable alternatives. it's ultimately a design choice, and architecting large systems well can take a lot of consideration. a good system will use a healthy combination of the language's features. history proves that no feature in this discussion is necessary to write a nontrivial program, but all of them were added because somebody saw better alternatives for some specific uses. you should also expect lambdas to replace virtuals in more than 50% of their current uses in some (not all) teams/codebases.
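
as an illustration of lambdas replacing virtuals (my sketch, reusing the shape of the question's Lib; std::function is one c++11 way to accept any callable, though note it performs its own type-erased dispatch internally):

#include <functional>
#include <iostream>

class Lib {
  public:
    // accepts any callable with a matching signature: lambda, functor,
    // or free function -- no inheritance from a library interface needed
    void feature(std::function<void(char)> const& c) { c('l'); }
};

void use() {
  Lib lib;
  lib.feature([](char i) { std::cout << "1: " << i << "\n"; });
  lib.feature([](char i) { std::cout << "2: " << i << "\n"; });
}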

Generalizations:

  • templates can be significantly faster to execute, if used correctly.
  • either can produce larger executables. if used correctly and executable size is important, then the writer will use multiple approaches to reduce executable size while providing nice usable interfaces.
  • templates can grow very very complex. learning to crawl through and interpret error messages takes time.
  • templates push several errors into the compilation domain. personally, i favor a compilation error over a runtime error.
  • reducing compilation times via virtuals is often straightforward (virtuals belong in the .cpp). if your program is huge, then huge template systems that change often can quickly send your rebuild times and counts through the roof because there will be a lot of intermodule visibility and dependency.
  • deferred and/or selective instantiation with fewer compiled files can be used to reduce compile times (see the sketch after this list).
  • in larger systems, you'll have to be much more considerate about forcing nontrivial recompiles on your team/client. using virtuals is one approach to minimizing this; a higher percentage of your methods will then be defined in the cpp. the alternative, of course, is that you can hide more of the implementation or provide the client more expressive ways to use your interfaces.
  • in larger systems, templates, lambdas/functors, etc. can actually be used to significantly reduce coupling and dependencies.
  • virtuals increase dependency, often become difficult to maintain, bloat interfaces, and turn into structurally awkward beasts. template-centric libraries tend to invert those tendencies.
  • all approaches can be used for the wrong reasons.
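
regarding the 'deferred and/or selective instantiation' bullet above, a minimal sketch using c++11 extern templates (the file split and names are illustrative; MyCode1 stands in for the functor from the question):

// lib.h
template <typename CustomCode> class Lib {
  public:
    void feature(CustomCode const& c) { c('g'); }
};
class MyCode1 { public: void operator()(char) const {} };

// suppress implicit instantiation in every translation unit that
// includes this header; the specialization is compiled exactly once
extern template class Lib<MyCode1>;

// lib.cpp -- the single place where Lib<MyCode1> is actually compiled
// #include "lib.h"
template class Lib<MyCode1>;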

Bottom line: a large, well-designed modern system will use many paradigms effectively and simultaneously. if you use virtuals most of the time at present, you are (imo) doing it wrong -- especially if that is still your approach once you've had time to absorb c++11. if speed, performance, and/or parallelism are also significant concerns, then templates and lambdas deserve to be your closer friends.

Derive answered 19/9/2011 at 8:00 (1 comment)
This is a really thorough discussion. Thanks! – Emmettemmey

Which is better? It depends. You've focused on the overlap; it is better to focus on where the approaches diverge. You've also missed the cases where you need to use both approaches simultaneously.

The biggest advantage of templates is that they offer the ability to cut down, sometimes immensely, on cookie-cutter code. Another advantage of templates is metaprogramming. There are some truly bizarre things you can do thanks to SFINAE.
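
A hedged sketch of one such SFINAE trick, using std::enable_if from C++11 (the function names are made up for illustration): overloads whose condition fails simply drop out of overload resolution instead of producing an error.

#include <iostream>
#include <type_traits>

// chosen only when T is an integral type; otherwise this overload is
// silently removed from the overload set (SFINAE)
template <typename T>
typename std::enable_if<std::is_integral<T>::value>::type
describe(T) { std::cout << "integral\n"; }

template <typename T>
typename std::enable_if<std::is_floating_point<T>::value>::type
describe(T) { std::cout << "floating point\n"; }

int main() {
  describe(42);    // prints "integral"
  describe(3.14);  // prints "floating point"
}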

One disadvantage of templates is that the syntax is a bit clunky. There is no way around this. It is what it is. Another disadvantage of templates is that each instantiation is a different class, completely unrelated to the other classes instantiated from the same template class. There is a way around this: Combine both approaches. Make your template class derive from some non-template base class. Of course, now you have lost some of the run time advantages.
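
A sketch of that combination (the names are illustrative; this is essentially the type-erasure pattern std::function uses internally). Note that the virtual hop reappears, which is the run-time cost just mentioned.

#include <iostream>

// the shared, non-template base restores a common type
class CustomCodeBase {
  public:
    virtual void operator()(char) const = 0;
    virtual ~CustomCodeBase() {}
};

// the template derives from the base: every instantiation, whatever T is,
// can now be stored and passed around as a CustomCodeBase
template <typename T>
class CustomCodeWrapper : public CustomCodeBase {
  public:
    explicit CustomCodeWrapper(T t) : t_(t) {}
    void operator()(char c) const override { t_(c); }  // the virtual hop
  private:
    T t_;
};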

The biggest advantage of polymorphism is that it is dynamic. This can be a huge, huge win. Don't discount it. There is a performance penalty for such polymorphism, but you are going to pay that penalty one way or another if you want a collection of objects that obey a common interface but whose implementations of that interface differ.
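
A minimal sketch of that win (the types are illustrative): one container holds objects of different dynamic types behind a common interface and processes them in a single loop, which templates alone cannot express.

#include <memory>
#include <vector>

class Shape {
  public:
    virtual double area() const = 0;
    virtual ~Shape() {}
};
class Circle : public Shape {
  public:
    explicit Circle(double r) : r_(r) {}
    double area() const override { return 3.14159 * r_ * r_; }
  private:
    double r_;
};
class Square : public Shape {
  public:
    explicit Square(double s) : s_(s) {}
    double area() const override { return s_ * s_; }
  private:
    double s_;
};

double total_area(std::vector<std::unique_ptr<Shape>> const& shapes) {
  double sum = 0;
  for (auto const& s : shapes)
    sum += s->area();  // dynamic dispatch picks the right implementation
  return sum;
}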

Gerdy answered 18/9/2011 at 23:30 (4 comments)
I'd say the biggest advantage of polymorphism is that it is dynamic, rather than static the way generic programming is. – Carven
Point taken, answer edited. "The biggest advantage of polymorphism is polymorphism itself" is kinda lame. – Gerdy
I very carefully avoided asking "Which is better?" for a reason. Your discussion is sound, but I was also looking for some factual data. My motivation: given a situation where both approaches are applicable (in a pure fashion, i.e. without using the other one), is it reasonable to choose the one that better suits my laziness, or would this have a general impact on performance? (Of course, I would have to run a benchmark in concrete situations where performance is really vital.) – Emmettemmey
In general, performance should be your last concern. It collides with almost every other quality metric. Strive to make your code understandable, maintainable, testable, usable, extensible, thisable, and thatable. Then worry about performance. That said, when performance does matter, it really does matter. Every other metric has to take a back seat. But only for that tiny, tiny percent of your code that actually impacts performance. – Gerdy
