Is the PIMPL idiom really used in practice?
I am reading the book "Exceptional C++" by Herb Sutter, and in that book I have learned about the PIMPL idiom. Basically, the idea is to create a structure for the private objects of a class and dynamically allocate them to decrease the compilation time (and also hide the private implementations in a better manner).

For example:

class X
{
private:
  C c;
  D d;
};

could be changed to:

class X
{
private:
  struct XImpl;
  XImpl* pImpl;
};

and, in the .cpp file, the definition:

struct X::XImpl
{
  C c;
  D d;
};

This seems pretty interesting, but I have never seen this kind of approach before, neither in the companies I have worked for, nor in the open-source projects whose source code I've seen. So, I am wondering whether this technique is really used in practice.

Should I use it everywhere, or with caution? And is this technique recommended for use in embedded systems (where performance is very important)?

Anana answered 23/1, 2012 at 13:43 Comment(22)
Is this essentially the same as a deciding that X is an (abstract) interface, and Ximpl is the implementation? struct XImpl : public X. That feels more natural to me. Is there some other issue I've missed?Roswell
@AaronMcDaid: It's similar, but has the advantages that (a) member functions don't have to be virtual, and (b) you don't need a factory, or the definition of the implementation class, to instantiate it.Sanburn
@AaronMcDaid The pimpl idiom avoids virtual function calls. It's also a bit more C++-ish (for some conception of C++-ish); you invoke constructors, rather than factory functions. I've used both, depending on what is in the existing code base---the pimpl idiom (originally called the Cheshire cat idiom, and predating Herb's description of it by at least 5 years) seems to have a longer history and be more widely used in C++, but otherwise, both work.Teddytedeschi
In C++, pimpl should be implemented with const unique_ptr<XImpl> rather than XImpl*.Photoreceptor
"never seen this kind of approach before, neither in the companies I have worked, nor in open source projects". Qt is hardly ever NOT using it.Maryjomaryl
Bjarne Stroustrup recommends the name _impl, not _pimpl.Pyemia
#2346663Lowboy
Yes, I did try it. The code in your most recent comment does not require the definition of XImpl or std::unique_ptr<XImpl>, so it will compile.Moreira
I did try it, to double-check. I am right. It doesn’t compile.Moreira
I don’t have a question. What question should I ask?Moreira
@KeithRussell Your question is "Do you need to define XImpl before declaring unique_ptr<XImpl> as a member of some class X?" You seem to think the answer is yes.Photoreceptor
I feel that this much back-and-forth is only diluting the comments section on this question. Instead of simply disagreeing with me, please replace the pointer in the OP’s example with a unique_ptr, then include X.h in a second Y.cpp file (not X.cpp, where the definition of XImpl lives). If your compiler lets you even get that far, try instantiating an X in that Y.cpp file.Moreira
@KeithRussell You are wrong. XImpl needs to only be defined in the implementation file; it can be simply declared as a class in the header file. Also, see this question: #9020872Photoreceptor
We are both wrong. That example contains a small-looking but critical difference: Unlike the OP’s code, GotW’s code has an explicit destructor for X, which prevents the destructor for unique_ptr<XImpl> from being defined inline in an implicit destructor for X. This is a great way to get around the problem, which definitely does occur in OP’s code if we replace the opaque pointer with a unique_ptr. (And I’m going to start using this trick — but I’ll comment it where I do, because it relies upon private details of STL.)Moreira
unique_ptr will generate a call to XImpl’s destructor regardless of whether we (explicitly) define a constructor for X. This is the point of using unique_ptr. The rub is just that XImpl’s destructor must be declared in any translation unit containing a definition (implicit or no) of X’s destructor, or else unique_ptr’s generation of that call will fail. For this reason, in order to preserve the PImpl pattern, X’s destructor must be explicit (and defined outside of the header).Moreira
Yes. What I was missing was that STL can define template classes “piecemeal”. As long as we hide all calls to most of unique_ptr’s constructors and to its destructor, we’re fine — and we can do this by suppressing the implicit definition of X’s constructor and destructor. In your original comment, you just mentioned using unique_ptr, but did not mention that OP’s code as written could not simply have its opaque pointer replaced with a unique_ptr — OP would also have to define X’s destructor.Moreira
@KeithRussell —which you have to do anyway whenever you're using the pimpl idiom.Photoreceptor
True. (Looking at it again, OP’s code lacks both a constructor and a destructor for his opaque-pointer version of X, even though it seems apparent he intends X to own pImpl in a RAII sense.)Moreira
Addendum: Implicit, generated assignment operator and copy-constructor must also be suppressed, as with the implicit default constructor and destructor.Moreira
@NeilG now I guess the recommended way is to use std::experimental::propagate_const<std::unique_ptr<impl>> pImplAmbitendency
@KeithRussell the generated assignment operator and copy-constructors will be deleted simply due to unique_ptr not having available assignment operator / copy constructor. And the move assignment operator and move constructor are not implicitly declared when there is a user declared destructor in the class. So I think the GotW example is, unsurprisingly, correct. But in general using a unique_ptr on an incomplete type seems like dangerous territory.Janellejanene
@AndyBorrell Don’t fear it! I’ve been doing it for years since this question, and I love it. Can’t do it in a lot of cases, but the toolchain will tell you when that is.Moreira
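A minimal sketch of the unique_ptr variant debated in the comments above. The names and member are hypothetical, and both "files" are shown in one translation unit so the example is self-contained; the essential point is that X's destructor is only declared in the header and defined where XImpl is complete:

```cpp
// X.h (sketch) -- unique_ptr-based pimpl, assuming C++14.
// The destructor is only *declared* here; defining it in the header
// would require a complete XImpl for unique_ptr's deleter.
#include <memory>

class X {
public:
    X();
    ~X();                        // defined out of line, where XImpl is complete
    X(X&&) noexcept;
    X& operator=(X&&) noexcept;
    int value() const;
private:
    struct XImpl;                // incomplete type in the header
    std::unique_ptr<XImpl> pImpl;
};

// X.cpp (sketch) -- only here does XImpl become a complete type.
struct X::XImpl {
    int c = 42;                  // stands in for the real private members
};

X::X() : pImpl(std::make_unique<XImpl>()) {}
X::~X() = default;               // legal here: XImpl is complete at this point
X::X(X&&) noexcept = default;
X& X::operator=(X&&) noexcept = default;
int X::value() const { return pImpl->c; }
```

Moving the `= default` definitions into the header reproduces the compile error discussed above, since the implicit deleter call would then see an incomplete XImpl.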
So, I am wondering if this technique is really used in practice? Should I use it everywhere, or with caution?

Of course it is used. I use it in my project, in almost every class.


Reasons for using the PIMPL idiom:

Binary compatibility

When you're developing a library, you can add/modify fields in XImpl without breaking binary compatibility with your clients (which would otherwise mean crashes!). Since the binary layout of class X doesn't change when you add new fields to the XImpl class, it is safe to add new functionality to the library in minor version updates.

Of course, you can also add new public/private non-virtual methods to X/XImpl without breaking the binary compatibility, but that's on par with the standard header/implementation technique.

Data hiding

If you're developing a library, especially a proprietary one, it might be desirable not to disclose what other libraries / implementation techniques were used to implement the public interface of your library, either because of intellectual-property issues, or because you believe that users might be tempted to make dangerous assumptions about the implementation, or just break encapsulation by using terrible casting tricks. PIMPL solves/mitigates that.

Compilation time

Compilation time is decreased, since only the source (implementation) file of X needs to be rebuilt when you add/remove fields and/or methods to the XImpl class (which maps to adding private fields/methods in the standard technique). In practice, it's a common operation.

With the standard header/implementation technique (without PIMPL), when you add a new field to X, every client that ever allocates X (either on the stack or on the heap) needs to be recompiled, because it must adjust the size of the allocation. Well, every client that doesn't ever allocate X also needs to be recompiled, but it's just overhead (the resulting code on the client side will be the same).

What is more, with the standard header/implementation separation, XClient1.cpp needs to be recompiled even when a private method X::foo() is added to X and X.h changes, even though XClient1.cpp can't possibly call this method for encapsulation reasons! Like above, it's pure overhead, and it's related to how real-life C++ build systems work.

Of course, recompilation is not needed when you just modify the implementation of the methods (because you don't touch the header), but that's on par with the standard header/implementation technique.


Is this technique recommended to be used in embedded systems (where the performance is very important)?

That depends on how powerful your target is. However, the only answer to this question is: measure and evaluate what you gain and lose. Also, take into consideration that if you're not publishing a library meant to be used in embedded systems by your clients, only the compilation-time advantage applies!

Homeless answered 23/1, 2012 at 13:56 Comment(5)
also, binary compatibilityKolnick
In the Qt library this method is also used in smart-pointer situations. So QString keeps its contents in an immutable class internally. When the public class is "copied", the private member's pointer is copied instead of the entire private class. These private classes then also use smart pointers, so you basically get garbage collection with most of the classes, in addition to greatly improved performance due to pointer copying instead of full class copying.Whitethorn
Even more, with the pimpl idiom Qt can maintain both forward and back binary compatibility within a single major version (in most cases). IMO this is by far the most significant reason to use it.Bierce
Also less chance of namespace collision in the headers, since most of the includes needed for data members can be moved into the impl definitionSaturn
It is also useful for implementing platform-specific code, as you can retain same API.Dysteleology
It seems that a lot of libraries out there use it to keep their APIs stable, at least for some versions.

But as for all things, you should never use anything everywhere without caution. Always think before using it. Evaluate what advantages it gives you, and if they are worth the price you pay.

The advantages it may give you are:

  • helps in keeping binary compatibility of shared libraries
  • hiding certain internal details
  • decreasing recompilation cycles

Those may or may not be real advantages to you. For me, I don't care about a few minutes of recompilation time. End users usually don't either, as they compile it only once, from scratch.

Possible disadvantages are (also here, depending on the implementation and whether they are real disadvantages for you):

  • increased memory usage due to more allocations than with the naïve variant
  • increased maintenance effort (you have to write at least the forwarding functions)
  • performance loss (the compiler may not be able to inline things as it can with a naïve implementation of your class)

So carefully weigh everything, and evaluate it for yourself. For me, it almost always turns out that using the PIMPL idiom is not worth the effort. There is only one case where I personally use it (or at least something similar):

My C++ wrapper for the Linux stat call. Here the struct from the C header may be different, depending on what #defines are set. And since my wrapper header can't control all of them, I only #include <sys/stat.h> in my .cxx file and avoid these problems.
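A sketch of what such a wrapper might look like (hypothetical names, not the author's actual code; both "files" are shown together so the example is self-contained, and it is POSIX-only). The point is that clients never see <sys/stat.h>, so its platform-dependent macros and struct layout stay out of the public header:

```cpp
// filestat.h (sketch) -- no <sys/stat.h> here; clients are shielded
// from its #define-dependent struct layout.
#include <cstdint>
#include <memory>
#include <string>

class FileStat {
public:
    explicit FileStat(const std::string& path);
    ~FileStat();
    std::uint64_t size() const;
private:
    struct Impl;                 // wraps the raw struct stat
    std::unique_ptr<Impl> impl;
};

// filestat.cxx (sketch) -- only this file includes the system header.
#include <sys/stat.h>

struct FileStat::Impl {
    struct stat st {};           // the platform-dependent part, hidden
};

FileStat::FileStat(const std::string& path)
    : impl(std::make_unique<Impl>()) {
    ::stat(path.c_str(), &impl->st);   // error handling omitted for brevity
}
FileStat::~FileStat() = default;       // Impl is complete here
std::uint64_t FileStat::size() const {
    return static_cast<std::uint64_t>(impl->st.st_size);
}
```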

Chevrotain answered 23/1, 2012 at 14:4 Comment(2)
It should almost always be used for system interfaces, to make the interface code system independent. My File class (which exposes much of the information stat would return under Unix) uses the same interface under both Windows and Unix, for example.Teddytedeschi
@JamesKanze: Even there, I personally would first sit for a moment and think about whether a few #ifdefs might not be sufficient to keep the wrapper as thin as possible. But everyone has different goals; the important thing is to take the time to think about it instead of blindly following something.Chevrotain
I agree with all the others about the benefits, but let me point out a limitation: PIMPL doesn't work well with templates.

The reason is that template instantiation requires the full definition to be available where the instantiation takes place. (And that's the main reason you don't see template methods defined in .cpp files.)

You can still refer to templatised subclasses, but since you have to include them all, every advantage of "implementation decoupling" at compile time (avoiding including all the platform-specific code everywhere, shortening compilation) is lost.

It is a good paradigm for classic OOP (inheritance based), but not for generic programming (specialization based).

Hakeem answered 23/1, 2012 at 15:21 Comment(5)
You have to be more precise: there's absolutely no problem when using PIMPL classes as template type arguments. Only if the implementation class itself needs to be parametrized on the outer class's template arguments can it no longer be hidden from the interface header, even if it's still a private class. If you can erase the template argument, you can certainly still do "proper" PIMPL. With type erasure you can also do the PIMPL in a non-template base class, and then have the template class derive from it.Polymath
You can have a lot of template code in .cpp files with explicit template instantiation ;-)Mensch
@Mensch When you write a template library you don't know what instantiation your users can do.Hakeem
@EmilioGaravaglia, but often you know it cannot work with every type. If your code works only with a limited and known list of types, it's a good way not to expose the code.Mensch
I'm working on a sudoku library that does this. The template parameter is a non-type template parameter specifying the size of the sudoku grid to operate on. The entire library is basically duplicated via explicitly instantiated templates for each compiled size (all compiled sizes need to be specified up front). It's a very niche use-case, but I thought it would be worth mentioning as an example of when this might actually be done.Patsypatt
Other people have already provided the technical up/downsides, but I think the following is worth noting:

First and foremost, don't be dogmatic. If PIMPL works for your situation, use it - don't use it just because "it's better OO since it really hides implementation", etc. Quoting the C++ FAQ:

encapsulation is for code, not people (source)

Just to give you an example of open source software where it is used and why: OpenThreads, the threading library used by the OpenSceneGraph. The main idea is to remove from the header (e.g., <Thread.h>) all platform-specific code, because internal state variables (e.g., thread handles) differ from platform to platform. This way one can compile code against your library without any knowledge of the other platforms' idiosyncrasies, because everything is hidden.

Aime answered 23/1, 2012 at 14:8 Comment(0)
I would mainly consider PIMPL for classes exposed to be used as an API by other modules. This has many benefits: changes made in the PIMPL implementation do not force recompilation of the rest of the project. Also, for API classes it promotes binary compatibility (changes in a module's implementation do not affect clients of those modules; they don't have to be recompiled, as the new implementation has the same binary interface, the one exposed by the PIMPL).

As for using PIMPL for every class, I would exercise caution, because all those benefits come at a cost: an extra level of indirection is required to access the implementation methods.

Ithyphallic answered 23/1, 2012 at 14:4 Comment(3)
"an extra level of indirection is required in order to access the implementation methods." It is?Descry
@Descry yes, it is. pimpl is slower if the methods are low level. never use it for stuff that lives in a tight loop, for example.Addlebrained
@Descry I would say in the general case an extra level is required. If inlining is performed, then no. But inlining would not be an option for code compiled in a different DLL.Ithyphallic
I think this is one of the most fundamental tools for decoupling.

I was using PIMPL (and many other idioms from Exceptional C++) on an embedded project (a set-top box).

The particular purpose of this idiom in our project was to hide the types the XImpl class uses. Specifically, we used it to hide details of the implementations for different hardware, where different headers would be pulled in. We had different XImpl classes for one platform and for the other. The layout of class X stayed the same regardless of the platform.

Northernmost answered 23/1, 2012 at 13:59 Comment(1)
This is the main reason to use PIMPL. It resolves the "N*M" problem. I don't know why other answers don't list this as an advantage. Presumably the authors didn't know about it. But this is the main purpose of PIMPL. Other aspects are only relevant to writing libraries where binary compatibility is a requirement. I don't personally see "it speeds up compile time" as being a legitimate point, since it also "slows down development time and gives anyone trying to maintain your code in future a headache".Mahratta
I used to use this technique a lot in the past but then found myself moving away from it.

Of course it is a good idea to hide the implementation detail away from the users of your class. However, you can also do that by having users of the class use an abstract interface, with the implementation detail in the concrete class.

The advantages of pImpl are:

  1. Assuming there is just one implementation of this interface, it is clearer not to use an abstract class / concrete implementation split.

  2. If you have a suite of classes (a module) such that several classes access the same "impl" but users of the module will only use the "exposed" classes.

  3. No v-table, if that is assumed to be a bad thing.

The disadvantages I found of pImpl (where an abstract interface works better):

  1. Whilst you may have only one "production" implementation, with an abstract interface you can also create a "mock" implementation for use in unit testing.

  2. (The biggest issue.) Before the days of unique_ptr and move semantics, you had restricted choices of how to store the pImpl. With a raw pointer, your class was non-copyable unless you did extra work. An old auto_ptr wouldn't work with a forward-declared class (not on all compilers, anyway). So people started using shared_ptr, which was nice in making your class copyable, but of course both copies shared the same underlying pImpl, which you might not expect (modify one and both are modified). So the solution was often to use a raw pointer for the inner one, make the class non-copyable, and return a shared_ptr to it instead. So two calls to new. (Actually three, given that the old shared_ptr cost you a second allocation.)

  3. Technically not really const-correct, as constness isn't propagated through to the member pointer.

In general I have therefore moved away in the years from pImpl and into abstract interface usage instead (and factory methods to create instances).
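The alternative described above might look like this (a hypothetical sketch, not the author's actual code): clients include only a pure-virtual interface plus a factory declaration, while the concrete class, and any test mock, stays in the implementation file:

```cpp
// widget.h (sketch) -- all clients see is the abstract interface
// and a factory; the concrete type is never exposed.
#include <memory>

class Widget {
public:
    virtual ~Widget() = default;
    virtual int compute() const = 0;
};

std::unique_ptr<Widget> makeWidget();   // factory declared in the header

// widget.cpp (sketch) -- the concrete type is invisible to clients;
// a "mock" for unit tests could live here (or in a test file) too.
namespace {
class RealWidget : public Widget {
public:
    int compute() const override { return 7; }   // placeholder behaviour
};
}

std::unique_ptr<Widget> makeWidget() {
    return std::make_unique<RealWidget>();
}
```

Compared with pImpl, this costs a v-table and virtual dispatch, but needs no forwarding functions and makes substituting a mock trivial, which matches the trade-off the answer describes.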

Sebaceous answered 10/8, 2015 at 16:16 Comment(0)
Here is an actual scenario I encountered, where this idiom helped a great deal. I recently decided to support DirectX 11, as well as my existing DirectX 9 support, in a game engine.

The engine already wrapped most DX features, so none of the DX interfaces were used directly; they were just declared in the headers as private members. The engine uses DLL files as extensions, adding keyboard, mouse, joystick, and scripting support, as well as many other extensions. While most of those DLLs did not use DX directly, they required knowledge of and linkage to DX simply because they pulled in headers that exposed DX. In adding DX 11, this complexity would have increased dramatically, and unnecessarily. Moving the DX members into a PIMPL, defined only in the source, eliminated this imposition.

On top of this reduction in library dependencies, my exposed interfaces became cleaner as I moved private member functions into the PIMPL, exposing only front-facing interfaces.

Tenebrific answered 3/11, 2016 at 19:46 Comment(0)
As many others have said, the PIMPL idiom makes it possible to achieve complete information hiding and compilation independence, unfortunately at the cost of a performance loss (an additional pointer indirection) and additional memory (the member pointer itself). The additional cost can be critical in embedded software development, in particular in scenarios where memory must be economized as much as possible. Using C++ abstract classes as interfaces would bring the same benefits at the same cost. This actually shows a big deficiency of C++: without resorting to C-like interfaces (global functions taking an opaque pointer as a parameter), it is not possible to have true information hiding and compilation independence without additional resource drawbacks. This is mainly because the declaration of a class, which must be included by its users, exports not only the interface of the class (the public methods) needed by the users, but also its internals (the private members), which the users do not need.

Gentility answered 4/9, 2015 at 11:13 Comment(0)
It is used in practice in a lot of projects. Its usefulness depends heavily on the kind of project. One of the more prominent projects using it is Qt, where the basic idea is to hide implementation or platform-specific code from the user (other developers using Qt).

This is a noble idea, but there is a real drawback: debugging. As long as the code hidden in the private implementations is of premium quality, all is well; but if there are bugs in there, the user/developer has a problem, because it is just a dumb pointer to a hidden implementation, even if he/she has the implementation's source code.

So as in nearly all design decisions there are pros and cons.

Verona answered 23/1, 2012 at 14:8 Comment(2)
it's dumb but it's typed... why can't you follow the code in the debugger?Rotative
Generally speaking, to debug into Qt code you need to build Qt yourself. Once you do, there's no problem stepping into PIMPL methods, and inspecting contents of the PIMPL data.Polymath
One benefit I can see is that it allows the programmer to implement certain operations in a fairly fast manner:

X( X&& move_semantics_are_cool ) : pImpl(nullptr) {  // move constructor: steal the impl
    this->swap(move_semantics_are_cool);
}
X& swap( X& rhs ) {
    std::swap( pImpl, rhs.pImpl );  // O(1): just exchanges two pointers
    return *this;
}
X& operator=( X&& move_semantics_are_cool ) {         // move assignment via swap
    return this->swap(move_semantics_are_cool);
}
X& operator=( const X& rhs ) {                        // copy-and-swap
    X temporary_copy(rhs);
    return this->swap(temporary_copy);
}

PS: I hope I'm not misunderstanding move semantics.

Jewish answered 21/9, 2016 at 1:45 Comment(0)
I thought I would add an answer because although some authors hinted at this, I didn't think the point was made clear enough.

The primary purpose of PIMPL is to solve the N*M problem. This problem may go by other names in other literature; a brief summary follows.

You have some kind of inheritance hierarchy where, if you were to add a new subclass, you would have to implement N or M new methods.

This is only an approximate hand-wavey explanation, because I only recently became aware of this and so I am by my own admission not yet an expert on this.

Discussion of existing points made

However, I came across this question, and similar questions, a number of years ago, and I was confused by the typical answers given. (Presumably I first learned about PIMPL some years ago and found this question and others similar to it.)

  1. Enables binary compatibility (when writing libraries)
  2. Reduces compile time
  3. Hides data

Taking into account the above "advantages", none of them are a particularly compelling reason to use PIMPL, in my opinion. Hence I have never used it, and my program designs suffered as a consequence because I discarded the utility of PIMPL and what it can really be used to accomplish.

Allow me to comment on each to explain:

1.

Binary compatibility is only of relevance when writing libraries. If you are compiling a final executable program, then this is of no relevance, unless you are using someone else's (binary) libraries. (In other words, you do not have the original source code.)

This means this advantage is of limited scope and utility. It is only of interest to people who write libraries which are shipped in proprietary form.

2.

I don't personally consider this to be of much relevance in the modern day, when it is rare to be working on projects where compile time is of critical importance. Maybe it matters to the developers of Google Chrome. The associated disadvantages probably increase development time significantly, and likely more than offset this advantage. I might be wrong about this, but I find it unlikely, especially given the speed of modern compilers and computers.

3.

I don't immediately see the advantage that PIMPL brings here. The same result can be accomplished by shipping a header file and a binary object file. Without a concrete example in front of me, it is difficult to see why PIMPL is relevant here. The relevant "thing" is shipping binary object files rather than original source code.

What PIMPL actually does:

You will have to forgive my slightly hand-wavey answer. While I am not a complete expert in this particular area of software design, I can at least tell you something about it. This information is mostly repeated from Design Patterns. The authors call it the "Bridge" pattern, also known as Handle/Body.

In this book, the example of writing a Window manager is given. The key point here is that a window manager can implement different types of windows as well as different types of platform.

For example, one may have a

  • Window
  • Icon window
  • Fullscreen window with 3d acceleration
  • Some other fancy window
  • These are types of windows which can be rendered

as well as

  • Microsoft Windows implementation
  • OS X platform implementation
  • Linux X Window Manager
  • Linux Wayland
  • These are different types of rendering engines, with different OS calls and possibly fundamentally different functionality as well

The list above is analogous to one given in another answer, where another user described writing software which should work with different kinds of hardware for something like a DVD player. (I forget exactly what the example was.)

I give slightly different examples here compared to what is written in the Design Patterns book.

The point being that there are two separate kinds of things which should be implemented using an inheritance hierarchy; however, a single inheritance hierarchy does not suffice here. (The N*M problem: the complexity scales like the product of the number of items in each bullet list, which is not feasible for a developer to implement.)

Hence, using PIMPL, one separates out the types of windows and provides a pointer to an instance of an implementation class.

So PIMPL:

  • Solves the N*M problem
  • Decouples two fundamentally different things which are being modelled using inheritance, such that there are two or more hierarchies rather than one monolith
  • Permits runtime exchange of the exact implementation behaviour (by changing a pointer). This may be advantageous in some situations, whereas a single monolith enforces static (compile-time) behaviour selection rather than runtime behaviour selection

There may be other ways to implement this, for example with multiple inheritance, but this is usually a more complicated and difficult approach, at least in my experience.
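The Handle/Body shape described above can be sketched as follows. The window kinds and platform back ends here are hypothetical stand-ins (the strings exist only so the routing is observable); the point is that N window kinds times M platforms needs only N + M classes, because each side varies independently:

```cpp
// Bridge sketch: the "body" hierarchy (platform back ends) is held
// by pointer inside the "handle" hierarchy (window kinds).
#include <memory>
#include <string>
#include <utility>

class WindowImpl {                      // implementor: one subclass per platform
public:
    virtual ~WindowImpl() = default;
    virtual std::string drawRect() const = 0;
};

class X11Impl : public WindowImpl {
public:
    std::string drawRect() const override { return "x11:rect"; }
};

class WaylandImpl : public WindowImpl {
public:
    std::string drawRect() const override { return "wayland:rect"; }
};

class Window {                          // abstraction: one subclass per window kind
public:
    explicit Window(std::unique_ptr<WindowImpl> impl)
        : impl_(std::move(impl)) {}     // the platform is chosen at runtime
    virtual ~Window() = default;
    virtual std::string draw() const { return impl_->drawRect(); }
protected:
    std::unique_ptr<WindowImpl> impl_;  // the bridge pointer
};

class IconWindow : public Window {      // a window kind; works on any platform
public:
    using Window::Window;
    std::string draw() const override { return "icon+" + impl_->drawRect(); }
};
```

Any window kind can be paired with any platform by passing a different impl at construction, which is the runtime-exchange property listed above.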

Mahratta answered 28/12, 2021 at 18:14 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.