Heap-free pimpl. Incorrect or superstition?
Asked Answered
M

1

16

Edit: This question dates from before C++17. These days std::launder or equivalent should be added to the line noise. I don't have time to update the code to match right now.

I am aspiring to separate interface from implementation. This is primarily to protect code using a library from changes in the implementation of said library, though reduced compilation times are certainly welcome.

The standard solution to this is the pointer to implementation idiom, most likely to be implemented by using a unique_ptr and carefully defining the class destructor out of line, with the implementation.

Inevitably this raises concerns about heap allocation. I am familiar with "make it work, then make it fast", "profile then optimise" and such wisdom. There are also articles online, e.g. gotw, which declare the obvious workaround to be brittle and non-portable. I have a library which currently contains no heap allocations whatsoever - and I'd like to keep it that way - so let's have some code anyway.

#ifndef PIMPL_HPP
#define PIMPL_HPP
#include <cstddef>

namespace detail
{
// Keeping these up to date is unfortunate
// More hassle when supporting various platforms
// with different ideas about these values.
const std::size_t capacity = 24;
const std::size_t alignment = 8;
}

class example final
{
 public:
  // Constructors
  example();
  example(int);

  // Some methods
  void first_method(int);
  int second_method();

  // Set of standard operations
  ~example();
  example(const example &);
  example &operator=(const example &);
  example(example &&);
  example &operator=(example &&);

  // No public state available (it's all in the implementation)
 private:
  // No private functions (they're also in the implementation)
  unsigned char state alignas(detail::alignment)[detail::capacity];
};

#endif

This doesn't look too bad to me. Alignment and size can be statically asserted in the implementation. I can choose between overestimating both (inefficient) or recompiling everything if they change (tedious) - but neither option is terrible.

I'm not certain this sort of hackery will work in the presence of inheritance, but as I don't much like inheritance in interfaces I don't mind too much.

If we boldly assume that I've written the implementation correctly (I'll append it to this post, but it's an untested proof of concept at this point so that's not a given), and both size and alignment are greater than or equal to that of the implementation, then does the code exhibit implementation defined, or undefined, behaviour?

#include "pimpl.hpp"
#include <cassert>
#include <vector>

// Usually a class that has behaviour we care about
// In this example, it's arbitrary
class example_impl
{
 public:
  example_impl(int x = 0) { insert(x); }

  void insert(int x) { local_state.push_back(3 * x); }

  int retrieve() { return local_state.back(); }

 private:
  // Potentially exotic local state
  // For example, maybe we don't want std::vector in the header
  std::vector<int> local_state;
};

static_assert(sizeof(example_impl) == detail::capacity,
              "example capacity has diverged");

static_assert(alignof(example_impl) == detail::alignment,
              "example alignment has diverged");

// Forwarding methods - free to vary the names relative to the api
void example::first_method(int x)
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(this->state)));

  impl.insert(x);
}

int example::second_method()
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(this->state)));

  return impl.retrieve();
}

// A whole lot of boilerplate forwarding the standard operations
// This is (believe it or not...) written for clarity, so none call each other

example::example() { new (&state) example_impl{}; }
example::example(int x) { new (&state) example_impl{x}; }

example::~example()
{
  (reinterpret_cast<example_impl*>(&state))->~example_impl();
}

example::example(const example& other)
{
  const example_impl& impl =
      *(reinterpret_cast<const example_impl*>(&(other.state)));
  new (&state) example_impl(impl);
}

example& example::operator=(const example& other)
{
  const example_impl& impl =
      *(reinterpret_cast<const example_impl*>(&(other.state)));
  if (&other != this)
    {
      (reinterpret_cast<example_impl*>(&(this->state)))->~example_impl();
      new (&state) example_impl(impl);
    }
  return *this;
}

example::example(example&& other)
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(other.state)));
  new (&state) example_impl(std::move(impl));
}

example& example::operator=(example&& other)
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(other.state)));
  assert(this != &other); // could be persuaded to use an if() here
  (reinterpret_cast<example_impl*>(&(this->state)))->~example_impl();
  new (&state) example_impl(std::move(impl));
  return *this;
}

#if 0 // Clearer assignment functions due to MikeMB
example &example::operator=(const example &other) 
{
  *(reinterpret_cast<example_impl *>(&(this->state))) =
      *(reinterpret_cast<const example_impl *>(&(other.state)));
  return *this;
}   
example &example::operator=(example &&other) 
{
  *(reinterpret_cast<example_impl *>(&(this->state))) =
          std::move(*(reinterpret_cast<example_impl *>(&(other.state))));
  return *this;
}
#endif

int main()
{
  example an_example;
  example another_example{3};

  example copied(an_example);
  example moved(std::move(another_example));

  return 0;
}

I know that's pretty horrible. I don't mind using code generators though, so it's not something I'll have to type out repeatedly.

To state the crux of this over-long question explicitly, are the following conditions sufficient to avoid UB|IDB?

  • Size of state matches size of impl instance
  • Alignment of state matches alignment of impl instance
  • All five standard operations implemented in terms of those of the impl
  • Placement new used correctly
  • Explicit destructor calls used correctly

If they are, I'll write enough tests for Valgrind to flush out the several bugs in the demo. Thank you to any who get this far!

Mozza answered 17/11, 2015 at 23:28 Comment(11)
I generally agree with you: techniques like this are crucial for performance, especially in multi-threaded applications (the heap is typically guarded by some variety of mutual exclusion, and potentially many other threads: keeping stuff on the stack with this technique can mean the difference between a multithreaded app scaling and not).Barograph
Avoiding all heap allocations means avoiding all standard library containers. including std::string, plus avoiding using standard exceptions. That's pretty restrictive. But if it's important enough for that, then I think most certainly it's a good idea to use placement new. There is a GOTW about precisely this, however. Very much worth looking up.Hyponitrite
What is the point of all those reinterprete casts? You can just forward declare èxample_impl` and store the pointer, that was returned from placement new. Aside from this, im pretty sure your construct is leagle - although i can't cite the relevant part in the standardWilhite
Might be relevant: #20985308Wilhite
@Cheersandhth.-Alf No, avoiding all heap allocations means avoiding std::allocator. You can pass your own allocator to standard library containers that allocate on the stack.Apc
@orlp: Creating a standard allocator that allocates "on the stack" would be a really bad idea. At the point where the allocator gets to call e.g. "alloca", it's in some nested stack frame. Returning up then yields serious UB for use of the allocated chunk. But with knowledge of the allocation statistics in a given application it can be possible to do better than the default allocator. In particular one may use free lists for certain allocation sizes, or more general a small objects allocator as described in e.g. Alexandrescu's "Modern C++ Design". This is still heap allocation though.Hyponitrite
@Cheersandhth.-Alf No, no UB involved. See for example my implementation here: github.com/orlp/libop/blob/master/bits/memory.h#L45. You don't use alloca, you simply preallocate a maximum on the stack using char storage[N];.Apc
@orlp: A preallocated area to sub-allocate out of is hardly allocation "on the stack", regardless of whether it's on the machine stack or not (it's 100% irrelevant where that area resides). That's still heap allocation. It's your custom heap allocation.Hyponitrite
Hm, strange idea. pimpl used to hide data, because of hide of code is simple via static and anonymous namespaces, so in your case pimpl can only protect from shrinking of new pimpl implementation, what is unlikely happened, because of data have tendency to grow.Coulombe
@Wilhite the reinterp casts are definitely ugly. Difficult to hide in a free function too, since state is private. Forward declaring example_impl means leaking its existence in the header though. Storing a pointer to it in example would work, but means carrying around state at runtime instead of resolving at compile time.Mozza
@user1034749 the idea is to present a stable ABI. Depending on intended usage you might even present larger-than-required value for capacity to allow for future growth.Fiddlefaddle
W
5

Yes, this is perfectly safe and portable code.

However, there is no need to use placement new and explicit destruction in your assignment operators. Aside from it being exception safe and more efficient, I'd argue it's also much cleaner to just use the assignment operator of example_impl:

//wrapping the casts
const example_impl& castToImpl(const unsigned char* mem) { return *reinterpret_cast<const example_impl*>(mem);  }
      example_impl& castToImpl(      unsigned char* mem) { return *reinterpret_cast<      example_impl*>(mem);  }


example& example::operator=(const example& other)
{
    castToImpl(this->state) = castToImpl(other.state);
    return *this;
}

example& example::operator=(example&& other)
{
    castToImpl(this->state) = std::move(castToImpl(other.state));
    return *this;
}

Personally, I also would use std::aligned_storage instead of an manually aligned char array, but I guess thats a matter of taste.

Wilhite answered 22/11, 2015 at 20:56 Comment(5)
Usage of aligned_storage is NOT matter of taste, with reinterpret_cast and not properly align storage you can get crash even on such align tolerance platforms as x86/amd64, not mention others like sparc, arm etc. On x86/amd64 the problem may lay in usage of double in of method of example_impl, and emissions by compiler sse2 instructionsCoulombe
@user1034749: So why is this a problem? The alternative in the question is to use a properly aligned unsigned char array via the alignas operator - both solutions are correct.Wilhite
@user1034749 maybe you overlooked alignas(detail::alignment) in OP's code? std::aligned_storage is a syntactic-sugar wrapper for aligned unsigned char array or similar, it doesn't provide any functionality that the language didn't already offer.Fiddlefaddle
I think I'd rather leave the casts exposed than route them via an unsigned char *. Calling the impl's assignment operator is uncontroversially better though, I shall do so. Thank you.Mozza
@JonChesterfield: In the end, I'd assume that in both cases the function calls / local variables would be optimized away, so do whatever you preferWilhite

© 2022 - 2024 — McMap. All rights reserved.