Lazy/multi-stage construction in C++

Asked 28/1, 2010 at 8:22 Answered 28/1, 2010 at 11:10

Solved c++boost constructor lazy-initialization

What's a good existing class/design pattern for multi-stage construction/initialization of an object in C++?

I have a class with some data members which should be initialized at different points in the program's flow, so their initialization has to be delayed. For example, one argument can be read from a file and another from the network.

Currently I am using boost::optional for the delayed construction of the data members, but it's bothering me that optional is semantically different than delay-constructed.

What I need is reminiscent of boost::bind and lambda partial function application, and using these libraries I could probably design multi-stage construction myself, but I would prefer to use existing, tested classes. (Or maybe there's another multi-stage construction pattern that I am not familiar with.)

Stoner answered 28/1, 2010 at 8:22 Comment(4)

What exactly bothers you? Optional has an extra bool of state, is there any other difference? You certainly need that state if an error occurs and you need to destroy the master object before it runs to completion. – Digitalism 28/1, 2010 at 9:42

What bothers me is that "optional" is not the same as "must-be-constructed-but-perhaps-not-yet". A real difference would be that boost::optional objects do not throw exceptions when dereferenced. This makes sense semantically because the object is optional, so you should check if it is initialized before dereferencing it. Dereferencing a delay-constructed object however should throw an exception, just like any array class which is accessed out of bounds should throw an exception. – Stoner 28/1, 2010 at 10:25

Perhaps a simple code example would help clarify your question? – Publicist 28/1, 2010 at 10:28

To reassure you: IMO functors and the command pattern (lambda + bind) are perfectly sensible building blocks for delayed construction. You could use an object handle and a collection of functors (commands) that build said object. The collection of functors only triggers when all required initialization functors are present. You might want to look at books.google.co.uk/… chapter 5. – Wolframite 28/1, 2010 at 11:0

The key issue is whether or not you should distinguish completely populated objects from incompletely populated objects at the type level. If you decide not to make a distinction, then just use boost::optional or similar as you are doing: this makes it easy to get coding quickly. OTOH you can't get the compiler to enforce the requirement that a particular function requires a completely populated object; you need to perform run-time checking of fields each time.
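If you stay with run-time checking, one way to keep the checks in one place is a thin wrapper that throws when it is accessed before being set, which is the behaviour the question asks for. The following is only a sketch: the name Delayed and its member functions are hypothetical, and it assumes C++17's std::optional in place of boost::optional.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical wrapper: "must be constructed, but perhaps not yet".
template <typename T>
class Delayed {
public:
    // Construct the value in place; constructing twice is treated as a bug.
    template <typename... Args>
    T& construct(Args&&... args) {
        if (value_) throw std::logic_error("Delayed: already constructed");
        return value_.emplace(std::forward<Args>(args)...);
    }

    bool constructed() const { return value_.has_value(); }

    // Checked access: throws instead of invoking undefined behaviour.
    T& get() {
        if (!value_) throw std::logic_error("Delayed: not yet constructed");
        return *value_;
    }
    const T& get() const {
        if (!value_) throw std::logic_error("Delayed: not yet constructed");
        return *value_;
    }

    T& operator*() { return get(); }
    const T& operator*() const { return get(); }
    T* operator->() { return &get(); }
    const T* operator->() const { return &get(); }

private:
    std::optional<T> value_;
};

// Usage: dereferencing before construction throws; afterwards it behaves like T.
inline void demoDelayed() {
    Delayed<std::string> name;
    bool threw = false;
    try {
        (void)*name;
    } catch (const std::logic_error&) {
        threw = true;
    }
    assert(threw);

    name.construct("unicorn");  // e.g. once the value has been read from a file
    assert(name->size() == 7);
}
```

Every access still pays for a check at run time; the wrapper only centralizes it and changes the failure mode from undefined behaviour to an exception.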

Parameter-group Types

If you do distinguish completely populated objects from incompletely populated objects at the type level, you can enforce the requirement that a function be passed a complete object. To do this I would suggest creating a corresponding type XParams for each relevant type X. XParams has boost::optional members and setter functions for each parameter that can be set after initial construction. Then you can force X to have only one (non-copy) constructor, that takes an XParams as its sole argument and checks that each necessary parameter has been set inside that XParams object. (Not sure if this pattern has a name -- anybody like to edit this to fill us in?)
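A rough sketch of the parameter-group idea might look like the following. The fields (a name and a port) and the helper require() are made up for illustration, and std::optional from C++17 stands in for boost::optional.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical parameter group for X: each field may arrive at a different time.
class XParams {
public:
    XParams& setName(std::string name) { name_ = std::move(name); return *this; }
    XParams& setPort(int port) { port_ = port; return *this; }

    const std::optional<std::string>& name() const { return name_; }
    const std::optional<int>& port() const { return port_; }

private:
    std::optional<std::string> name_;
    std::optional<int> port_;
};

// X can only be built from a fully populated XParams, so every X is complete.
class X {
public:
    explicit X(const XParams& p)
        : name_(require(p.name(), "name")), port_(require(p.port(), "port")) {}

    const std::string& name() const { return name_; }
    int port() const { return port_; }

private:
    // Returns the contained value, or throws if the parameter was never set.
    template <typename T>
    static const T& require(const std::optional<T>& field, const char* what) {
        if (!field) throw std::logic_error(std::string("XParams: missing ") + what);
        return *field;
    }

    std::string name_;
    int port_;
};

// Usage: gather the parameters in stages, then construct X once.
inline void demoParameterGroup() {
    XParams params;
    params.setName("server");   // e.g. read from a file
    assert(!params.port());     // still incomplete at this point
    params.setPort(8080);       // e.g. received from the network later
    X x(params);
    assert(x.name() == "server" && x.port() == 8080);
}
```

A function that takes an X then needs no checks of its own, because the only run-time check happens once, in X's constructor.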

"Partial Object" Types

This works wonderfully if you don't really have to do anything with the object before it is completely populated (perhaps other than trivial stuff like get the field values back). If you do have to sometimes treat an incompletely populated X like a "full" X, you can instead make X derive from a type XPartial, which contains all the logic, plus protected virtual methods for performing precondition tests that test whether all necessary fields are populated. Then if X ensures that it can only ever be constructed in a completely-populated state, it can override those protected methods with trivial checks that always return true:

class XPartial {
    optional<string> name_;

public:
    void setName(string x) { name_.reset(x); }  // Can add getters and/or ctors
    string makeGreeting(string title) {
        if (checkMakeGreeting_()) {             // Is it safe?
            return string("Hello, ") + title + " " + *name_;
        } else {
            throw domain_error("ZOINKS");       // Or similar
        }
    }
    bool isComplete() const { return checkMakeGreeting_(); }  // All tests here

protected:
    virtual bool checkMakeGreeting_() const { return !!name_; } // Populated?
};

class X : public XPartial {
    X();     // Forbid default-construction; or, you could supply a "full" ctor

public:
    explicit X(XPartial const& x) : XPartial(x) {  // Avoid implicit conversion
        if (!x.isComplete()) throw domain_error("ZOINKS");
    }

    X& operator=(XPartial const& x) {
        if (!x.isComplete()) throw domain_error("ZOINKS");
        return static_cast<X&>(XPartial::operator=(x));
    }

protected:
    virtual bool checkMakeGreeting_() const { return true; }   // No checking needed!
};

Although it might seem the inheritance here is "back to front", doing it this way means that an X can safely be supplied anywhere an XPartial& is asked for, so this approach obeys the Liskov Substitution Principle. This means that a function can use a parameter type of X& to indicate it needs a complete X object, or XPartial& to indicate it can handle partially populated objects -- in which case either an XPartial object or a full X can be passed.

Originally I had isComplete() as protected, but found this didn't work since X's copy ctor and assignment operator must call this function on their XPartial& argument, and they don't have sufficient access. On reflection, it makes more sense to publicly expose this functionality.

Oestradiol answered 28/1, 2010 at 10:59 Comment(2)

Thanks random hacker, good answer. Have you seen this partial pattern used/implemented anywhere? – Stoner 28/1, 2010 at 12:33

I have used the "parameter-group" pattern several times myself to good effect, though I haven't seen it come up elsewhere. I'm sure it must be a "known pattern" of some kind though. I haven't actually used the "partial object" pattern myself, just came up with it now after thinking about it for a while, so YMMV ;) – Oestradiol 28/1, 2010 at 13:32

I must be missing something here - I do this kind of thing all the time. It's very common to have objects that are big and/or not needed by a class in all circumstances. So create them dynamically!

struct Big {
    char a[1000000];
};

class A {
  public: 
    A() : big(0) {}
   ~A() { delete big; }

   void f() {
      makebig();
      big->a[42] = 66;
   }
  private:
    Big * big;
    void makebig() {
      if ( ! big ) {
         big = new Big;
      }
    }
};

I don't see the need for anything fancier than that, except that makebig() should probably be const (and maybe inline), and the Big pointer should probably be mutable. And of course A must be able to construct Big, which may in other cases mean caching the contained class's constructor parameters. You will also need to decide on a copying/assignment policy - I'd probably forbid both for this kind of class.
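For what it's worth, those tweaks could look roughly like this. It is a sketch only: it assumes a C++14 compiler, uses std::unique_ptr instead of the raw pointer, and the names big(), peek() and allocated() are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

struct Big {
    char a[1000000];
};

class A {
public:
    A() = default;
    A(const A&) = delete;             // copying and assignment are forbidden
    A& operator=(const A&) = delete;

    void f() { big().a[42] = 66; }
    char peek(std::size_t i) const { return big().a[i]; }
    bool allocated() const { return static_cast<bool>(big_); }

private:
    // const accessor over a mutable pointer: Big is created on first use,
    // even when that first use goes through a const A.
    Big& big() const {
        if (!big_) big_ = std::make_unique<Big>();  // value-initialized (zeroed)
        return *big_;
    }

    mutable std::unique_ptr<Big> big_;
};

// Usage: nothing is allocated until the first call that needs Big.
inline void demoLazyBig() {
    A a;
    assert(!a.allocated());
    a.f();
    assert(a.allocated());
    assert(a.peek(42) == 66);
}
```

Note that this version is not thread-safe: two threads calling big() at the same time could both try to allocate.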

Publicist answered 28/1, 2010 at 9:54 Comment(9)

This works, but I would suggest std::auto_ptr because of the added safety (works perfectly fine for in function or wherever). – Istle 28/1, 2010 at 10:6

Hmm, this doesn't Gather Parameters and delay construction till everything is present. – Wolframite 28/1, 2010 at 10:14

@Hassan Try as I might, I can't find that phrase in the question. – Publicist 28/1, 2010 at 10:16

@Travis I would suggest not - there is no added safety over the code I provided and it effectively prevents copying of A, should that be required. – Publicist 28/1, 2010 at 10:17

"delayed construction of the data members" and "so their initialization has to be delayed". "For example one argument can be read from a file and another from the network" implies gathering. – Wolframite 28/1, 2010 at 10:19

@Hassan My solution does delay construction and hence initialisation. As for gathering, that seems to be your reading. I admit my solution may not directly address the clarified question, though the clarification hasn't worked for me. – Publicist 28/1, 2010 at 10:27

@Neil He has given you far more information than the simple sentences I have highlighted for you .... he is already past your oversimplified hand-out.. he has used bind and lambda -- therefore what would your little code snippet give him -- it has nothing on lambda and bind. – Wolframite 28/1, 2010 at 10:32

@Hassan If you bothered to look at the question's edit history, you will see that those were added after I answered the question. But why are you wasting time commenting on my answer? Why not provide one yourself? – Publicist 28/1, 2010 at 10:36

@Neil because I know I cannot construct an elegant answer, since I always solve this issue in a purpose-specific way (last one was a custom serialization format and its IDL-- specifically reducing overhead of serialization by doing multi-stage construction). I am actually in the position to ask the same question that he has. – Wolframite 28/1, 2010 at 10:44

I don't know of any patterns to deal with this specific issue. It's a tricky design question, and one somewhat unique to languages like C++. Another issue is that the answer to this question is closely tied to your individual (or corporate) coding style.

I would use pointers for these members, and when they need to be constructed, allocate them at the same time. You can use auto_ptr for these, and check against NULL to see if they are initialized. (I think of pointers as a built-in "optional" type in C/C++/Java; there are other languages where NULL is not a valid pointer.)

One issue as a matter of style is that you may be relying on your constructors to do too much work. When I'm coding OO, I have the constructors do just enough work to get the object in a consistent state. For example, if I have an Image class and I want to read from a file, I could do this:

image = new Image("unicorn.jpeg"); /* I'm not fond of this style */

or, I could do this:

image = new Image(); /* I like this better */
image->read("unicorn.jpeg");

It can get difficult to reason about how a C++ program works if the constructors have a lot of code in them, especially if you ask the question, "what happens if a constructor fails?" This is the main benefit of moving code out of the constructors.

I would have more to say, but I don't know what you're trying to do with delayed construction.

Edit: I remembered that there is a (somewhat perverse) way to call a constructor on an object at any arbitrary time. Here is an example:

class Counter {
public:
    Counter(int &cref) : c(cref) { }
    void incr(int x) { c += x; }
private:
    int &c;
};

void dontTryThisAtHome() {
    int i = 0, j = 0;
    Counter c(i);       // Call constructor first time on c
    c.incr(5);          // now i = 5
    new(&c) Counter(j); // Call the constructor AGAIN on c
    c.incr(3);          // now j = 3
}

Note that doing something as reckless as this might earn you the scorn of your fellow programmers, unless you've got solid reasons for using this technique. This also doesn't delay the constructor, just lets you call it again later.

Betel answered 28/1, 2010 at 8:42 Comment(2)

auto_ptr is a fine suggestion, but I disagree with your suggestion to favour do-(almost)-nothing ctors in the general case. These should be used only if the code needs lazy/delayed setup as the OP does; the usual choice should be to make the ctors require all necessary info. Allowing do-almost-nothing ctors means that every single method that actually does something must check whether it has been initialised "enough" yet -- so don't burden yourself with those checks unless you really have to. – Oestradiol 28/1, 2010 at 11:9

Also, using placement new to construct an object over the top of an existing object is dangerous because the destructor for the 1st object will never be run! So unless you're sure the class has a trivial dtor, this is almost never a good idea. – Oestradiol 28/1, 2010 at 11:12

Using boost.optional looks like a good solution for some use cases. I haven't played much with it so I can't comment much. One thing I keep in mind when dealing with such functionality is whether I can use overloaded constructors instead of default and copy constructors.

When I need such functionality I would just use a pointer to the type of the necessary field like this:

public:
  MyClass() : field_(0) { } // constructor, additional initializers and code omitted
  ~MyClass() {
    if (field_)
      delete field_; // free the constructed object only if initialized
  }
  ...
private:
  ...
  field_type* field_;

Next, instead of using the pointer directly, I would access the field through the following method:

private:
  ...
  field_type& field() {
    if (!field_)
      field_ = new field_type(...);
    return *field_;  // dereference: the accessor returns a reference
  }

I have omitted const-access semantics.

Adeline answered 28/1, 2010 at 8:57 Comment(2)

You can pass 0 to delete, no if(field_) needed. – Baer 28/1, 2010 at 10:4

Thanks, I wasn't sure whether it is implementation-specific – Adeline 28/1, 2010 at 14:12

The easiest way I know is similar to the technique suggested by Dietrich Epp, except it allows you to truly delay the construction of an object until a moment of your choosing.

Basically: reserve the object using malloc instead of new (thereby bypassing the constructor), then call the overloaded new operator when you truly want to construct the object via placement new.

Example:

Object *x = (Object *) malloc(sizeof(Object));
//Use the object member items here. Be careful: no constructors have been called!
//This means you can assign values to ints, structs, etc... but nested objects can wreak havoc!

//Now we want to call the constructor of the object
new(x) Object(params);

//However, you must remember to also manually call the destructor!
x->~Object();
free(x);

//Note: if malloc and new in your development stack allocate from the
//same heap, you can just call delete(x) instead of the destructor
//followed by free, but the above is the correct way of doing it

Personally, the only time I've ever used this syntax was when I had to use a custom C-based allocator for C++ objects. As Dietrich suggests, you should question whether you really, truly must delay the constructor call. The base constructor should perform the bare minimum to get your object into a serviceable state, whilst other overloaded constructors may perform more work as needed.

Hoar answered 28/1, 2010 at 9:14 Comment(2)

Placement new can be used for delayed construction, but the size dependence would be on the parameters used to construct the object -- not on the sizeof(object). The problem therefore is much wider and the general case has dependencies (possibly other than the size) on the parameters. – Wolframite 28/1, 2010 at 11:57

Yes, as I mention it's highly dubious for nested objects. – Hoar 28/1, 2010 at 12:30

I don't know if there's a formal pattern for this. In places where I've seen it, we called it "lazy", "demand" or "on demand".

Gestation answered 28/1, 2010 at 11:10 Comment(0)
