What is the Rule of Four (and a half)?

For properly handling object copying, the rule of thumb is the Rule of Three. With C++11, move semantics are a thing, so instead it's the Rule of Five. However, in discussions around here and on the internet, I've also seen references to the Rule of Four (and a half), which is a combination of the Rule of Five and the copy-and-swap idiom.

So what exactly is the Rule of Four (and a half)? Which functions need to be implemented, and what should each function's body look like? Which function is the half? Are there any disadvantages or warnings for this approach, compared to the Rule of Five?

Here's a reference implementation that resembles my current code. If this is incorrect, what would a correct implementation look like?

#include <utility>

// "Handle management" functions. These would be defined in an external library.
typedef int* handle;
#define NO_HANDLE nullptr
extern handle get_handle(int value);
extern handle copy_handle(handle h);
extern void free_handle(handle h);

// This class automatically obtains a handle, and then frees it when the object
// leaves scope.
class AutoHandle {
public:
    //We must have a default constructor so we can swap during move construction.
    //It need not be useful, but it should be swappable and destructible.
    //It can be private, if it's not truly a valid state for the object.
    AutoHandle() : resource(NO_HANDLE) {}
    
    //Normal constructor, acquire resource
    AutoHandle(int value) : resource(get_handle(value)) {}
    
    //Copy constructor
    AutoHandle(AutoHandle const& other) {
        resource = copy_handle(other.resource);
    }
    
    //Move constructor
    //Delegates to default constructor to put us in safe state.
    AutoHandle(AutoHandle&& other) : AutoHandle() {
        swap(other);
    }
    
    //Assignment
    AutoHandle& operator=(AutoHandle other) {
        swap(other);
        return *this;
    }
    
    //Destructor
    ~AutoHandle() {
        //Free the resource here.
        //We must handle the default state that can appear from the move ctor.
        if (resource != NO_HANDLE)
            free_handle(resource);
    }
    
    //Swap
    void swap(AutoHandle& other) {
        using std::swap;
        
        //Swap the resource between instances here.
        swap(resource, other.resource);
    }
    
    //Swap for ADL
    friend void swap(AutoHandle& left, AutoHandle& right) {
        left.swap(right);
    }
    
private:
    handle resource;
};
Dermatoglyphics answered 18/8, 2017 at 10:17 Comment(11)
The if in if (resource != nullptr) delete resource; is not needed.Benner
@Benner I knew that, but decided to include it anyway to make it clear that the default state is safe to deconstruct. I edited to clarify.Dermatoglyphics
As you shouldn't use raw, owning pointers, there is rarely the need to write your own destructor. Quite frankly: I no longer believe in any of the rules of 0, 3, 4, 5, 6. I try to write my classes in a way that I only have to write as few special member functions as possible.Fonteyn
@Fonteyn that's the rule of 0Massasoit
@Fonteyn I agree with you. I use smart pointers and whatnot as much as possible. This question actually came about because I was refactoring some older code, and I do actually require some manual resource management. Since I still have to do it, I want to do it right.Dermatoglyphics
@Caleth: Not really. I'm not above writing e.g. just copy assignment and copy constructor if that suits my needs.Fonteyn
@jpfx1342: Sorry if that wasn't clear. My comment was meant as an answer to the question of what the rule of 4 and a half is (sometimes you need the dtor -> 5 and sometimes you don't -> 4). As that only answers part of the question I didn't make it an answer.Fonteyn
As with most toy examples it is hard to draw any general conclusions. For example, why would anyone store an int* and then want deep copies? Having a "useless" default constructor seems like not the best idea, especially if it is just used in an odd move constructor - which will construct yet another object inside std::swap. Doesn't seem like an obvious speed optimization which the move constructor ought to be. Also the assignment operator possibly calling std::swap seems to be recursive when swap move assigns the objects...Karren
@BoPersson The int* is intended to represent a more complex resource, like a file, or a handle from some API. The Rule of 4 requires a valid default state so that you don't swap an unconstructed object into an object that's going to get its destructor called. The default constructor could be private, I think, and it doesn't matter what it does, as long as the destructor can handle it. I think forwarding to std::swap in the assignment would make the move a copy (which makes it more like Rule of Three), but I don't think it would recurse. But I'm not sure! That's why I asked. :)Dermatoglyphics
From previous article: "To implement the Copy-Swap idiom your resource management class must also implement a swap() function to perform a member-by-member swap (there’s your “…(and a half)”)"Pogy
I tried to address some of the problems with my toy example by making it more clear that an external resource is being managed and commenting this a little better. This might make some of the above discussion nonsensical, but it was valid criticism at the time.Dermatoglyphics

So what exactly is the Rule of Four (and a half)?

"The Rule of The Big Four (and a half)" states that if you implement one of

  • The copy constructor
  • The assignment operator
  • The move constructor
  • The destructor
  • The swap function

then you must have a policy about the others.

Which functions need to be implemented, and what should each function's body look like?

  • default constructor (which could be private)

  • copy constructor (deep copy of your resource. Here you have real code to handle your resource)

  • move constructor (using default constructor and swap) :

      S(S&& s) : S{} { swap(*this, s); }
    
  • assignment operator (using constructor and swap)

      S& operator=(S s)
      {
          swap(*this, s);
          return *this;
      }
    
  • destructor (release your resources)

  • friend swap (there is no default implementation :/ you will usually want to swap each member). Unlike the swap member method, this one is essential: std::swap is implemented in terms of the move (or copy) constructor and the assignment operator, and those are themselves implemented with swap here, so relying on std::swap would lead to infinite recursion.
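The pieces listed above can be put together as in the following minimal sketch. It assumes a hypothetical class S that owns a single heap-allocated int; the member name data and the helpers empty() and value() are illustrative only and do not come from the question.

```cpp
#include <cassert>
#include <utility>

// Hypothetical resource owner: S owns one heap-allocated int.
class S {
public:
    S() noexcept : data(nullptr) {}                       // empty but destructible state
    explicit S(int value) : data(new int(value)) {}      // acquire the resource
    S(const S& other)                                    // deep copy
        : data(other.data ? new int(*other.data) : nullptr) {}
    S(S&& other) noexcept : S() { swap(*this, other); }  // default state, then swap
    S& operator=(S other) {                              // unified copy/move assignment
        swap(*this, other);
        return *this;
    }
    ~S() { delete data; }                                // release the resource

    // The "half": a member-wise swap found through ADL. It swaps the members,
    // never whole S objects, so it does not come back through operator=.
    friend void swap(S& a, S& b) noexcept {
        using std::swap;
        swap(a.data, b.data);
    }

    bool empty() const noexcept { return data == nullptr; }
    int value() const { assert(data != nullptr); return *data; }

private:
    int* data;
};
```

With this in place, `b = a` copies into the parameter and swaps, while `b = S(7)` or `b = std::move(a)` builds the parameter through the move constructor (or elides it) and then swaps.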

Which function is the half?

From previous article:

"To implement the Copy-Swap idiom your resource management class must also implement a swap() function to perform a member-by-member swap (there’s your “…(and a half)”)"

so the swap method.

Are there any disadvantages or warnings for this approach, compared to the Rule of Five?

The warning, as noted above, is that you must write a correct swap yourself to avoid infinite recursion.

Pogy answered 18/8, 2017 at 18:2 Comment(0)

Are there any disadvantages or warnings for this approach, compared to the Rule of Five?

Although it can save code duplication, using copy-and-swap simply results in worse classes, to be blunt. You are hurting your class's performance, including move assignment (if you use the unified assignment operator, which I'm also not a fan of), which should be very fast. In exchange, you get the strong exception guarantee, which seems nice at first. The thing is that you can get the strong exception guarantee for any class with a simple generic function:

#include <utility>

template <class T>
void copy_and_swap(T& target, T source) {
    using std::swap;
    swap(target, source);  // source is already a copy; swap needs an lvalue
}

And that's it. So people who need strong exception safety can get it anyway. And frankly, strong exception safety is quite a niche anyhow.
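As a rough illustration of that guarantee, the sketch below uses a hypothetical type Flaky whose copy constructor can be told to throw, plus a hypothetical wrapper try_assign. Because the copy is made while the by-value parameter is being built, a failed copy never reaches the swap and the target keeps its old value.

```cpp
#include <stdexcept>
#include <utility>
#include <vector>

// Generic helper: the by-value parameter is the copy, the swap commits it.
template <class T>
void copy_and_swap(T& target, T source) {
    using std::swap;
    swap(target, source);
}

// Hypothetical type whose copy constructor can be made to fail on demand.
struct Flaky {
    std::vector<int> values;
    bool fail_on_copy = false;

    Flaky() = default;
    explicit Flaky(std::vector<int> v) : values(std::move(v)) {}
    Flaky(const Flaky& other)
        : values(other.values), fail_on_copy(other.fail_on_copy) {
        if (other.fail_on_copy) throw std::runtime_error("copy failed");
    }
    Flaky(Flaky&&) = default;
    Flaky& operator=(const Flaky&) = default;
    Flaky& operator=(Flaky&&) = default;
};

// Returns true if the assignment went through; on failure target is untouched.
bool try_assign(Flaky& target, const Flaky& source) {
    try {
        copy_and_swap(target, source);  // the copy happens here, before any swap
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

Note that Flaky itself follows the Rule of Five with ordinary assignment operators; the strong guarantee is opted into at the call site only.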

The real way to save code duplication is through the Rule of Zero: choose member variables so that you don't need to write any of the special functions. In real life, I'd say that 90+ % of the time I see special member functions, they could have easily been avoided. Even if your class does indeed have some kind of special logic required for a special member function, you are usually better off pushing it down into a member. Your logger class may need to flush a buffer in its destructor, but that's not a reason to write a destructor: write a small buffer class that handles the flushing and have that as a member of your logger. Loggers potentially have all kinds of other resources that can get handled automatically and you want to let the compiler automatically generate copy/move/destruct code.
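The logger example could look roughly like the sketch below. The names FlushingBuffer and Logger, and the choice of a std::ostream sink, are assumptions made for illustration. Only the small buffer class declares special members; the logger declares none and lets the compiler generate its move operations and destructor.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical buffer that writes its pending text to a sink when destroyed.
// This is the only class here that needs hand-written special members.
class FlushingBuffer {
public:
    explicit FlushingBuffer(std::ostream& sink) : sink_(&sink) {}
    FlushingBuffer(const FlushingBuffer&) = delete;
    FlushingBuffer& operator=(const FlushingBuffer&) = delete;
    FlushingBuffer(FlushingBuffer&& other) noexcept
        : sink_(other.sink_), pending_(std::move(other.pending_)) {
        other.pending_.clear();  // the moved-from buffer has nothing left to flush
    }
    FlushingBuffer& operator=(FlushingBuffer&& other) {
        if (this != &other) {
            flush();  // write out our own pending text before taking over
            sink_ = other.sink_;
            pending_ = std::move(other.pending_);
            other.pending_.clear();
        }
        return *this;
    }
    ~FlushingBuffer() { flush(); }

    void append(const std::string& text) { pending_ += text; }
    void flush() {
        if (!pending_.empty()) {
            *sink_ << pending_;
            pending_.clear();
        }
    }

private:
    std::ostream* sink_;
    std::string pending_;
};

// Rule of Zero: no destructor, copy, or move functions are declared here.
class Logger {
public:
    Logger(std::string name, std::ostream& sink)
        : name_(std::move(name)), buffer_(sink) {}
    void log(const std::string& message) {
        buffer_.append(name_ + ": " + message + "\n");
    }

private:
    std::string name_;
    FlushingBuffer buffer_;
};
```

Adding another member to Logger later requires no changes to any special member function, which is the maintenance benefit being described.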

The thing about C++ is that automatic generation of special functions is all or nothing, per function. That is, the copy constructor (for example) either gets generated automatically, taking into account all members, or you have to write (and worse, maintain) it all by hand. So it strongly pushes you towards pushing things down into members.

In cases where you are writing a class to manage a resource and need to deal with this, it should typically be: a) relatively small, and b) relatively generic/reusable. The former means that a bit of duplicated code isn't a big deal, and the latter means that you probably don't want to leave performance on the table.

In sum, I strongly discourage using copy-and-swap and unified assignment operators. Try to follow the Rule of Zero; if you can't, follow the Rule of Five. Write swap only if you can make it faster than the generic swap (which does 3 moves), but usually you needn't bother.
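For comparison, here is a sketch of the question's AutoHandle written with the Rule of Five and separate copy and move assignment operators. The three handle functions are stand-ins for the external library and assume a handle is simply a heap-allocated int; empty(), value() and release() are helper names added for illustration.

```cpp
#include <utility>

// Stand-ins for the external handle library from the question
// (assumed behaviour: a handle is a heap-allocated int).
typedef int* handle;
inline handle get_handle(int value) { return new int(value); }
inline handle copy_handle(handle h) { return h ? new int(*h) : nullptr; }
inline void free_handle(handle h) { delete h; }

class AutoHandle {
public:
    explicit AutoHandle(int value) : resource(get_handle(value)) {}
    AutoHandle(const AutoHandle& other) : resource(copy_handle(other.resource)) {}
    AutoHandle(AutoHandle&& other) noexcept : resource(other.resource) {
        other.resource = nullptr;  // leave the source empty; no temporary object
    }
    AutoHandle& operator=(const AutoHandle& other) {
        if (this != &other) {
            handle fresh = copy_handle(other.resource);  // if this throws, *this is intact
            release();
            resource = fresh;
        }
        return *this;
    }
    AutoHandle& operator=(AutoHandle&& other) noexcept {
        if (this != &other) {
            release();
            resource = other.resource;
            other.resource = nullptr;
        }
        return *this;
    }
    ~AutoHandle() { release(); }

    bool empty() const noexcept { return resource == nullptr; }
    int value() const { return *resource; }

private:
    void release() noexcept {
        if (resource != nullptr) free_handle(resource);
        resource = nullptr;
    }
    handle resource;
};
```

No default constructor or swap is needed in this version, at the cost of writing two assignment operators by hand.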

Loricate answered 18/8, 2017 at 18:43 Comment(7)
Can you give any reference on the Rule of 4.5 being slower than straight Rule of 5? Playing around on Godbolt I can see that slightly different code is generated, but it's not clear either is obviously worse. And not all resource-managing classes are small; consider vector, which typically needs to manage its own resources.Heptarchy
(At least in situations where copying-and-destroying is about the same price as modifying in-place, which I realize isn't all situations. My specific motivating example which led me to this page today was smart pointers, which are cheap to copy in any case; for something like vector it can be an extra allocation/deallocation cycle and extra memory usage. Is that all you meant?)Heptarchy
@DanielH Long delay, but... The rule of 4.5 is slower because move assignment is a smaller, more limited operation, than swapping. A swap in terms of 3 moves is basically optimal, modulo ordering, specific machine instructions, etc, very low level details. Move assignment in terms of swap is clearly sub-optimal. Even for something like a unique_ptr which is just a raw pointer under the hood. Move assignment is just a single assignment (read old + write new), and another write (null old). swap is 3 assignments (read new + write temp, read old + write new, read temp + write old).Loricate
So it's 2 writes + 1 read, vs 3 writes + 3 reads. The real picture is more complicated than this of course (the last read has no real cost for example because it's already in a register), and the compiler may help you and optimize things out. But it's hard to rely on it for various reasons that would take more time to explain. Basically, at the end of the day, the rule of 4.5 is simply asking for more work to be done, some of which is needless (but which gives you strong exception guarantee).Loricate
@DanielH, Howard Hinnant has talked about the disadvantages of the copy-and-swap idiom. For example, see #7458610 and also this talk: youtube.com/watch?v=vLinb2fgkHk (starting from "Can I define one special member in terms of another?")Odaodab
@Odaodab I havenʼt watched the video yet, but the only part of the SO answer or its comments that address this is “It can be sub-optimal, especially when applied without careful analysis.” and then the rest is about alternate methods; that doesnʼt explain why itʼs bad.Heptarchy
@DanielH I agree that he doesn't explain why copy-and-swap based unified assignment is less efficient than move assignment in that answer. I have watched the video only starting from the slide "Can I define one special member in terms of another?" (he mentioned about it as a comment in another SO question). He does present empirical evidence about how dedicated move assignment can be much faster when a class has a vector member. He shows this as a function of the capacity of the target vector. It is very interesting.Odaodab

In simple terms, just remember this.

Rule of 0:

Classes have neither custom destructors, nor copy/move constructors, nor copy/move assignment operators.

Rule of 3: If you implement a custom version of any of these, you implement all of them.

Destructor, Copy constructor, copy assignment

Rule of 5: If you implement a custom move constructor or the move assignment operator, you need to define all 5 of them. Needed for move semantics.

Destructor, Copy constructor, copy assignment, move constructor, move assignment

Rule of four and a half: Same as Rule of 5 but with copy and swap idiom. With the inclusion of the swap method, the copy assignment and move assignment merge into one assignment operator.

Destructor, Copy constructor, move constructor, assignment, swap (the half part)

Destructor: ~Class();
Copy constructor: Class(const Class &);
Move constructor: Class(Class &&);
Assignment: Class & operator = (Class);
Swap: void swap(Class &);

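There are no warnings. The advantage is that assignment can be faster: taking the parameter by value lets the compiler move-construct it (or elide the copy entirely) when the argument is an rvalue, which is more efficient than always creating a temporary copy in the body of the method.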

And now that we have that temporary object, we simply perform a swap on the temporary object. It's automatically destroyed when it goes out of scope and we now have the value from the right-hand side of the operator in our object.
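To see what the by-value parameter costs in each case, a small instrumented type can count how that parameter was constructed. Counted and its static counters are hypothetical names used only for this sketch.

```cpp
#include <utility>

// Hypothetical instrumented type: counts how the by-value parameter of
// operator= was constructed.
struct Counted {
    static int copies;
    static int moves;
    int payload = 0;

    Counted() = default;
    explicit Counted(int p) : payload(p) {}
    Counted(const Counted& other) : payload(other.payload) { ++copies; }
    Counted(Counted&& other) noexcept : payload(other.payload) { ++moves; }

    // Unified assignment: the argument is copied or moved into `other`
    // at the call site, then swapped into *this.
    Counted& operator=(Counted other) {
        swap(*this, other);
        return *this;
    }

    // Member-wise swap; it does not construct any Counted objects.
    friend void swap(Counted& a, Counted& b) noexcept {
        using std::swap;
        swap(a.payload, b.payload);
    }
};
int Counted::copies = 0;
int Counted::moves = 0;
```

Assigning from an lvalue (`a = b`) bumps the copy counter once, while assigning from `std::move(b)` bumps only the move counter, so the single operator covers both cases.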

References:

https://www.linkedin.com/learning/c-plus-plus-advanced-topics/rule-of-five?u=67551194
https://en.cppreference.com/w/cpp/language/rule_of_three

Propel answered 21/6, 2021 at 6:34 Comment(0)
