What is wrong with "checking for self-assignment" and what does it mean?
Asked Answered
E

4

49

In Herb Sutter's book Exceptional C++ (1999), he has words in item 10's solution:

"Exception-unsafe" and "poor design" go hand in hand. If a piece of code isn't exception-safe, that's generally okay and can simply be fixed. But if a piece of code cannot be made exception-safe because of its underlying design, that almost always is a signal of its poor design.

Example 1: A function with two different responsibilities is difficult to make exception-safe.

Example 2: A copy assignment operator that is written in such a way that it must check for self-assignment is probably not strongly exception-safe either

What does he mean by the term "check for self-assignment"?

[INQUIRY]

Dave and AndreyT shows us exactly what "check for self-assignment" means. That's good. But the question is not over. Why does "check for self-assignment" hurts "exception safety"(according to Hurb Sutter)? If the caller tries to do self-assignment, that "check" works as if no assignment ever occurs. Does it really hurt?

[MEMO 1] In item 38 Object Identity later in Herb's book, he explains about self-assignment.

Embellish answered 18/8, 2012 at 1:47 Comment(1)
"in such a way that it must" the most important word here is: mustHouppelande
C
53

A question that's of greater importance in this case would be:


  • "What does it mean, when a you write function in a way, that requires you to check for self assignment???"

To answer my rhetorical question: It means that a well-designed assignment operator should not need to check for self-assignment. Assigning an object to itself should work correctly (i.e. have the end-effect of "doing nothing") without performing an explicit check for self-assignment.

For example, if I wanted to implement a simplistic array class along the lines of

class array {
    // code...
 
    int *data;
    size_t n;
};

...and came up with the following implementation of the assignment operator...

array &array::operator =(const array &rhs) 
{
    delete[] data;

    n = rhs.n;
    data = new int[n];

    std::copy_n(rhs.data, n, data);

    return *this;
}
That implementation would be considered "bad" since it obviously fails in case of self-assignment.

In order to "fix" this, you have two options;

  1. Add an explicit self-assignment check
array &array::operator =(const array &rhs) {
    if (&rhs != this) {
        delete[] data;

        n = rhs.n;
        data = new int[n];

        std::copy_n(rhs.data, n, data);
    }
    return *this;
}
  1. Follow a "check-less" approach:
array &array::operator =(const array &rhs) {
      size_t new_n = rhs.n;
      int *new_data = new int[new_n];

      std::copy_n(rhs.data, new_n, new_data);

      delete[] data;

      n = new_n;
      data = new_data;

      return *this;
}

The latter approach is better in a sense that it works correctly in self-assignment situations without making an explicit check. This implementation is far for perfect from a 'safety point of view', it is here to illustrate the difference between "checked" and "check-less" approaches to handling self-assignment. The later check-less implementation can be written more elegantly through the well-known copy-and-swap idiom.

This does not mean that you should avoid explicit checks for self-assignment. Such check do make sense from the performance point of view: there's no point in carrying out a long sequence of operations just to end up "doing nothing" in the end. But in a well-designed assignment operator such checks should not be necessary from the correctness point of view.

Caulicle answered 18/8, 2012 at 2:0 Comment(10)
"Such check do make sense from the performance point of view" Rally, self assignment often happens in your programs?Houppelande
@curiousguy: Depends on the program, on the algorithm. If it allows self-assignment, so be it. Branchless code always look more elegant than branched one, meaning that I'd allow self-assigment to happen in such cases instead of trying to catch and prevent it on the outside.Caulicle
OK. I should admit, understanding that "check for self-assignment" statement needs a strong context.Embellish
I think when having an if (this != &rhs) return *this; at the top will often result in a more easily verifiable as correct function. In your third scenario I find myself checking for correctness. Working under the assumption that self-assignment isn't the case, it's often a lot easier to reason about the code.Solute
@Solute - Don't you mean if (this == &rhs) return *this;Skvorak
@JohnBowers Exactly, made a typo.Solute
C++ Core Guidelines recommends check-less implementations for performance reasons because self-assignments rarely happens (see details) in real programs.Finbur
Wouldn't any move assignment operator NEED to check for self assignment?Slr
@Zebrafish: Well, just as described above, it depends on how it is implemented. If you implement move semantics using the move-is-swap idiom, then you don't formally need to check for self-assignment.Caulicle
@Finbur - the guidelines recommend check-less implementations for a class IIF the constituent members are also self-assignment-safe for efficiency purposes. It is not just for efficiency (given that it has a mandatory safety prerequisite for a self-check-less assignment to be implemented that way.)Imago
P
11

From c++ core guide

Foo& Foo::operator=(const Foo& a)   // OK, but there is a cost
{
    if (this == &a) return *this;
    s = a.s;
    i = a.i;
    return *this;
}

This is obviously safe and apparently efficient. However , what if we do one self-assignment per million assignments? That's about a million redundant tests (but since the answer is essentially always the same, the computer's branch predictor will guess right essentially every time). Consider:

Foo& Foo::operator=(const Foo& a)   // simpler, and probably much better
{
    s = a.s;
    i = a.i;
    return *this;
}

Note: The code above only apply to classes without pointer, for classes with pointer point to dynamic memory. Please refer to Ant's answer.

Ph answered 4/5, 2017 at 16:33 Comment(4)
err, what if something is dynamically allocated? you have to check for self-assignment then as the other answers have made clearLatoshalatouche
What if something is not dynamically allocated? That's the different aspects of the same thing.Ph
I agree, but you should make it clear that this is only a good optimization if nothing in your class is a pointer.Latoshalatouche
I add a comment as you suggested.Ph
M
3
MyClass& MyClass::operator=(const MyClass& other)  // copy assignment operator
{
    if(this != &other) // <-- self assignment check
    {
        // copy some stuff
    }

    return *this;
}

Assigning an object to itself is a mistake but it shouldn't logically result in your class instance changing. If you manage to design a class where assigning to itself changes it, it's poorly designed.

Morrell answered 18/8, 2012 at 1:51 Comment(4)
@Mooing Duck Don't edit my answer so drastically. I don't even agree with what you added - it's the easy way out and almost always sub-optimal. Sometimes that's okay, sometimes it's not. Write your own answer.Morrell
I figured the answer would be better if it also contained a sample of a good way to do things alongside the bad. Sorry if I stepped over the line. You're right that it's sub-optimal, but it's pretty standard, because it's fast enough, and hard to screw up. If you'd provide some bits on the right way to do assignment, I'd upvote. (doesn't have to be copy and swap, you're allowed to disagree with me)Exile
@MooingDuck I think almost all the time the right way is not to explicitly provide a copy assignment op. If you design your class well the default is all you'll ever want.Morrell
That should be in the answer then.Exile
E
2

The general reason to check for self-assignment is because you destroy your own data before copying in the new one. This assignment operator structure is also not strongly exception safe.

As an addendum, it was determined that self-assignment does not benefit performance at all, because the comparison must run every time but self-assignment is extremely rare, and if it does occur, this is a logical error in your program (really). This means that over the course of the program, it's just a waste of cycles.

Englert answered 18/8, 2012 at 3:4 Comment(7)
"this is a logical error in your program (really)." really, it isn't.Houppelande
Self-assignment is a meaningless statement. Meaningless statements are never intentionally written by programmers; and therefore always errors.Englert
No, self assignment should have no effect. It is not meaningless.Houppelande
Having no effect is by definition meaningless, as the program is identical before and after it's execution.Englert
Self-assignments aren't always obvious. It's totally reasonable to assume that many STL algorithms (partitioning, sorting and such) will do self-assignments on the elements.Homeopathy
A logical error. Unless, of course, you have machine-generated source code. Oops.Robertson
@Englert - a self-assignment is never an error if a design allows it to create a readable algorithm uncluttered by self-checks at the point the caller invokes the assignment operator. This is particularly true when the algorithm is performing "bulk" operations. The corollary of this design choice is that the assignment implementations must ensure they are safe. In essence, the obligation to prevent or allow (survive) self-checks is hidden from the caller (for the sake of providing a simplified API.) TImago

© 2022 - 2024 — McMap. All rights reserved.