What is wrong with "checking for self-assignment" and what does it mean?

Asked 18/8, 2012 at 1:47 Answered 4/5, 2017 at 16:33

Solved c++exception assignment-operator exception-safety

In Herb Sutter's book Exceptional C++ (1999), he has words in item 10's solution:

"Exception-unsafe" and "poor design" go hand in hand. If a piece of code isn't exception-safe, that's generally okay and can simply be fixed. But if a piece of code cannot be made exception-safe because of its underlying design, that almost always is a signal of its poor design.

Example 1: A function with two different responsibilities is difficult to make exception-safe.

Example 2: A copy assignment operator that is written in such a way that it must check for self-assignment is probably not strongly exception-safe either

What does he mean by the term "check for self-assignment"?

[INQUIRY]

Dave and AndreyT shows us exactly what "check for self-assignment" means. That's good. But the question is not over. Why does "check for self-assignment" hurts "exception safety"(according to Hurb Sutter)? If the caller tries to do self-assignment, that "check" works as if no assignment ever occurs. Does it really hurt?

[MEMO 1] In item 38 Object Identity later in Herb's book, he explains about self-assignment.

Embellish answered 18/8, 2012 at 1:47 Comment(1)

"in such a way that it must" the most important word here is: must – Houppelande 18/8, 2012 at 3:35

A question that's of greater importance in this case would be:

"What does it mean, when a you write function in a way, that requires you to check for self assignment???"

To answer my rhetorical question: It means that a well-designed assignment operator should not need to check for self-assignment. Assigning an object to itself should work correctly (i.e. have the end-effect of "doing nothing") without performing an explicit check for self-assignment.

For example, if I wanted to implement a simplistic array class along the lines of

class array {
    // code...
 
    int *data;
    size_t n;
};

...and came up with the following implementation of the assignment operator...

array &array::operator =(const array &rhs) 
{
    delete[] data;

    n = rhs.n;
    data = new int[n];

    std::copy_n(rhs.data, n, data);

    return *this;
}

That implementation would be considered "bad" since it obviously fails in case of self-assignment.

In order to "fix" this, you have two options;

Add an explicit self-assignment check

array &array::operator =(const array &rhs) {
    if (&rhs != this) {
        delete[] data;

        n = rhs.n;
        data = new int[n];

        std::copy_n(rhs.data, n, data);
    }
    return *this;
}

Follow a "check-less" approach:

array &array::operator =(const array &rhs) {
      size_t new_n = rhs.n;
      int *new_data = new int[new_n];

      std::copy_n(rhs.data, new_n, new_data);

      delete[] data;

      n = new_n;
      data = new_data;

      return *this;
}

The latter approach is better in a sense that it works correctly in self-assignment situations without making an explicit check. This implementation is far for perfect from a 'safety point of view', it is here to illustrate the difference between "checked" and "check-less" approaches to handling self-assignment. The later check-less implementation can be written more elegantly through the well-known copy-and-swap idiom.

This does not mean that you should avoid explicit checks for self-assignment. Such check do make sense from the performance point of view: there's no point in carrying out a long sequence of operations just to end up "doing nothing" in the end. But in a well-designed assignment operator such checks should not be necessary from the correctness point of view.

Caulicle answered 18/8, 2012 at 2:0 Comment(10)

"Such check do make sense from the performance point of view" Rally, self assignment often happens in your programs? – Houppelande 18/8, 2012 at 3:40

@curiousguy: Depends on the program, on the algorithm. If it allows self-assignment, so be it. Branchless code always look more elegant than branched one, meaning that I'd allow self-assigment to happen in such cases instead of trying to catch and prevent it on the outside. – Caulicle 18/8, 2012 at 4:32

OK. I should admit, understanding that "check for self-assignment" statement needs a strong context. – Embellish 22/4, 2013 at 1:20

I think when having an if (this != &rhs) return *this; at the top will often result in a more easily verifiable as correct function. In your third scenario I find myself checking for correctness. Working under the assumption that self-assignment isn't the case, it's often a lot easier to reason about the code. – Solute 6/5, 2016 at 1:22

@Solute - Don't you mean if (this == &rhs) return *this; – Skvorak 13/5, 2016 at 20:48

@JohnBowers Exactly, made a typo. – Solute 14/5, 2016 at 11:9

C++ Core Guidelines recommends check-less implementations for performance reasons because self-assignments rarely happens (see details) in real programs. – Finbur 28/8, 2019 at 7:26

Wouldn't any move assignment operator NEED to check for self assignment? – Slr 22/8, 2021 at 23:17

@Zebrafish: Well, just as described above, it depends on how it is implemented. If you implement move semantics using the move-is-swap idiom, then you don't formally need to check for self-assignment. – Caulicle 25/8, 2021 at 19:56

@Finbur - the guidelines recommend check-less implementations for a class IIF the constituent members are also self-assignment-safe for efficiency purposes. It is not just for efficiency (given that it has a mandatory safety prerequisite for a self-check-less assignment to be implemented that way.) – Imago 21/7, 2022 at 18:11

From c++ core guide

Foo& Foo::operator=(const Foo& a)   // OK, but there is a cost
{
    if (this == &a) return *this;
    s = a.s;
    i = a.i;
    return *this;
}

This is obviously safe and apparently efficient. However , what if we do one self-assignment per million assignments? That's about a million redundant tests (but since the answer is essentially always the same, the computer's branch predictor will guess right essentially every time). Consider:

Foo& Foo::operator=(const Foo& a)   // simpler, and probably much better
{
    s = a.s;
    i = a.i;
    return *this;
}

Note: The code above only apply to classes without pointer, for classes with pointer point to dynamic memory. Please refer to Ant's answer.

Ph answered 4/5, 2017 at 16:33 Comment(4)

err, what if something is dynamically allocated? you have to check for self-assignment then as the other answers have made clear – Latoshalatouche 8/8, 2018 at 17:5

What if something is not dynamically allocated? That's the different aspects of the same thing. – Ph 9/8, 2018 at 14:53

I agree, but you should make it clear that this is only a good optimization if nothing in your class is a pointer. – Latoshalatouche 9/8, 2018 at 17:55

I add a comment as you suggested. – Ph 9/8, 2018 at 21:11

MyClass& MyClass::operator=(const MyClass& other)  // copy assignment operator
{
    if(this != &other) // <-- self assignment check
    {
        // copy some stuff
    }

    return *this;
}

Assigning an object to itself is a mistake but it shouldn't logically result in your class instance changing. If you manage to design a class where assigning to itself changes it, it's poorly designed.

Morrell answered 18/8, 2012 at 1:51 Comment(4)

@Mooing Duck Don't edit my answer so drastically. I don't even agree with what you added - it's the easy way out and almost always sub-optimal. Sometimes that's okay, sometimes it's not. Write your own answer. – Morrell 18/8, 2012 at 1:59

I figured the answer would be better if it also contained a sample of a good way to do things alongside the bad. Sorry if I stepped over the line. You're right that it's sub-optimal, but it's pretty standard, because it's fast enough, and hard to screw up. If you'd provide some bits on the right way to do assignment, I'd upvote. (doesn't have to be copy and swap, you're allowed to disagree with me) – Exile 18/8, 2012 at 2:13

@MooingDuck I think almost all the time the right way is not to explicitly provide a copy assignment op. If you design your class well the default is all you'll ever want. – Morrell 18/8, 2012 at 2:25

That should be in the answer then. – Exile 18/8, 2012 at 2:55

The general reason to check for self-assignment is because you destroy your own data before copying in the new one. This assignment operator structure is also not strongly exception safe.

As an addendum, it was determined that self-assignment does not benefit performance at all, because the comparison must run every time but self-assignment is extremely rare, and if it does occur, this is a logical error in your program (really). This means that over the course of the program, it's just a waste of cycles.

Englert answered 18/8, 2012 at 3:4 Comment(7)

"this is a logical error in your program (really)." really, it isn't. – Houppelande 18/8, 2012 at 3:36

Self-assignment is a meaningless statement. Meaningless statements are never intentionally written by programmers; and therefore always errors. – Englert 18/8, 2012 at 3:53

No, self assignment should have no effect. It is not meaningless. – Houppelande 18/8, 2012 at 4:21

Having no effect is by definition meaningless, as the program is identical before and after it's execution. – Englert 18/8, 2012 at 4:35

Self-assignments aren't always obvious. It's totally reasonable to assume that many STL algorithms (partitioning, sorting and such) will do self-assignments on the elements. – Homeopathy 18/8, 2012 at 7:59

A logical error. Unless, of course, you have machine-generated source code. Oops. – Robertson 21/8, 2014 at 0:54

@Englert - a self-assignment is never an error if a design allows it to create a readable algorithm uncluttered by self-checks at the point the caller invokes the assignment operator. This is particularly true when the algorithm is performing "bulk" operations. The corollary of this design choice is that the assignment implementations must ensure they are safe. In essence, the obligation to prevent or allow (survive) self-checks is hidden from the caller (for the sake of providing a simplified API.) T – Imago 21/7, 2022 at 18:16

A question that's of greater importance in this case would be:

"What does it mean, when a you write function in a way, that requires you to check for self assignment???"

That implementation would be considered "bad" since it obviously fails in case of self-assignment.

Recommended topics

Hot tags