Does the standard state that copies must be equivalent?
Suppose I have a weird string type that either owns or doesn't own its underlying buffer:

#include <cstddef>  // std::size_t
#include <cstring>  // std::memcpy

class WeirdString {
private:
    char* buffer;
    std::size_t length;
    std::size_t capacity;
    bool owns;

public:
    // Non-owning constructor
    WeirdString(char* buffer, std::size_t length, std::size_t capacity)
        : buffer(buffer), length(length), capacity(capacity), owns(false)
    { }

    // Make an owning copy
    WeirdString(WeirdString const& rhs)
        : buffer(new char[rhs.capacity])
        , length(rhs.length)
        , capacity(rhs.capacity)
        , owns(true)
    {
        std::memcpy(buffer, rhs.buffer, length);
    }

    ~WeirdString() {
        if (owns) delete [] buffer;
    }
};

Does that copy constructor violate the standard somewhere? Consider:

WeirdString get(); // this returns non-owning string
const auto s = WeirdString(get());

s is either owning or non-owning depending on whether or not the additional copy constructor got elided, which in C++14 and earlier is permitted but optional (though in C++17 is guaranteed). That Schrödinger's ownership model suggests that this copy constructor is, in itself, undefined behavior.

Is it?


A more illustrative example might be:

struct X {
    int i;

    X(int i)
      : i(i)
    { }

    X(X const& rhs)
      : i(rhs.i + 1)   // note the "+ 1": not a faithful copy
    { }
};

X getX();
const auto x = X(getX());

Depending on which copies get elided, x.i could be 0, 1, or 2 more than whatever was returned in getX(). Does the standard say anything about this?

Munich answered 23/1, 2017 at 22:47 Comment(14)
Might be clearer if you posted the code for get(). – Huntlee
From a C++ perspective, Schrödinger's cat is merely in an unspecified state. You don't get Undefined Behavior merely because you don't know the exact state from a set of well-defined possible states. – Buttock
In f() + g(), it is unspecified whether f or g get called first; this is not, by itself, a reason to declare that the expression exhibits undefined behavior. It's possible, of course, that g somehow relies on a side effect produced by f, and exhibits undefined behavior in its absence. Yours is a similar situation: copy constructor may or may not be elided, and you may end up with owning or non-owning instance - that by itself does not trigger undefined behavior; but it's possible that something further down relies on the instance being in a particular state, and gets disappointed. – Aeniah
@IgorTandetnik Yes, but we have explicit wording about how that situation would be undefined behavior. I find it strange that there is seemingly no wording about what a copy constructor is supposed to do. – Munich
What wording do you have in mind, about what situation being undefined behavior? I'm not sure I follow. Anyway, what would you have the standard say about the copy constructor? It would be a challenge to define what it means for two instances of an arbitrary class to be "equivalent"? – Aeniah
As I recall the rules are changing with C++17, in that elision is required. That will make the X example well-behaved. – Buttonhole
@IgorTandetnik To start with, I'd have expected it to at least specify that the two instances should be equivalent, even if it's handwavy about what that means. – Munich
What good would that be? If the standard cannot state the requirement precisely, then the programmer cannot verify whether their code meets that requirement. In any case, why does the behavior of the copy constructor, specifically, bother you so much? The non-deterministic abstract machine described by the standard is non-deterministic - this can be easily triggered by things other than copy elision. – Aeniah
@Igor Well, that's what it does for the library. vector<X> would be undefined behavior because X is not CopyConstructible. (assume X had a move ctor that similarly did something odd) – Munich
CopyConstructible only means that the class provides a copy constructor. It doesn't mandate any particular behavior of said constructor. Both classes you show satisfy CopyConstructible requirement, and you can happily have a vector thereof. I'm not sure where you see a source of undefined behavior. – Aeniah
@Igor No, it requires that the new object be "equivalent" to the old object and that the old object be unchanged. Neither type satisfies equivalence. – Munich
Hmm, so it does. I have no idea what it means though; I can't find where "equivalent" is defined. Therefore, I don't see how one can decide whether two instances of WeirdString or X are or are not "equivalent" for the purposes of these requirements. I would argue it's a defect in the standard. – Aeniah
MyClass a; MyClass b = a; if (a != b) cout << "HELP"; – Soniasonic
@IgorTandetnik timsong-cpp.github.io/lwg-issues/1173 – Bilabiate

Regarding the new question's code

struct X {
    int i;

    X(int i)
      : i(i)
    { }

    X(X const& rhs)
      : i(rhs.i + 1)   // note the "+ 1": not a faithful copy
    { }
};

X getX();
const auto x = X(getX());

Here the copy constructor doesn't copy, so you're breaking the compiler's assumption that it does.

With C++17 I believe you're guaranteed that it's not invoked in the above example. However I don't have a draft of C++17 at hand.

With C++14 and earlier it's up to the compiler whether the copy constructor is invoked for the call of getX, and whether it's invoked for the copy initialization.

C++14 §12.8/31 class.copy/31:

When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the constructor selected for the copy/move operation and/or the destructor for the object have side effects.

This is not undefined behavior in the formal sense of that term, where it can invoke nasal demons. For the formal terminology I'd choose unspecified behavior, because that's behavior that depends on the implementation and is not required to be documented. But as I see it, what name one chooses doesn't really matter: what matters is that the standard just says that under the specified conditions a compiler can optimize away a copy/move construction, regardless of the side effects of the optimized-away constructor – which you therefore cannot and should not rely on.
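As a rough illustration of that permission, here is a minimal sketch using a hypothetical Counted type (not from the question) whose copy constructor does copy faithfully but also counts its own invocations. The names Counted, make_counted and count_copies are assumptions made up for this example.

```cpp
#include <cassert>

// Hypothetical instrumented type: the copy constructor copies the value
// and, as a side effect, counts how often it runs.
struct Counted {
    static int copies;  // copy-constructor calls observed so far
    int value;

    explicit Counted(int v) : value(v) { }
    Counted(Counted const& rhs) : value(rhs.value) { ++copies; }
};

int Counted::copies = 0;

// Returns a temporary; before C++17 the compiler may or may not copy it.
Counted make_counted() { return Counted(42); }

// Returns how many copy-constructor calls were actually performed.
int count_copies() {
    Counted::copies = 0;
    const auto c = Counted(make_counted());
    assert(c.value == 42);  // the value is the same whichever copies ran
    return Counted::copies;
}
```

With C++17 or later the count should be 0, since the temporaries initialize c directly. With C++14 and earlier it can be anything from 0 to 3 depending on the compiler and its flags (for example GCC's -fno-elide-constructors), which is exactly why code should not depend on that side effect.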

Buttonhole answered 23/1, 2017 at 23:37 Comment(1)
For C++17, see P0135, which is voted into the WP (the latest is N4618, which you can google). – Lek

The part of the question about a class X was added after this answer. It's fundamentally different in that X's copy constructor does not copy. I've therefore answered that separately.

Regarding the original question's WeirdString: it's your class so the standard places no requirements on it.

However, the standard effectively lets compilers assume that a copy constructor copies, and does nothing else.

Happily that's what your copy constructor does, but if (I know this doesn't apply to you, but if) it mainly had some other effect that you relied on, then the copy elision rules could wreak havoc with your expectations.

Where you'd want a guaranteed owning instance (e.g. in order to pass it to a thread) you can simply provide an unshare member function, or a constructor with a tag argument, or a factory function.
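A minimal sketch of the tag-argument constructor and the factory function, assuming a variant of the question's WeirdString. The tag type owning_t, the factory make_owning, and the accessors is_owning and data are hypothetical names introduced only for this illustration, as is the assert on the length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical tag type used to request an owning copy explicitly.
struct owning_t {};

class WeirdString {
    char* buffer;
    std::size_t length;
    std::size_t capacity;
    bool owns;

public:
    // Non-owning constructor, as in the question.
    WeirdString(char* buffer, std::size_t length, std::size_t capacity)
        : buffer(buffer), length(length), capacity(capacity), owns(false)
    { }

    // Tagged constructor: always allocates and copies. It is not a copy
    // constructor, so the copy elision rules do not apply to it.
    WeirdString(owning_t, WeirdString const& rhs)
        : buffer(new char[rhs.capacity])
        , length(rhs.length)
        , capacity(rhs.capacity)
        , owns(true)
    {
        assert(length <= capacity);
        std::memcpy(buffer, rhs.buffer, length);
    }

    // Copy constructor keeps the question's behavior: make an owning copy.
    WeirdString(WeirdString const& rhs) : WeirdString(owning_t{}, rhs) { }

    // Assignment is disabled so two instances cannot both claim the buffer.
    WeirdString& operator=(WeirdString const&) = delete;

    ~WeirdString() { if (owns) delete[] buffer; }

    bool is_owning() const { return owns; }
    char const* data() const { return buffer; }
};

// Factory function alternative: the result is owning whether or not the
// copy of the returned temporary is elided.
WeirdString make_owning(WeirdString const& s) {
    return WeirdString(owning_t{}, s);
}
```

Writing WeirdString owned(owning_t{}, view); then yields an owning instance regardless of what the compiler elides, because the allocation happens in a constructor that is not a copy or move constructor.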

You generally cannot rely on a copy constructor being invoked.


To avoid problems you'd better take care of all possible copying, which means also the copy assignment operator, operator=.

Otherwise you risk that two or more instances all think they own the buffer, and are responsible for deallocation.

It's also a good idea to support move semantics by defining a move constructor and declaring or defining a move assignment operator.

You can be more sure of correctness of all this by using a std::unique_ptr<char[]> to hold the buffer pointer.

Among other things that prevents inadvertent copying via a copy assignment operator.
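A minimal sketch of the std::unique_ptr<char[]> idea, assuming a simplified, always-owning class. OwnedBuffer and its members are hypothetical names for this example, and move operations are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>

// Hypothetical always-owning buffer; the unique_ptr releases the memory,
// so no hand-written destructor is needed.
class OwnedBuffer {
    std::unique_ptr<char[]> storage;
    std::size_t length;
    std::size_t capacity;

public:
    // Allocates `capacity` bytes and copies `length` bytes from `src`.
    OwnedBuffer(char const* src, std::size_t length, std::size_t capacity)
        : storage(new char[capacity]), length(length), capacity(capacity)
    {
        assert(length <= capacity);
        std::memcpy(storage.get(), src, length);
    }

    // Deep copy, written out explicitly.
    OwnedBuffer(OwnedBuffer const& rhs)
        : OwnedBuffer(rhs.storage.get(), rhs.length, rhs.capacity)
    { }

    char const* data() const { return storage.get(); }
    std::size_t size() const { return length; }
};

// The unique_ptr member makes the implicit copy assignment operator
// deleted, so an accidental shallow assignment does not compile.
static_assert(!std::is_copy_assignable<OwnedBuffer>::value,
              "copy assignment should be unavailable");
```

The static_assert documents the point made above: the compiler rejects a = b; until a copy assignment operator is written deliberately.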

Buttonhole answered 23/1, 2017 at 22:51 Comment(4)
std::string is irrelevant, and the elision of the copy would wreak havoc if I needed an owning string. The question is specifically about if there are any requirements placed on the copy constructor for copy elision to be valid. – Munich
Sorry for a slight little binary inversion. Hm. I can see where you'd want a guaranteed owning instance to pass to a thread. One way to do that is to simply provide an unshare member function, or a constructor with a tag argument, or a factory function. – Buttonhole
Do you mind just deleting everything after the first hr? It's unrelated to the question and distracting. The question isn't about how to properly implement a string, it's about the implications of having a not-quite-copy constructor (which the first part of your answer addresses). – Munich
@Barry: OK, I guessed wrong about what you were doing this for. So, deleting the middle section. I think the last one, about taking charge of copying in general, is relevant still; isn't it? – Buttonhole
