Is it safe to use `vec.push_back(vec.back());`? [duplicate]
Asked Answered
J

3

9

I noticed that the parameter of vector<T>::push_back is const T&. I know that a reallocation may take place when I call push_back, but vector<T>::back returns a reference. So is it safe?

std::vector<std::string> s{"some string"s};
s.push_back(s.back());
Joyce answered 7/9 at 9:18 Comment(4)
The strong exception guarantee says that if the insert fails (by an exception), the function has no effect. This implies that the vector cannot reallocate first, and fail to copy later.Tripinnate
@Tripinnate What paragraph in the standard is that? I'd like to read up on it, but I couldn't find it.Schoolgirl
@Ted - I don't have a paragraph for push_back, but the committee is clear that s.emplace(s.begin(), s.back()) must work.Tripinnate
@BoP, then I guess Howard Hinnant could have the final word here.Pycnidium
I
7

Yes, it is safe, assuming that the vector is not initially empty. push_back must be implemented in such a way that it works, because there is no precondition placed on push_back(const T&) to a different effect.

However, it is important to note that

s.push_back(std::move(s.back()));

would not be safe. The standard library may assume that a rvalue reference parameter of a library call always is the only reference to the object it is bound to. (This isn't specific to push_back, but stated in generality in the library specification.)


The standard library implementation can (and does) make this work by copy-constructing the new element first into the new allocation before relocating the old elements of the vector.

See e.g. here for libc++, here for Microsoft's implementation or here for libstdc++'s implementation.

There is no problem with exception guarantees. If the copy constructor for the new element throws, then the old allocation is still in place with all of its original contents.

Inscrutable answered 7/9 at 9:59 Comment(14)
It must be? I can't locate this verbiage myself, so I'd be obliged if you could point to it.Banff
@StoryTeller-UnslanderMonica I added a reason. There is no precondition saying anything else. Hence, it must be supported.Inscrutable
The most trivial implementation required to provide the strong exception guarantee also implies that the copy constructor most have succeeded before elements may be moved out of the old buffer. The fallacy here is though that if the copy constructor was also noexcept, then this order of events wasn't guaranteed in the first place!Keddah
Both overloads are lumped here with no precondition timsong-cpp.github.io/cppwp/n4950/vector#modifiers - I don't follow your argumentBanff
@StoryTeller-UnslanderMonica The rvalue reference situation is handled in generality for the whole library in eel.is/c++draft/library#res.on.arguments-1.3.Inscrutable
@Keddah I don't follow what how this relates to my answer.Inscrutable
@Inscrutable the only reason the current standard implementations are doing it in this order is due to the strong exception guarantees in a specific combinations of exception specifications encouraging this order. Nothing prohibits a different order of events. But passing a reference to a function that invalidates it is still UB after all, and the fact that it (currently) works is just empiric, but not backed by the specification.Keddah
@Keddah "But passing a reference to a function that invalidates it is still UB after all": Why should that be the case? The library functions are specified in terms of preconditions, i.e. valid initial states, and then certain effects or postconditions determining resulting states. In the initial state the reference is not invalidated and if the function is specified to invalidate references, then this ought to describe the final state, not any intermediate one. It wouldn't even be clear how the preconditions should interact with these unspecified intermediate states.Inscrutable
@Inscrutable right, the interaction with intermediate states is undefined. But that's actually what's biting you here, you are making assumptions about the precise intermediate state at the time the copy constructor is invoked, even though that was implementation defined. All you do know: Reallocation occurs as a side effect, so your references break. And the condition for a signature that takes a reference is, that unless specified otherwise, the reference remains valid until the invocation returns.Keddah
There's a requirement for ALL arguments to any C++ function anywhere, that unless given permission otherwise, arguments must remain valid for the duration of the call. To claim that reference validity only applies at the moment before the call opens the door for user-provided functions (like swaps, move constructors, hash functions) to have access to the same object (maybe it's a global) and do horrible horrible things to that object in the middle of the standard library trying to do its work.Niche
@BenVoigt Where is that requirement supposed to be? The standard currently has no wording to that effect that I am aware of. For what you describe as a problem, sure I agree and the standard should probably have clearer requirements to avoid this, but it shouldn't apply to the intrinsic behavior of standard library functions themselves, only to user-supplied functions. An intrinsic precondition on a library function should be stated in its specification. One shouldn't have to recursively consider effects and postconditions of library functions to determine their actual preconditions.Inscrutable
@BenVoigt And any specification issue aside, it seems pretty clear to me that the intent is to allow OP's code. Every container member function that doesn't permit references/iterators into the container itself has this stated as precondition explicitly. Discussion of LWG issues and elsewhere by the committee also makes it clear that, absent specific preconditions stating otherwise, the library is supposed to accommodate these use cases.Inscrutable
@user17732522: It's stated right in eel.is/c++draft/library#res.on.arguments-1.1 That requirement is not merely a precondition, it doesn't go away in the middle of the function call. (It establishes a precondition of course, since it does apply at the beginning of the call. But it is more than a precondition, and applies throughout the call)Niche
@BenVoigt The value of an argument can't change. And it doesn't make much sense applied to a reference at all, since they do not have a value. Also that is more than vague (e.g. "intended use"). I am not sure what exactly the clause is supposed to prohibit. I remain of the opinion that the standard is not intended to blanket prohibit passing references into library objects to library member functions, even if the function may destroy the referenced object during its execution.Inscrutable
K
0

It's safe if you did a s.reserve(s.size() + 1) before so that a reallocation is guaranteed not to happen.

It's undefined behavior if you don't know the current capacity and thus can't control if there will be a relocation and therefore also iterator as well as reference and value invalidation at an unspecified point in time.

Why this is a problem, can be simply traced down to the lifecycle of the object referenced by s.back(). Unless specified explicitly otherwise, you must ensure that the lifecycle of an object exceeds that of the last used reference. You did not get any guarantees beyond that, and therefore you were required to keep the argument of s.push_back(const T&) alive until push_back returns control to you, and you can not make any assumption about when or why push_back would access the object.

There are edge cases (such as: non-trivial but noexcept move constructor, throwing copy constructor) where other constraints such as the strong exception guarantees don't permit a use-after-free due to the specific order of events in the implementation, but those constraints are easily lifted by as little as e.g. a noexcept copy constructor which permits the worst case of move before copy again. Also, that's still all implementation defined.

s.push_back(auto(s.back())); on the other hand is safe again (even though it's using another signature, and is using both move and copy constructor with a temporary).

s.push_back(std::ref(auto(s.back()))); ensures that the arguments lifecycle is correct, even though you generally shouldn't use it, as it requires 2 invocations of the copy constructor.

Keddah answered 7/9 at 9:29 Comment(6)
Any source for the UB claim? I think the strong exception guarantee requires the reference to survive until copy ctor for n+1th element finishes in case it would throw. Might be UB for noexcept copy ctors? Nice workaround though, puts any validity questions to rest.Zephan
@Zephan it works in practical application for most implementations, due to how the strong exception guarantee is most trivially fulfilled by alloc,insert,move_range,dealloc. However, you are still passing an argument by reference that is invalidated in middle of the call which is UB after allKeddah
A more general form of the workaround would be s.push_back(auto(s.back())); - useful in generic code. Also, worth noting that the workaround ends up calling push_back(T&&) - so for a type like string we aren't pessimizing too much.Banff
When calling a function I'm responsible for the preconditions holding before the call. I cannot be held responsible for what the library does during the call. There are only pre-conditions and post-conditions.Tripinnate
Also, if you check emplace it specifically mentions that the value can be part of the container. (Though in a "non-normative" comment :-)Tripinnate
@Tripinnate yes, "non-normative". It wouldn't cause any real issues to make it normative (only undesirable impact is inevitably requiring an additional, impossible to eliminate temporary for insertion/emplace at middle point). But right now, the observable state of the container or element hold by the container at the time of insertion is yet unspecified. The implementations are already acting as if it had been normative.Keddah
S
-2

24.2.2.2 Containers [container.reqmts]

  1. Unless otherwise specified (either explicitly or by defining a function in terms of other functions), invoking a container member function or passing a container as an argument to a library function shall not invalidate iterators to, or change the values of, objects within that container.

It would be a mistake to think that since it's not allowed to invalidate iterators to objects within the container it can not invalidate references to objects within the container.

It's however easy to create an iterator type that would not be invalidated while a reference would be:

struct iterator {
    T& operator*() { return the_vector[the_index]; }

    std::vector<T>& the_vector;
    std::size_t the_index;
};

No sane implementation does this, but the standard does not forbid it.

The construct s.push_back(s.back()); therefore has undefined behavior.

Schoolgirl answered 7/9 at 9:59 Comment(9)
good point-> summary: Always check the specific rules for the container and operation you are working with to avoid undefined behavior.Copolymer
I don't see how you get to the UB conclusion or why the quote is relevant?Inscrutable
@Inscrutable The quote is what restrictions there are on container member functions with regards to invalidations. I mentioned it because it could be misinterpreted, but you are correct in that is has no bearing on reference invalidations since that behavior is not defined by the standard. Hence, UB.Schoolgirl
@TedLyngmo "or change the values of, objects within that container" implies that references also can't be invalidated unless stated otherwise. If the value of an object in the container is unchanged, then a reference to it can still be used.Inscrutable
@Inscrutable I don't think moving an object counts as changing its value. Do you?Schoolgirl
@TedLyngmo "Moving" in the sense of a move constructor/assignment must be assumed to change the value of the original object. If you mean "move" in the sense of relocation into a new allocation via copy construction, then one might argue that the destructor of the original object may not actually change the value, but then that would be a defect to not include ending the lifetime in the quoted wording.Inscrutable
@Inscrutable You are proposing that "invalidate iterators to, or" is redundant in the sentence "shall not invalidate iterators to, or change the values of, objects". I do not think that it is.Schoolgirl
This line of reasoning (about whether references can be invalidated when iterators aren't) is not applicable to the stated situation where iterators ARE invalidated.Niche
@BenVoigt I mentioned it because it could be misinterpreted, but you are correct in that is has no bearing on reference invalidations since that behavior is not defined by the standard. Hence, UBSchoolgirl

© 2022 - 2024 — McMap. All rights reserved.