Why unique_ptr and shared_ptr do not invalidate the pointer they are constructed from?
Asked Answered
T

2

4

A note: this is an API design question, riding on the design of the constructors of unique_ptr and share_ptr for the sake of the question, but not aiming to propose any change to their current specifications.


Though it would usually be advisable to use make_unique and make_shared, both unique_ptr and shared_ptr can be constructed from a raw pointer.

Both get the pointer by value and copy it. Both allow (i.e. in the sense of: do not prevent) a continuance usage of the original pointer passed to them in the constructor.

The following code compiles and results with double free:

int* ptr = new int(9);
std::unique_ptr<int> p { ptr };
// we forgot that ptr is already being managed
delete ptr;

Both unique_ptr and shared_ptr could prevent the above if their relevant constructors would expect to get the raw pointer as an rvalue, e.g. for unique_ptr:

template<typename T>
class unique_ptr {
  T* ptr;
public:
  unique_ptr(T*&& p) : ptr{p} {
    p = nullptr; // invalidate the original pointer passed
  }
  // ...

Thus, the original code would not compile as an lvalue cannot bind to an rvalue, but using std::move the code compiles, while being more verbose and more secured:

int* ptr = new int(9);
std::unique_ptr<int> p { std::move(ptr) };
if (!ptr) {
  // we are here, since ptr was invalidated
}

It is clear that there can be dozens of other bugs a user can do with smart pointers. The commonly used argument of you should know how to properly use the tools provided by the language, and C++ is not designed to watch over you etc.

But still, it seems that there could have been an option for preventing this simple bug and to encourage usage of make_shared and make_unique. And even before make_unique was added in C++14, there is still always the option of direct allocation without a pointer variable, as:

auto ptr = std::unique_ptr<int>(new int(7));

It seems that requesting rvalue reference to a pointer as the constructor parameter could add a bit of an extra safety. Moreover, the semantics of getting rvalue seems to be more accurate as we take ownership of the pointer that is passed.

Coming to the question of why didn't the standard take this more secured approach?


A possible reason might be that the approach suggested above would prevent creating a unique_ptr from const pointers, i.e. the following code would fail to compile with the proposed approach:

int* const ptr = new int(9);
auto p = std::unique_ptr { std::move(ptr) }; // cannot bind `const rvalue` to `rvalue`

But this seems to be a rare scenario worth neglecting, I believe.

Alternatively, in case the need to support initialization from a const pointer is a strong argument against the proposed approach, then a smaller step could still be achieved with:

unique_ptr(T* const&& p) : ptr{p} {
    // ...without invalidating p, but still better semantics?
}
Tab answered 4/3, 2020 at 21:33 Comment(17)
Because you should not use new at all, use std::make_unique or std::make_sharedChao
The general case is that there are other pointers with equal values. How do you set those to 'nullptr'?Nationality
@Nationality this would not solve a problem with other pointers that point to the same address. In theory this is the general case, in practice it is far from being the common case. Anyhow the semantics of requiring the user to call std::move in case passing a pointer variable makes it much more verbose that there is a pass of ownershipTab
@Chao I'd say that the proposed approach would have encouraged even more the use of std::make_unique or std::make_shared, but since there is already a constructor that allows creation from a raw pointer maybe it should have been defined differently - which is what the question is aboutTab
The common case is p ceases to exist right afterNationality
Maybe Howard Hinnant could weight in directly if he happens to see this?Lovato
You can read about the first mention of unique_ptr (at the time called move_ptr) here N1377Lovato
For what it's worth (i.e., not much) my guess is either "we didn't think of that", or "we wanted to make it as easy as possible to convert all of your auto_ptr to unique_ptr"Lovato
p = nullptr is setting p to a known value, not invalidation. It is still UB to delete an invalidated pointerVikki
"Don't pay for what you don't use" principle, setting old pointer to 0 is a wasted cycleVikki
@Vikki then you can still go for the 2nd version that gets T* const&& p and skip the invalidation, just to better express the move of ownership semanticsTab
@AmirKirsh it's not "move of ownership", it's "assignment of ownership". A plain pointer by itself owns nothing.Bibcock
@Kit I tend to agree. This can be a good answer as it is I think. Without the need to argue on the "more secure" part.Tab
Referencing the owned object via a copy of the raw pointer was never viewed as being wrong, or really even dangerous. Sometimes it is even beneficial: herbsutter.com/2013/06/05/…Claudetteclaudia
@HowardHinnant agreed, but that's not the question. One could still get the copy of the raw pointer by calling get() which I argue is more verbose than using the original pointer. It just seems to me that semantically passing a raw pointer to unique_ptr should have been saying that this raw pointer cannot (or shouldn't) be used directly anymore, unless retrieved back from the unique_ptr via get. Such a change cannot be done now of course, but the question is about API design, i.e. what are the reasons it was not done like that.Tab
unique_ptr was designed to be as close to a drop in replacement for auto_ptr as possible but without the "dangerous parts" that auto_ptr had (move with copy syntax). open-std.org/jtc1/sc22/wg21/docs/papers/2005/…Claudetteclaudia
@HowardHinnant auto_ptr couldn't of course be implemented with that approach. By the way, going with unique_ptr(T* const&& p) : ptr{p} could have been a nice example for rvalue reference to const.Tab
F
1

I think the answer is simple: zero overhead. This change isn't needed for unique_ptr to be functional, so the standard doesn't require it. If you think this improves safety enough to worth it, you can ask your implementation to add it (maybe under a special compilation flag).

BTW, I'd expect static analyzers to know enough to warn against this code pattern.

Faze answered 8/3, 2020 at 10:57 Comment(1)
The proposed version unique_ptr(T* const&& p) : ptr{p} without invalidating the original pointer, is still zero overhead but adds some degree of safety, not allowing to pass lvalue pointer. Anyhow this is not a proposal for changing current implementation but rather an API design question. It seems that this could have been a good usage of const rvalue reference.Tab
B
4

As long as they both have a get() method, invalidating the original pointer is not a "more secured", but a more confusing approach.

In the way raw pointers are used in C++, they don't represent any ownership concept by themselves and don't define the object lifetime. Anyone who uses raw pointers in C++ needs to be aware that they are only valid while the object exists (whether the object lifetime is enforced by smart pointers or by the logic of the program). Creating a smart pointer from a raw pointer does not "transfer" ownership of the object, but "assigns" it.

This distinction could be made clear in the following example:

std::unique_ptr<int> ptr1 = std::make_unique<int>(1);
int* ptr2 = ptr1.get();
// ...
// somewhere later:
std::unique_ptr<int> ptr3(ptr2);
// or std::unique_ptr<int> ptr3(std::move(ptr2));

In this case, the ownership of the int object is not transferred to the ptr3. It is erroneously assigned to it without releasing the ownership by ptr1. A programmer needs to be aware where the pointer passed to the std::unique_ptr constructor comes from, and how the ownership of the object has been enforced so far. An assurance by the library that it will invalidate this particular pointer variable may give the programmer a false sense of security, but does not provide any real safety.

Bibcock answered 4/3, 2020 at 21:47 Comment(2)
And even if they didn't have get, there's always &*ptr and ptr::operator->Nationality
So this is the argument of: users have all sorts of ways to write bugs, they should know how to properly use the tools provided by the language, C++ is not designed to watch over youTab
F
1

I think the answer is simple: zero overhead. This change isn't needed for unique_ptr to be functional, so the standard doesn't require it. If you think this improves safety enough to worth it, you can ask your implementation to add it (maybe under a special compilation flag).

BTW, I'd expect static analyzers to know enough to warn against this code pattern.

Faze answered 8/3, 2020 at 10:57 Comment(1)
The proposed version unique_ptr(T* const&& p) : ptr{p} without invalidating the original pointer, is still zero overhead but adds some degree of safety, not allowing to pass lvalue pointer. Anyhow this is not a proposal for changing current implementation but rather an API design question. It seems that this could have been a good usage of const rvalue reference.Tab

© 2022 - 2024 — McMap. All rights reserved.