When should I return by value, as opposed to returning a unique pointer?

3

9

What I'm wondering is: how does returning a Cat by value actually differ from returning a std::unique_ptr<Cat>, in terms of passing them around, memory management, and using them in practice?

Memory-management wise, aren't they the same? Both an object returned by value and an object wrapped in a unique_ptr will have their destructors run once they go out of scope.

So, how would you compare both pieces of code:

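#include <memory>
#include <string>

// Option 1: return by value.
Cat catFactory(std::string catName) {
    return Cat(catName);
}

// Option 2: return a unique pointer (an alternative to option 1, not an overload).
std::unique_ptr<Cat> catFactory(std::string catName) {
    return std::make_unique<Cat>(catName);  // C++14; unique_ptr's type cannot be deduced from new Cat(...)
}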
Teahouse answered 8/6, 2017 at 23:58 Comment(3)
Value semantics are way easier to understand and read. If Cat can be move-constructed there is no reason to make a pointer. Also, unique_ptr cannot be copied. - Designer
To add, unique_ptr values can be released from the pointer and used as a dynamically allocated object. Value is generally the way to go, as @HenriMenke pointed out; however, using a unique_ptr allows you to extract the object for use beyond the scope of the ptr. - Wexler
Thanks for the input, guys! Yeah, not implementing a move constructor and move assignment could impose significant overhead from unnecessarily copying the object. unique_ptr's release API also seems like a legit differentiator. - Teahouse
17

Returning by value should be considered the default. (*) Deviating from the default practice, by returning std::unique_ptr<Cat>, should require justification.

There are three main reasons to return a pointer:

  1. Polymorphism. This is the best reason to return std::unique_ptr<Cat> instead of Cat: that you might actually be creating an object of a type derived from Cat. If you need this polymorphism, you absolutely need to return a pointer of some sort. This is why factory functions usually return pointers.

  2. Cat cannot be moved cheaply or cannot be moved at all. "Inherently" unmovable types are rare; you should usually try to fix Cat by making it cheaply movable. But of course Cat could be a type owned by someone else, to which you cannot add a move constructor (or perhaps even a copy constructor). In that case, there is not much you can do other than use unique_ptr (and complain to the owner).

  3. The function has the potential to fail and be unable to construct any valid Cat. In that case, one possibility is return by value anyway but throw an exception if the Cat cannot be constructed; the other, in C++11/C++14, is to make the function return std::unique_ptr<Cat> and have it return a null pointer when no Cat can be constructed. In C++17, however, you should start returning std::optional<Cat> instead of std::unique_ptr<Cat> in that case, to avoid unnecessary heap allocation.
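To illustrate reason 3, here is a minimal sketch of the std::optional approach, assuming C++17. The Cat struct, the tryMakeCat name, and the "empty name means failure" rule are all made up for the example; they are not from the question.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical Cat type, standing in for the one in the question.
struct Cat {
    std::string name;
};

// Hypothetical failure rule for illustration: an empty name cannot produce a Cat.
std::optional<Cat> tryMakeCat(std::string catName) {
    if (catName.empty()) {
        return std::nullopt;         // failure, and no heap allocation is involved
    }
    return Cat{std::move(catName)};  // success, the Cat is stored inside the optional
}

// Usage: the caller checks for a value before dereferencing.
void exampleUsage() {
    std::optional<Cat> cat = tryMakeCat("Tom");
    assert(cat.has_value());
    assert(cat->name == "Tom");
    assert(!tryMakeCat("").has_value());
}
```

Compared with returning a null std::unique_ptr<Cat>, the caller still has to check for the empty state, but the Cat is stored inside the returned optional rather than in a separate heap allocation.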

(*) This also applies to passing objects when the function being called needs its own copy of the value, e.g., a constructor that will initialize a class member from one of its arguments. Accept the object by value and move.
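A small sketch of the "accept by value and move" pattern from the footnote. The Cat class here is hypothetical and only holds a name.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical Cat that keeps its own copy of the name.
class Cat {
public:
    // Take the argument by value: the caller decides whether to copy or move into it.
    explicit Cat(std::string name) : name_(std::move(name)) {}

    const std::string& name() const { return name_; }

private:
    std::string name_;
};

void exampleUsage() {
    std::string n = "Tom";
    Cat a(n);             // n is copied into the parameter, then moved into the member
    Cat b(std::move(n));  // n is moved into the parameter, then moved into the member
    assert(a.name() == "Tom");
    assert(b.name() == "Tom");
}
```

With a single constructor, lvalue arguments cost one copy plus one move and rvalue arguments cost two moves, which is usually acceptable when moving is cheap.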

Krein answered 9/6, 2017 at 0:8 Comment(4)
Just to note, "Both passing and returning by value should be considered the default" isn't really true. Passing by const reference is often cheaper and safer than by value. - Wexler
@Wexler sorry, you're right, I meant to say passing by value in the case where the function needs its own copy of the value. I'll edit. - Krein
@Brian I think this answers my question really well (as far as I can tell, given that I don't know what I don't know :D). So in general, when designing some kind of API for a project maintained by a team of programmers, do you think I should stick to std::unique_ptr given the reasons you've mentioned? - Teahouse
@user1113314: That sounds like the opposite of what this post says. In general, when designing some kind of API, you should use the general mechanism: return by value, unless returning by value is a very bad idea. - Sourdine
5

Generally, return by value.

Exceptions to this rule:

  1. The Cat needs to exist on the heap, so as to outlast its creating code... but in this case perhaps it shouldn't really be a unique_ptr that's returned, but rather a shared_ptr.
  2. You're not actually constructing the Cat but rather getting access to something which can be interpreted as a Cat; in this case, again, you probably don't want a unique pointer but a regular one (or a unique pointer with a custom deleter).
  3. Polymorphism - if it's a factory and Cat is one of its products, you can probably also make a Dog and a Horse, all being Animals, so you'll return a pointer to a base class, e.g. a unique_ptr<Animal>. That's definitely a case where you would use a unique pointer.
  4. Dark voodoo in your copy, assignment and/or move constructors which makes it important to always make sure you only poke your Cat from afar.
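Exception 3 (polymorphism) could look roughly like the sketch below. The Animal hierarchy, the makeAnimal name, and the string-based selection are assumptions made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical hierarchy, invented for this example.
struct Animal {
    virtual ~Animal() = default;  // virtual destructor so deletion through Animal* is safe
    virtual std::string sound() const = 0;
};

struct Cat : Animal {
    std::string sound() const override { return "meow"; }
};

struct Dog : Animal {
    std::string sound() const override { return "woof"; }
};

// The concrete type is only known at run time, so the factory returns
// an owning pointer to the base class. Unknown kinds throw instead of
// returning nullptr.
std::unique_ptr<Animal> makeAnimal(const std::string& kind) {
    if (kind == "cat") return std::make_unique<Cat>();
    if (kind == "dog") return std::make_unique<Dog>();
    throw std::invalid_argument("unknown animal kind: " + kind);
}
```

Returning Animal by value here would not even compile, since Animal is abstract; with a concrete base it would slice off the derived part, which is why a pointer is needed in this case.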

I disagree with @Brian's answer regarding two of the exceptions he suggests:

  • I would suggest not using a pointer return type just to be able to indicate failure by returning nullptr. Failing to return a valid value is what exceptions are for, and even if you want to avoid them, I'd suggest returning a std::expected (C++23) or a std::optional (C++17 and later). Or just throw an exception on failure if you're allowed to do that.
  • You usually don't need a move constructor for return value optimization to kick in - so the lack of a move constructor should not be a reason to return a pointer.
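As a sketch of that last point, assuming a C++17 compiler (where copy elision is guaranteed when returning a prvalue): the hypothetical PinnedCat type below has no usable copy or move constructor, yet it can still be returned by value.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type that can be neither copied nor moved.
struct PinnedCat {
    explicit PinnedCat(std::string n) : name(std::move(n)) {}

    // Deleting the copy operations also suppresses the implicit move operations.
    PinnedCat(const PinnedCat&) = delete;
    PinnedCat& operator=(const PinnedCat&) = delete;

    std::string name;
};

// Returning a prvalue: in C++17 the object is constructed directly in the
// caller's storage, so no copy or move constructor is required.
PinnedCat makePinnedCat(std::string catName) {
    return PinnedCat(std::move(catName));
}

void exampleUsage() {
    PinnedCat cat = makePinnedCat("Tom");
    assert(cat.name == "Tom");
}
```

This only covers returning a temporary directly. Returning a named local variable (NRVO) is an optional optimization, and it still requires an accessible copy or move constructor even when the copy is elided; before C++17 the same is true for the temporary case.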
Zillion answered 9/6, 2017 at 0:32 Comment(3)
Thank you for your perspective. Could you just give an example of what you mean by point 2? I'm also wondering about the compiler's move optimization: if we don't need to implement move semantics for the optimization, why have them in the first place? Maybe @Brian could fill us in? - Teahouse
@user1113314: move semantics are useful for passing parameters around and for using return values from functions, aside from return value optimization. Return value optimization doesn't require move constructors, because return value optimization bypasses the move. - Sourdine
@user1113314: Sometimes you can't say Cat result{}; even within your factory. Maybe you, I dunno, only deserialize some byte stream and you only know at run time this constitutes a cat. - Zillion
0

Memory-management wise they are completely different.

Sure, these days the actual functional difference between these trivial examples is pretty slim, assuming move semantics can make that by-value return cheap (in the second example they're responsible for moving the pointer, instead). And, sure, both objects will be destroyed at the same time if you just let everything go out of scope right away.

But the code is far less simple with the dynamic allocation, and adds a "why?" factor.

You can't really reason about the difference any further without inspecting how the result is going to be used after the function returns. Then all the typical considerations about automatic vs dynamic memory allocation come back into play.

In conclusion, there's really no generic, catch-all way to tell you whether a factory should dynamically allocate or return by value. However, personally I would prefer returning by value for simplicity (unless you know that you can't), especially if your object types are typically moveable (which likely won't have much effect in the function itself due to RVO, but may help you at the callsite).

Hostel answered 9/6, 2017 at 0:8 Comment(2)
I appreciate the effort you've put into writing the answer but at the same time I don't think you're actually saying much with it. I would appreciate if you could provide some kind of examples that would help discriminate the two options in some way. - Teahouse
@user1113314: Showing you the difference between dynamic and non-dynamic allocation is way out of scope of a Stack Overflow question. Turn to the relevant pages in your C++ book, instead. - Hostel
