C++ - passing references to std::shared_ptr or boost::shared_ptr
Asked Answered
W

17

116

If I have a function that needs to work with a shared_ptr, wouldn't it be more efficient to pass it a reference to it (so to avoid copying the shared_ptr object)? What are the possible bad side effects? I envision two possible cases:

1) inside the function a copy is made of the argument, like in

ClassA::take_copy_of_sp(boost::shared_ptr<foo> &sp)  
{  
     ...  
     m_sp_member=sp; //This will copy the object, incrementing refcount  
     ...  
}  

2) inside the function the argument is only used, like in

Class::only_work_with_sp(boost::shared_ptr<foo> &sp) //Again, no copy here  
{    
    ...  
    sp->do_something();  
    ...  
}  

I can't see in both cases a good reason to pass the boost::shared_ptr<foo> by value instead of by reference. Passing by value would only "temporarily" increment the reference count due to the copying, and then decrement it when exiting the function scope. Am I overlooking something?

Just to clarify, after reading several answers: I perfectly agree on the premature-optimization concerns, and I always try to first-profile-then-work-on-the-hotspots. My question was more from a purely technical code-point-of-view, if you know what I mean.

Welladvised answered 29/11, 2008 at 14:27 Comment(1)
I don't know if you can modify the tags of your question, but please try to add a boost tag in there. I tried looking for this question but I couldn't find any because I looked for boost and smart-pointer tags. So I found your question just after composing my own questionBibliotherapy
E
115

The point of a distinct shared_ptr instance is to guarantee (as far as possible) that as long as this shared_ptr is in scope, the object it points to will still exist, because its reference count will be at least 1.

Class::only_work_with_sp(boost::shared_ptr<foo> sp)
{
    // sp points to an object that cannot be destroyed during this function
}

So by using a reference to a shared_ptr, you disable that guarantee. So in your second case:

Class::only_work_with_sp(boost::shared_ptr<foo> &sp) //Again, no copy here  
{    
    ...  
    sp->do_something();  
    ...  
}

How do you know that sp->do_something() will not blow up due to a null pointer?

It all depends what is in those '...' sections of the code. What if you call something during the first '...' that has the side-effect (somewhere in another part of the code) of clearing a shared_ptr to that same object? And what if it happens to be the only remaining distinct shared_ptr to that object? Bye bye object, just where you're about to try and use it.

So there are two ways to answer that question:

  1. Examine the source of your entire program very carefully until you are sure the object won't die during the function body.

  2. Change the parameter back to be a distinct object instead of a reference.

General bit of advice that applies here: don't bother making risky changes to your code for the sake of performance until you've timed your product in a realistic situation in a profiler and conclusively measured that the change you want to make will make a significant difference to performance.

Update for commenter JQ

Here's a contrived example. It's deliberately simple, so the mistake will be obvious. In real examples, the mistake is not so obvious because it is hidden in layers of real detail.

We have a function that will send a message somewhere. It may be a large message so rather than using a std::string that likely gets copied as it is passed around to multiple places, we use a shared_ptr to a string:

void send_message(std::shared_ptr<std::string> msg)
{
    std::cout << (*msg.get()) << std::endl;
}

(We just "send" it to the console for this example).

Now we want to add a facility to remember the previous message. We want the following behaviour: a variable must exist that contains the most recently sent message, but while a message is currently being sent then there must be no previous message (the variable should be reset before sending). So we declare the new variable:

std::shared_ptr<std::string> previous_message;

Then we amend our function according to the rules we specified:

void send_message(std::shared_ptr<std::string> msg)
{
    previous_message = 0;
    std::cout << *msg << std::endl;
    previous_message = msg;
}

So, before we start sending we discard the current previous message, and then after the send is complete we can store the new previous message. All good. Here's some test code:

send_message(std::shared_ptr<std::string>(new std::string("Hi")));
send_message(previous_message);

And as expected, this prints Hi! twice.

Now along comes Mr Maintainer, who looks at the code and thinks: Hey, that parameter to send_message is a shared_ptr:

void send_message(std::shared_ptr<std::string> msg)

Obviously that can be changed to:

void send_message(const std::shared_ptr<std::string> &msg)

Think of the performance enhancement this will bring! (Never mind that we're about to send a typically large message over some channel, so the performance enhancement will be so small as to be unmeasureable).

But the real problem is that now the test code will exhibit undefined behaviour (in Visual C++ 2010 debug builds, it crashes).

Mr Maintainer is surprised by this, but adds a defensive check to send_message in an attempt to stop the problem happening:

void send_message(const std::shared_ptr<std::string> &msg)
{
    if (msg == 0)
        return;

But of course it still goes ahead and crashes, because msg is never null when send_message is called.

As I say, with all the code so close together in a trivial example, it's easy to find the mistake. But in real programs, with more complex relationships between mutable objects that hold pointers to each other, it is easy to make the mistake, and hard to construct the necessary test cases to detect the mistake.

The easy solution, where you want a function to be able to rely on a shared_ptr continuing to be non-null throughout, is for the function to allocate its own true shared_ptr, rather than relying on a reference to an existing shared_ptr.

The downside is that copied a shared_ptr is not free: even "lock-free" implementations have to use an interlocked operation to honour threading guarantees. So there may be situations where a program can be significantly sped up by changing a shared_ptr into a shared_ptr &. But it this is not a change that can be safely made to all programs. It changes the logical meaning of the program.

Note that a similar bug would occur if we used std::string throughout instead of std::shared_ptr<std::string>, and instead of:

previous_message = 0;

to clear the message, we said:

previous_message.clear();

Then the symptom would be the accidental sending of an empty message, instead of undefined behaviour. The cost of an extra copy of a very large string may be a lot more significant than the cost of copying a shared_ptr, so the trade-off may be different.

Eyestalk answered 30/11, 2008 at 10:38 Comment(36)
The shared_ptr that is passed in already lives in a scope, at the call site. You might be able to create an elaborate scenario where the code in this question would blow up due to a dangling pointer, but then I suppose you have bigger problems than the reference parameter!Valuer
It may be stored in a member. You may call something that happens to clear that member. The whole point of smart_ptr is to avoid having to coordinate lifetimes in hierarchies or scopes that nest around the call stack, so that's why it's best to assume that lifetimes don't do that in such programs.Eyestalk
Earwicker: Your scenario is of course real, but it my personal experience not likely. Perhaps we have just been working in different code bases. :) Good thing is, this page contains both viewpoints now ;)Valuer
It's not really my viewpoint though! If you think what I'm saying is something specific to do with my code, you may not have understood me. I'm talking about an unavoidable implication of the reason shared_ptr exists in the first place: many object lifetimes are not simply related to function calls.Eyestalk
Magnus Hoff had the very CORRECT view. In Earwiker's 1st point, it's ridiculous, because if you can't be sure the object is there during function body, how can you be sure the object is there before it makes copy for the function ? In this case, the shared pointer you got inside the function body is not trust worthy -- nothing is trustworthy.Hispanic
@JQ - I've added a lengthy update to hopefully make the point more clearly.Eyestalk
Sorry, that should be @Hispanic with a dot.Eyestalk
The example is still a bit unclear. Why is there need to reset before calling a function? Perhaps it is a precondition of that function, that the previous value is destroyed and it invokes undefined behavior otherwise?Environs
@DanielEarwicker Hi Daniel, thanks for updating the comments for me. I took me looog time to understand around 50%(?) what you are trying to get at. But I found the example not convincing. Reason is: if the user is using send_message() and at the same time he can access to previous_message, he should know what the function does to the variable. If he's not aware, it's a bad design of the function itself.Hispanic
@Environs - perhaps in a real example, the cached object is not a string but some external resource encapsulated in an object, e.g. a file handle or some kind of connection managed by the OS, and we want to tear down the old one before we build a new one.Eyestalk
@JQ - welcome to the real world! :) It's full of APIs that make unnecessary use of mutable state variables. The C function strtok is a nasty old example. The point of the question was "What are the possible bad side-effects" of changing shared_ptr<T> into const shared_ptr<T> &. And in real code, where classes have mutable members, and some member functions depend on those members, and other member functions give access to them, and there are complex networks of objects that hold references to each other, then there are many opportunities for this kind of problem to occur.Eyestalk
So if you pass a member variable by const reference to a function that modifies that member variable before accessing the parameter, your parameter will be modified unexpectedly. This applies to anything, not just shared_ptr. Does this mean the rule of thumb should be "pass all objects by value rather than const reference if they are cheap to copy"?Marilynnmarimba
@Marilynnmarimba - that rule can't be right, because often you want to share references to objects by design, so the fact that it might be cheap to copy doesn't influence your decision. If you are passed a const T& can you assume there will be no changes to its state until your function exits? If you can see that your function never causes any T to be modified, then you're perfectly safe. If you function directly modifies some T (perhaps by calling some other function that does the modification) then yes, you may have a problem waiting to happen. (cont...)Eyestalk
However, the purpose of shared_ptr<T> v is to serve as a better T *v, and what is its specific advantage? That it stops an object from being destroyed prematurely. Change it again to shared_ptr<T> &v and you discard that advantage. The whole point of shared_ptr is that copying it is cheap enough for you to depend on it widely, so that you don't have to be responsible for analysing every aspect of your object's lifetimes. It's for situations where you need sharing by design, and where you have complex overlapping object lifetimes, so the analysis-by-hand would be fiendishly difficult.Eyestalk
It depends what you consider cheap. Copying a shared_ptr requires interlocked reference count modifications, which to me are relatively expensive. As such, I have the rule that shared_ptrs are always passed by const reference, and stored as copies. The only case where this could be problematic is where one is passed by member to a function that happens to reset the member and then expect the pointer to still be valid afterwards -- and that case is vanishingly unlikely and easy to fix by copying to local variable before passing in, which is often needed for concurrency anyway.Coleslaw
@Coleslaw - "and that case is vanishingly unlikely" - great, then there's no problem! :) Not sure I'd assign it that probability myself. Also I prefer to avoid bugs now rather than fix them later, so "ease of fixing" is good, but not as good as being correct in the first place. Finally (as stated above) I prefer to optimise based on measurement; anything that happens infrequently is cheap, anything that happens a lot is expensive.Eyestalk
Passing parameters usually happens quite frequently. Thus you must want it to be done as fast as possible. :) Copying should be done only in the cases where it is actually required, which in turn should be the minority of cases. But as always, it depends on the task at hand, and how much knowledge of the caller you have.Coleslaw
Mind you, it might be a corollary of another rule I try to follow that makes this usage safe -- shared_ptrs must always be returned by value, never by reference. Your example above gets in trouble because previous_message is accessed directly. The class that defines send_message should know that this is unsafe due to the specific implementation, and refrain. Outside code doesn't know the implementation, so may not realise it's unsafe, but they're protected because they can't access previous_message directly, they can only call get_previous_message, which inherently copies (return by value).Coleslaw
@Coleslaw - as I said, it's "a contrived example. It's deliberately simple, so the mistake will be obvious. In real examples, the mistake is not so obvious because it is hidden in layers of real detail." My major concern here is that people might think they can go into some existing well-tested production code, and "enhance" it by changing shared_ptr into a const shared_ptr &, under the mistaken belief that this won't alter the semantics. The necessary analysis across the entire code base, required to make such a change safely, may be fiendishly hard, depending on the complexity.Eyestalk
True, but it's never safe to make that sort of change without knowing what the knock-on effects are going to be. And that's partly what unit tests are for. :) That still doesn't mean that the default for new code should be "pass by value".Coleslaw
@Coleslaw - Absolutely. One possible interpretation of this question is "would this change have any knock-on effects?" Picking the default for new code is a more open issue. C++ is all about premature optimisation, after all! :) One of my rules is to play very safe until you are in a position to make real measurements about where optimisations are needed... but then, another such rule is that when you find you need shared_ptr, you're wandering outside of C++'s core purpose and into the domain of true GC'd languages.Eyestalk
The bug in the second example has nothing to do with shared_ptr. Replace the std::shared_ptr<std::string> argument with anything else -- an std::string, for example -- and the code still breaks by trashing the msg argument through the global previous_message. The only difference is that the std::shared_ptr<std::string> version will crash instead of giving bad output, and that's because send_message() isn't ensuring that the pointer is valid before using it (which it needs to do regardless of whether or not the shared_ptr is passed by reference).Quiescent
@JoshTownzen - see the last part of the answer, beginning "Note that a similar bug would occur if we used std::string throughout..." which says exactly the same thing. You may have missed the point, which is that a program may initially be absolutely correct, but then become absolutely incorrect if you change shared_ptr into const shared_ptr &.Eyestalk
Just a link to another SO answer that agrees that it is safe to pass a reference to a shared pointer. LINK: passing reference is okay, storing reference is not okayOpalopalesce
@TrevorBoydSmith - thanks for letting me know; I've added a comment to that answer explaining why it's wrong.Eyestalk
This answer does not make sense at all. It applies to situations with other variable types too as the writer added later in an edit. So it neither answers the question nor is a correct point of view. The logic of the code written in this post is itself DEEPLY FLAWED!Marginalia
@Tolga - because it applies to other situations with other variable types too, you conclude that it doesn't apply to this situation? That is flawed logic.Eyestalk
@DanielEarwicker completely agree with all your points and am surprised by the level of opposition. Something that makes your concerns even more relevant is threading, when this becomes involved guarantees about an objects validity become much more important. Good answer.Anastassia
Not long ago, I chased down a very serious bug that was due to passing a reference to a shared pointer. The code was handling an object's state change and when it noticed the state of the object had changed, it removed it from the collection of objects in the previous state and moved it into the collection of objects in the new state. The remove operation destroyed the last shared pointer to the object. The member function had been called on a reference to the shared pointer in the collection. Boom. Daniel Earwicker is right.Faery
The opposition is because people expect the common case to be passing a shared pointer from the local scope of the calling function, and indeed this is often safe to pass by reference. I think what Daniel's trying to say (what people often miss) is the risk of these assumptions being violated by people who maintain the code later. If you can safely assume that everyone who'll ever look at the code is an infallible expert then you can do more or less what you like, but years of coding in commercial environments tends to beat that sort of assumption out of most of us. (^_^)Squiggle
@Squiggle - I suspect most people reader advice like Sutter's, but not very carefully. The short version of his advice is Prefer passing objects by value, *, or &, not by smart pointer. But he adds the warning "Note this assumes the pointer is not aliased. You have to be careful if the smart pointer parameter could be aliased, but in this respect it’s no different than any other aliased object." That warning is exactly my answer! I just go into detail about what aliasing means and why it causes problems [cont...]Eyestalk
[cont...] But once you get that, then it becomes a question of whether you want to take the risk of discovering delightfully amusing and subtle aliasing bugs later, in exchange for "fast" code, or whether you'd prefer stronger guarantees of correctness so you can then measure and invest in optimisation effort (with all the care to avoid aliasing bugs that this entails) where it's really necessary. And I strongly recommend the latter.Eyestalk
If we are too safe, we end up with incredibly slow code.Policy
@MikeWeir yes, in C++. Modern languages with perfect compacting GCs are able to internally use use pure pointers with absolute safety, and allocate/deallocate short-lived objects at comparable speeds to stack allocation. The irony is that C++, by attempting to be minimal and efficient, moves this problem into userland and the result is forcing users to choose between slow and safe vs. fast and unsafe. That's why this is a controversial question. If you are mostly modelling relationships between dynamically-allocated objects, C++ is not optimal. This is the situation for most applications.Eyestalk
Isn't the 'undefined behaviour' claim for const std::shared_ptr<std::string> &msg false because of this? According to that article, constant references to local variables are perfectly fine in C++. So when VS 2010 crashes, that would actually be a compiler bug.Jonquil
@Jonquil - As that page says, "Note this only applies to stack-based references. It doesn’t work for references that are members of objects." in the example above, previous_message is not a local variable, but some long-lived state that persists between calls to send_message. By design, a shared_ptr transitions to nullptr when there are no further real instances of shard_ptr pointing to the same object. A &-ref to a shared_ptr is not an instance and does not keep the pointed-to object alive. Changing a shared_ptr to a const shared_ptr& says "No big deal if this becomes null."Eyestalk
M
119

I found myself disagreeing with the highest-voted answer, so I went looking for expert opinons and here they are. From http://channel9.msdn.com/Shows/Going+Deep/C-and-Beyond-2011-Scott-Andrei-and-Herb-Ask-Us-Anything

Herb Sutter: "when you pass shared_ptrs, copies are expensive"

Scott Meyers: "There's nothing special about shared_ptr when it comes to whether you pass it by value, or pass it by reference. Use exactly the same analysis you use for any other user defined type. People seem to have this perception that shared_ptr somehow solves all management problems, and that because it's small, it's necessarily inexpensive to pass by value. It has to be copied, and there is a cost associated with that... it's expensive to pass it by value, so if I can get away with it with proper semantics in my program, I'm gonna pass it by reference to const or reference instead"

Herb Sutter: "always pass them by reference to const, and very occasionally maybe because you know what you called might modify the thing you got a reference from, maybe then you might pass by value... if you copy them as parameters, oh my goodness you almost never need to bump that reference count because it's being held alive anyway, and you should be passing it by reference, so please do that"

Update: Herb has expanded on this here: http://herbsutter.com/2013/06/05/gotw-91-solution-smart-pointer-parameters/, although the moral of the story is that you shouldn't be passing shared_ptrs at all "unless you want to use or manipulate the smart pointer itself, such as to share or transfer ownership."

Marilynnmarimba answered 13/1, 2012 at 1:48 Comment(16)
Nice find! It's great to see two of the foremost experts on the subject publicly refuting the conventional wisdom on SO.Firmament
"There's nothing special about shared_ptr when it comes to whether you pass it by value, or pass it by reference" -- I really don't agree with this. It IS special. Personally I would rather play it safe, and take the slight performance hit. If there is a particular area of code I need to optimize then sure, I'd look at performance benefits of shared_ptr pass by const ref.Eugenides
It's also interesting to note that, while there was agreement on overuse of shared_ptrs, there wasn't agreement about the pass-by-value-vs.-reference issue.Potman
With std::vector< shared_ptr<T> > you have std::vector::push_back(std::vector< shared_ptr<T> >::value_type &) -- that is, an STL container of shared_ptr accepts reference to shared_ptr in its methods.Hemphill
Scott Meyers: "so if I can get away with it with proper semantics in my program..." i.e. doesn't contradict my answer at all, which points out that figuring out whether changing parameters to const & will affect semantics is only easy in very simple programs.Eyestalk
Herb Sutter: "very occasionally maybe because you know what you called might modify the thing you got a reference from". Again, that little exemption for a minor case, so it doesn't contradict my answer. The question remains: how do you know it's safe to use a const ref? Very easy to prove in a simple program, not so easy in a complex program. But hey, this is C++, and so we favour premature micro-optimisation over almost all other engineering concerns, right?! :)Eyestalk
Having been badly burned by this very recently, I have to disagree with this answer. You can wind up with a long chain of functions calling each other, each taking a reference to a shared pointer passed many functions back. If that innermost function manipulates the collection the shared pointer was found in at the top of the chain, boom, the whole chain implodes. These things can infect a code base and be very difficult to validate as they grow and change. This is a premature micro-optimization.Faery
@David: Do you agree that your issue is not specific to shared_ptr? Should you pass a string (or any other object) by value through a long chain of functions because the innermost one may manipulate the collection it came from?Marilynnmarimba
@Jon: It's not completely specific to shared_ptr. It's specific to particular use cases that have significant overlap with the typical use cases for shared_ptr. One part that is specific to shared_ptr (or other smart pointers) is that programmers don't always think as hard about the objects they're invoking methods on as they do about parameters to those methods. So this can hit you in a blind spot.Faery
@Marilynnmarimba - re: the update on Herb's blog, once again he adds the all-important get-out "Note this assumes the pointer is not aliased." And he doesn't go into what that actually means, or how it might be a problem in a real program. In fact, passing the raw pointer is itself an example of aliasing, it bypasses shared_ptr and leaves the door open for the object to be destroyed while in use. And we can see here what has happened: people just read the headline of the "expert advice". Well, modern debuggers are so nice, so it's really okay that you're going to be spending your evenings in there! :)Eyestalk
-1 Although these "experts" make such claims they are clearly wrong as Daniel Earwicker as already explained. Writing "I found myself disagreeing with the highest-voted answer" when the highest voted answer is clearly a verifiable statement of fact is absurd.Ulpian
I found a very recent comment of Herb Sutter very interesting to this topic. If you have a function where you already know that another shared_ptr dominates the scope of your function call. pass a raw pointer or const reference for performance reasons (and flexibility, because you don't duplicate code when you have a unique_ptr).Dionedionis
In general this is terrible advice. Within the context of this SO question you should stick to best practice which is... don't pre-optimize. Just pass the std::shared_ptr and increment the count. If and only if you find yourself in a situation where performance is being hurt by shared_ptr, not the common case, then you can discuss some advanced, special case programming trick.Penrose
@CameronLowellPalme: I couldn't disagree more. Code is about communication, and passing in a shared_ptr communicates that the function needs the ability to interact with the object's lifetime rather than just with the passed in object. I can't state it better than Sutter did: don't pass a shared_ptr at all "unless you want to use or manipulate the smart pointer itself, such as to share or transfer ownership."Fatling
@JoeGauterin I agree that code is about communication and that it is important to think about the lifetime/contract associated with handing a pointer or data to another function or class. My beef is with being concerned about the cost of shared_ptr. It is this kind of conversation that is why C++ will continue a slow decline into obscurity. It is dominated by minutiae as to what is the absolute 'fastest' way to do x, y, or z. How if you code it this particular way it results in better assembly emitted from the compiler. The answer should be contract focused, not performance focused.Penrose
@CameronLowellPalmer that decline is probably inevitable, as C++ takes micro-optimisation as its "design centre" and yet ironically winds up being slower than modern safe languages for a wide range of common applications. It's entertaining to learn, but there is no commercial sense in using C or C++ except in some niches. Use it it write a JS engine or JVM or CLR, then do everything else in the safe, fast, productive environment. (My most recent use was to allow me to write MS Office extensions in JS+HTML, that would work across all versions of Office).Eyestalk
E
115

The point of a distinct shared_ptr instance is to guarantee (as far as possible) that as long as this shared_ptr is in scope, the object it points to will still exist, because its reference count will be at least 1.

Class::only_work_with_sp(boost::shared_ptr<foo> sp)
{
    // sp points to an object that cannot be destroyed during this function
}

So by using a reference to a shared_ptr, you disable that guarantee. So in your second case:

Class::only_work_with_sp(boost::shared_ptr<foo> &sp) //Again, no copy here  
{    
    ...  
    sp->do_something();  
    ...  
}

How do you know that sp->do_something() will not blow up due to a null pointer?

It all depends what is in those '...' sections of the code. What if you call something during the first '...' that has the side-effect (somewhere in another part of the code) of clearing a shared_ptr to that same object? And what if it happens to be the only remaining distinct shared_ptr to that object? Bye bye object, just where you're about to try and use it.

So there are two ways to answer that question:

  1. Examine the source of your entire program very carefully until you are sure the object won't die during the function body.

  2. Change the parameter back to be a distinct object instead of a reference.

General bit of advice that applies here: don't bother making risky changes to your code for the sake of performance until you've timed your product in a realistic situation in a profiler and conclusively measured that the change you want to make will make a significant difference to performance.

Update for commenter JQ

Here's a contrived example. It's deliberately simple, so the mistake will be obvious. In real examples, the mistake is not so obvious because it is hidden in layers of real detail.

We have a function that will send a message somewhere. It may be a large message so rather than using a std::string that likely gets copied as it is passed around to multiple places, we use a shared_ptr to a string:

void send_message(std::shared_ptr<std::string> msg)
{
    std::cout << (*msg.get()) << std::endl;
}

(We just "send" it to the console for this example).

Now we want to add a facility to remember the previous message. We want the following behaviour: a variable must exist that contains the most recently sent message, but while a message is currently being sent then there must be no previous message (the variable should be reset before sending). So we declare the new variable:

std::shared_ptr<std::string> previous_message;

Then we amend our function according to the rules we specified:

void send_message(std::shared_ptr<std::string> msg)
{
    previous_message = 0;
    std::cout << *msg << std::endl;
    previous_message = msg;
}

So, before we start sending we discard the current previous message, and then after the send is complete we can store the new previous message. All good. Here's some test code:

send_message(std::shared_ptr<std::string>(new std::string("Hi")));
send_message(previous_message);

And as expected, this prints Hi! twice.

Now along comes Mr Maintainer, who looks at the code and thinks: Hey, that parameter to send_message is a shared_ptr:

void send_message(std::shared_ptr<std::string> msg)

Obviously that can be changed to:

void send_message(const std::shared_ptr<std::string> &msg)

Think of the performance enhancement this will bring! (Never mind that we're about to send a typically large message over some channel, so the performance enhancement will be so small as to be unmeasureable).

But the real problem is that now the test code will exhibit undefined behaviour (in Visual C++ 2010 debug builds, it crashes).

Mr Maintainer is surprised by this, but adds a defensive check to send_message in an attempt to stop the problem happening:

void send_message(const std::shared_ptr<std::string> &msg)
{
    if (msg == 0)
        return;

But of course it still goes ahead and crashes, because msg is never null when send_message is called.

As I say, with all the code so close together in a trivial example, it's easy to find the mistake. But in real programs, with more complex relationships between mutable objects that hold pointers to each other, it is easy to make the mistake, and hard to construct the necessary test cases to detect the mistake.

The easy solution, where you want a function to be able to rely on a shared_ptr continuing to be non-null throughout, is for the function to allocate its own true shared_ptr, rather than relying on a reference to an existing shared_ptr.

The downside is that copied a shared_ptr is not free: even "lock-free" implementations have to use an interlocked operation to honour threading guarantees. So there may be situations where a program can be significantly sped up by changing a shared_ptr into a shared_ptr &. But it this is not a change that can be safely made to all programs. It changes the logical meaning of the program.

Note that a similar bug would occur if we used std::string throughout instead of std::shared_ptr<std::string>, and instead of:

previous_message = 0;

to clear the message, we said:

previous_message.clear();

Then the symptom would be the accidental sending of an empty message, instead of undefined behaviour. The cost of an extra copy of a very large string may be a lot more significant than the cost of copying a shared_ptr, so the trade-off may be different.

Eyestalk answered 30/11, 2008 at 10:38 Comment(36)
The shared_ptr that is passed in already lives in a scope, at the call site. You might be able to create an elaborate scenario where the code in this question would blow up due to a dangling pointer, but then I suppose you have bigger problems than the reference parameter!Valuer
It may be stored in a member. You may call something that happens to clear that member. The whole point of smart_ptr is to avoid having to coordinate lifetimes in hierarchies or scopes that nest around the call stack, so that's why it's best to assume that lifetimes don't do that in such programs.Eyestalk
Earwicker: Your scenario is of course real, but it my personal experience not likely. Perhaps we have just been working in different code bases. :) Good thing is, this page contains both viewpoints now ;)Valuer
It's not really my viewpoint though! If you think what I'm saying is something specific to do with my code, you may not have understood me. I'm talking about an unavoidable implication of the reason shared_ptr exists in the first place: many object lifetimes are not simply related to function calls.Eyestalk
Magnus Hoff had the very CORRECT view. In Earwiker's 1st point, it's ridiculous, because if you can't be sure the object is there during function body, how can you be sure the object is there before it makes copy for the function ? In this case, the shared pointer you got inside the function body is not trust worthy -- nothing is trustworthy.Hispanic
@JQ - I've added a lengthy update to hopefully make the point more clearly.Eyestalk
Sorry, that should be @Hispanic with a dot.Eyestalk
The example is still a bit unclear. Why is there need to reset before calling a function? Perhaps it is a precondition of that function, that the previous value is destroyed and it invokes undefined behavior otherwise?Environs
@DanielEarwicker Hi Daniel, thanks for updating the comments for me. I took me looog time to understand around 50%(?) what you are trying to get at. But I found the example not convincing. Reason is: if the user is using send_message() and at the same time he can access to previous_message, he should know what the function does to the variable. If he's not aware, it's a bad design of the function itself.Hispanic
@Environs - perhaps in a real example, the cached object is not a string but some external resource encapsulated in an object, e.g. a file handle or some kind of connection managed by the OS, and we want to tear down the old one before we build a new one.Eyestalk
@JQ - welcome to the real world! :) It's full of APIs that make unnecessary use of mutable state variables. The C function strtok is a nasty old example. The point of the question was "What are the possible bad side-effects" of changing shared_ptr<T> into const shared_ptr<T> &. And in real code, where classes have mutable members, and some member functions depend on those members, and other member functions give access to them, and there are complex networks of objects that hold references to each other, then there are many opportunities for this kind of problem to occur.Eyestalk
So if you pass a member variable by const reference to a function that modifies that member variable before accessing the parameter, your parameter will be modified unexpectedly. This applies to anything, not just shared_ptr. Does this mean the rule of thumb should be "pass all objects by value rather than const reference if they are cheap to copy"?Marilynnmarimba
@Marilynnmarimba - that rule can't be right, because often you want to share references to objects by design, so the fact that it might be cheap to copy doesn't influence your decision. If you are passed a const T& can you assume there will be no changes to its state until your function exits? If you can see that your function never causes any T to be modified, then you're perfectly safe. If you function directly modifies some T (perhaps by calling some other function that does the modification) then yes, you may have a problem waiting to happen. (cont...)Eyestalk
However, the purpose of shared_ptr<T> v is to serve as a better T *v, and what is its specific advantage? That it stops an object from being destroyed prematurely. Change it again to shared_ptr<T> &v and you discard that advantage. The whole point of shared_ptr is that copying it is cheap enough for you to depend on it widely, so that you don't have to be responsible for analysing every aspect of your object's lifetimes. It's for situations where you need sharing by design, and where you have complex overlapping object lifetimes, so the analysis-by-hand would be fiendishly difficult.Eyestalk
It depends what you consider cheap. Copying a shared_ptr requires interlocked reference count modifications, which to me are relatively expensive. As such, I have the rule that shared_ptrs are always passed by const reference, and stored as copies. The only case where this could be problematic is where one is passed by member to a function that happens to reset the member and then expect the pointer to still be valid afterwards -- and that case is vanishingly unlikely and easy to fix by copying to local variable before passing in, which is often needed for concurrency anyway.Coleslaw
@Coleslaw - "and that case is vanishingly unlikely" - great, then there's no problem! :) Not sure I'd assign it that probability myself. Also I prefer to avoid bugs now rather than fix them later, so "ease of fixing" is good, but not as good as being correct in the first place. Finally (as stated above) I prefer to optimise based on measurement; anything that happens infrequently is cheap, anything that happens a lot is expensive.Eyestalk
Passing parameters usually happens quite frequently. Thus you must want it to be done as fast as possible. :) Copying should be done only in the cases where it is actually required, which in turn should be the minority of cases. But as always, it depends on the task at hand, and how much knowledge of the caller you have.Coleslaw
Mind you, it might be a corollary of another rule I try to follow that makes this usage safe -- shared_ptrs must always be returned by value, never by reference. Your example above gets in trouble because previous_message is accessed directly. The class that defines send_message should know that this is unsafe due to the specific implementation, and refrain. Outside code doesn't know the implementation, so may not realise it's unsafe, but they're protected because they can't access previous_message directly, they can only call get_previous_message, which inherently copies (return by value).Coleslaw
@Coleslaw - as I said, it's "a contrived example. It's deliberately simple, so the mistake will be obvious. In real examples, the mistake is not so obvious because it is hidden in layers of real detail." My major concern here is that people might think they can go into some existing well-tested production code, and "enhance" it by changing shared_ptr into a const shared_ptr &, under the mistaken belief that this won't alter the semantics. The necessary analysis across the entire code base, required to make such a change safely, may be fiendishly hard, depending on the complexity.Eyestalk
True, but it's never safe to make that sort of change without knowing what the knock-on effects are going to be. And that's partly what unit tests are for. :) That still doesn't mean that the default for new code should be "pass by value".Coleslaw
@Coleslaw - Absolutely. One possible interpretation of this question is "would this change have any knock-on effects?" Picking the default for new code is a more open issue. C++ is all about premature optimisation, after all! :) One of my rules is to play very safe until you are in a position to make real measurements about where optimisations are needed... but then, another such rule is that when you find you need shared_ptr, you're wandering outside of C++'s core purpose and into the domain of true GC'd languages.Eyestalk
The bug in the second example has nothing to do with shared_ptr. Replace the std::shared_ptr<std::string> argument with anything else -- an std::string, for example -- and the code still breaks by trashing the msg argument through the global previous_message. The only difference is that the std::shared_ptr<std::string> version will crash instead of giving bad output, and that's because send_message() isn't ensuring that the pointer is valid before using it (which it needs to do regardless of whether or not the shared_ptr is passed by reference).Quiescent
@JoshTownzen - see the last part of the answer, beginning "Note that a similar bug would occur if we used std::string throughout..." which says exactly the same thing. You may have missed the point, which is that a program may initially be absolutely correct, but then become absolutely incorrect if you change shared_ptr into const shared_ptr &.Eyestalk
Just a link to another SO answer that agrees that it is safe to pass a reference to a shared pointer. LINK: passing reference is okay, storing reference is not okayOpalopalesce
@TrevorBoydSmith - thanks for letting me know; I've added a comment to that answer explaining why it's wrong.Eyestalk
This answer does not make sense at all. It applies to situations with other variable types too as the writer added later in an edit. So it neither answers the question nor is a correct point of view. The logic of the code written in this post is itself DEEPLY FLAWED!Marginalia
@Tolga - because it applies to other situations with other variable types too, you conclude that it doesn't apply to this situation? That is flawed logic.Eyestalk
@DanielEarwicker completely agree with all your points and am surprised by the level of opposition. Something that makes your concerns even more relevant is threading, when this becomes involved guarantees about an objects validity become much more important. Good answer.Anastassia
Not long ago, I chased down a very serious bug that was due to passing a reference to a shared pointer. The code was handling an object's state change and when it noticed the state of the object had changed, it removed it from the collection of objects in the previous state and moved it into the collection of objects in the new state. The remove operation destroyed the last shared pointer to the object. The member function had been called on a reference to the shared pointer in the collection. Boom. Daniel Earwicker is right.Faery
The opposition is because people expect the common case to be passing a shared pointer from the local scope of the calling function, and indeed this is often safe to pass by reference. I think what Daniel's trying to say (what people often miss) is the risk of these assumptions being violated by people who maintain the code later. If you can safely assume that everyone who'll ever look at the code is an infallible expert then you can do more or less what you like, but years of coding in commercial environments tends to beat that sort of assumption out of most of us. (^_^)Squiggle
@Squiggle - I suspect most people reader advice like Sutter's, but not very carefully. The short version of his advice is Prefer passing objects by value, *, or &, not by smart pointer. But he adds the warning "Note this assumes the pointer is not aliased. You have to be careful if the smart pointer parameter could be aliased, but in this respect it’s no different than any other aliased object." That warning is exactly my answer! I just go into detail about what aliasing means and why it causes problems [cont...]Eyestalk
[cont...] But once you get that, then it becomes a question of whether you want to take the risk of discovering delightfully amusing and subtle aliasing bugs later, in exchange for "fast" code, or whether you'd prefer stronger guarantees of correctness so you can then measure and invest in optimisation effort (with all the care to avoid aliasing bugs that this entails) where it's really necessary. And I strongly recommend the latter.Eyestalk
If we are too safe, we end up with incredibly slow code.Policy
@MikeWeir yes, in C++. Modern languages with perfect compacting GCs are able to internally use use pure pointers with absolute safety, and allocate/deallocate short-lived objects at comparable speeds to stack allocation. The irony is that C++, by attempting to be minimal and efficient, moves this problem into userland and the result is forcing users to choose between slow and safe vs. fast and unsafe. That's why this is a controversial question. If you are mostly modelling relationships between dynamically-allocated objects, C++ is not optimal. This is the situation for most applications.Eyestalk
Isn't the 'undefined behaviour' claim for const std::shared_ptr<std::string> &msg false because of this? According to that article, constant references to local variables are perfectly fine in C++. So when VS 2010 crashes, that would actually be a compiler bug.Jonquil
@Jonquil - As that page says, "Note this only applies to stack-based references. It doesn’t work for references that are members of objects." in the example above, previous_message is not a local variable, but some long-lived state that persists between calls to send_message. By design, a shared_ptr transitions to nullptr when there are no further real instances of shard_ptr pointing to the same object. A &-ref to a shared_ptr is not an instance and does not keep the pointed-to object alive. Changing a shared_ptr to a const shared_ptr& says "No big deal if this becomes null."Eyestalk
B
23

I would advise against this practice unless you and the other programmers you work with really, really know what you are all doing.

First, you have no idea how the interface to your class might evolve and you want to prevent other programmers from doing bad things. Passing a shared_ptr by reference isn't something a programmer should expect to see, because it isn't idiomatic, and that makes it easy to use it incorrectly. Program defensively: make the interface hard to use incorrectly. Passing by reference is just going to invite problems later on.

Second, don't optimize until you know this particular class is going to be a problem. Profile first, and then if your program really needs the boost given by passing by reference, then maybe. Otherwise, don't sweat the small stuff (i.e. the extra N instructions it takes to pass by value) instead worry about design, data structures, algorithms, and long-term maintainability.

Bitty answered 29/11, 2008 at 16:32 Comment(1)
Although litb's answer is technically correct, never underestimate the "laziness" of programmers (I'm lazy too!). littlenag's answer is better, that a reference to a shared_ptr will be unexpected, and possibly (probably) an unnecessary optimization that makes future maintenance more challenging.Textual
T
18

Yes, taking a reference is fine there. You don't intend to give the method shared ownership; it only wants to work with it. You could take a reference for the first case too, since you copy it anyway. But for first case, it takes ownership. There is this trick to still copy it only once:

void ClassA::take_copy_of_sp(boost::shared_ptr<foo> sp) {
    m_sp_member.swap(sp);
}

You should also copy when you return it (i.e not return a reference). Because your class doesn't know what the client is doing with it (it could store a pointer to it and then big bang happens). If it later turns out it's a bottleneck (first profile!), then you can still return a reference.


Edit: Of course, as others point out, this only is true if you know your code and know that you don't reset the passed shared pointer in some way. If in doubt, just pass by value.

Tennessee answered 29/11, 2008 at 14:34 Comment(0)
V
11

It is sensible to pass shared_ptrs by const&. It will not likely cause trouble (except in the unlikely case that the referenced shared_ptr is deleted during the function call, as detailed by Earwicker) and it will likely be faster if you pass a lot of these around. Remember; the default boost::shared_ptr is thread safe, so copying it includes a thread safe increment.

Try to use const& rather than just &, because temporary objects may not be passed by non-const reference. (Even though a language extension in MSVC allows you to do it anyway)

Valuer answered 30/11, 2008 at 14:57 Comment(2)
Yes, I always use const references, I just forgot to put it in my example. Anyway, MSVC allows to bind non-const references to temporaries not for a bug, but because by default it has the property "C/C++ -> Language -> Disable Language Extension" set to "NO". Enable it and it won't compile them...Welladvised
abigagli: Seriously? Sweet! I will enforce this at work, first thing tomorrow ;)Valuer
S
10

In the second case, doing this is simpler:

Class::only_work_with_sp(foo &sp)
{    
    ...  
    sp.do_something();  
    ...  
}

You can call it as

only_work_with_sp(*sp);
Spittoon answered 29/11, 2008 at 16:21 Comment(2)
If you adopt the convention to use object references when you don't need to take a copy of the pointer, it serves to document your intent as well. It also gives you a chance to use a const reference.Eniwetok
Yes, I agree on the usage of references to objects as a mean to express that the called function is not "remembering" anything about that object. Usually I use pointer formal arguments if the function is "keeping track" of the objectWelladvised
N
3

I would avoid a "plain" reference unless the function explicitely may modify the pointer.

A const & may be a sensible micro-optimization when calling small functions - e.g. to enable further optimizations, like inlining away some conditions. Also, the increment/decrement - since it's thread safe - is a synchronization point. I would not expect this to make a big difference in most scenarios, though.

Generally, you should use the simpler style unless you have reason not to. Then, either use the const & consistently, or add a comment as to why if you use it just in a few places.

Newsmagazine answered 29/11, 2008 at 19:12 Comment(0)
B
3

I would advocate passing shared pointer by const reference - a semantics that the function being passed with the pointer does NOT own the pointer, which is a clean idiom for developers.

The only pitfall is in multiple thread programs the object being pointed by the shared pointer gets destroyed in another thread. So it is safe to say using const reference of shared pointer is safe in single threaded program.

Passing shared pointer by non-const reference is sometimes dangerous - the reason is the swap and reset functions the function may invoke inside so as to destroy the object which is still considered valid after the function returns.

It is not about premature optimization, I guess - it is about avoiding unnecessary waste of CPU cycles when you are clear what you want to do and the coding idiom has firmly been adopted by your fellow developers.

Just my 2 cents :-)

Bimetallism answered 28/3, 2010 at 19:37 Comment(1)
See the comment above by David Schwartz "...I chased down a very serious bug that was due to passing a reference to a shared pointer. The code was handling an object's state change and when it noticed the state of the object had changed, it removed it from the collection of objects in the previous state and moved it into the collection of objects in the new state. The remove operation destroyed the last shared pointer to the object. The member function had been called on a reference to the shared pointer in the collection. Boom..."Significs
D
3

It seems that all the pros and cons here can actually be generalised to ANY type passed by reference not just shared_ptr. In my opinion, you should know the semantic of passing by reference, const reference and value and use it correctly. But there is absolutely nothing inherently wrong with passing shared_ptr by reference, unless you think that all references are bad...

To go back to the example:

Class::only_work_with_sp( foo &sp ) //Again, no copy here  
{    
    ...  
    sp.do_something();  
    ...  
}

How do you know that sp.do_something() will not blow up due to a dangling pointer?

The truth is that, shared_ptr or not, const or not, this could happen if you have a design flaw, like directly or indirectly sharing the ownership of sp between threads, missusing an object that do delete this, you have a circular ownership or other ownership errors.

Djambi answered 29/11, 2010 at 0:2 Comment(0)
G
2

One thing that I haven't seen mentioned yet is that when you pass shared pointers by reference, you lose the implicit conversion that you get if you want to pass a derived class shared pointer through a reference to a base class shared pointer.

For example, this code will produce an error, but it will work if you change test() so that the shared pointer is not passed by reference.

#include <boost/shared_ptr.hpp>

class Base { };
class Derived: public Base { };

// ONLY instances of Base can be passed by reference.  If you have a shared_ptr
// to a derived type, you have to cast it manually.  If you remove the reference
// and pass the shared_ptr by value, then the cast is implicit so you don't have
// to worry about it.
void test(boost::shared_ptr<Base>& b)
{
    return;
}

int main(void)
{
    boost::shared_ptr<Derived> d(new Derived);
    test(d);

    // If you want the above call to work with references, you will have to manually cast
    // pointers like this, EVERY time you call the function.  Since you are creating a new
    // shared pointer, you lose the benefit of passing by reference.
    boost::shared_ptr<Base> b = boost::dynamic_pointer_cast<Base>(d);
    test(b);

    return 0;
}
Galvanic answered 25/9, 2014 at 22:56 Comment(0)
R
1

I'll assume that you are familiar with premature optimization and are asking this either for academic purposes or because you have isolated some pre-existing code that is under-performing.

Passing by reference is okay

Passing by const reference is better, and can usually be used, as it does not force const-ness on the object pointed to.

You are not at risk of losing the pointer due to using a reference. That reference is evidence that you have a copy of the smart pointer earlier in the stack and only one thread owns a call stack, so that pre-existing copy isn't going away.

Using references is often more efficient for the reasons you mention, but not guaranteed. Remember that dereferencing an object can take work too. Your ideal reference-usage scenario would be if your coding style involves many small functions, where the pointer would get passed from function to function to function before being used.

You should always avoid storing your smart pointer as a reference. Your Class::take_copy_of_sp(&sp) example shows correct usage for that.

Ragtime answered 30/11, 2008 at 16:0 Comment(2)
"You are not at risk of losing the pointer due to using a reference. That reference is evidence that you have a copy of the smart pointer earlier in the stack" Or a data member...?Eyestalk
Consider the magic of boost::thread and boost::ref: boost::function<int> functionPointer = boost::bind(doSomething, boost::ref( sharedPtrInstance ) ); m_workerThread = new boost::thread( functionPointer ); ... delete sharedPtrInstanceSignifics
M
1

Assuming we are not concerned with const correctness (or more, you mean to allow the functions to be able to modify or share ownership of the data being passed in), passing a boost::shared_ptr by value is safer than passing it by reference as we allow the original boost::shared_ptr to control it's own lifetime. Consider the results of the following code...

void FooTakesReference( boost::shared_ptr< int > & ptr )
{
    ptr.reset(); // We reset, and so does sharedA, memory is deleted.
}

void FooTakesValue( boost::shared_ptr< int > ptr )
{
    ptr.reset(); // Our temporary is reset, however sharedB hasn't.
}

void main()
{
    boost::shared_ptr< int > sharedA( new int( 13 ) );
    boost::shared_ptr< int > sharedB( new int( 14 ) );

    FooTakesReference( sharedA );

    FooTakesValue( sharedB );
}

From the example above we see that passing sharedA by reference allows FooTakesReference to reset the original pointer, which reduces it's use count to 0, destroying it's data. FooTakesValue, however, can't reset the original pointer, guaranteeing sharedB's data is still usable. When another developer inevitably comes along and attempts to piggyback on sharedA's fragile existence, chaos ensues. The lucky sharedB developer, however, goes home early as all is right in his world.

The code safety, in this case, far outweighs any speed improvement copying creates. At the same time, the boost::shared_ptr is meant to improve code safety. It will be far easier to go from a copy to a reference, if something requires this kind of niche optimization.

Mayman answered 3/3, 2010 at 14:24 Comment(0)
E
1

Sandy wrote: "It seems that all the pros and cons here can actually be generalised to ANY type passed by reference not just shared_ptr."

True to some extent, but the point of using shared_ptr is to eliminate concerns regarding object lifetimes and to let the compiler handle that for you. If you're going to pass a shared pointer by reference and allow clients of your reference-counted-object call non-const methods that might free the object data, then using a shared pointer is almost pointless.

I wrote "almost" in that previous sentence because performance can be a concern, and it 'might' be justified in rare cases, but I would also avoid this scenario myself and look for all possible other optimization solutions myself, such as to seriously look at adding another level of indirection, lazy evaluation, etc..

Code that exists past it's author, or even post it's author's memory, that requires implicit assumptions about behavior, in particular behavior about object lifetimes, requires clear, concise, readable documentation, and then many clients won't read it anyway! Simplicity almost always trumps efficiency, and there are almost always other ways to be efficient. If you really need to pass values by reference to avoid deep copying by copy constructors of your reference-counted-objects (and the equals operator), then perhaps you should consider ways to make the deep-copied data be reference counted pointers that can be copied quickly. (Of course, that's just one design scenario that might not apply to your situation).

Eyler answered 2/9, 2011 at 19:46 Comment(0)
C
1

I used to work in a project that the principle was very strong about passing smart pointers by value. When I was asked to do some performance analysis - I found that for increment and decrement of the reference counters of the smart pointers the application spends between 4-6% of the utilized processor time.

If you want to pass the smart pointers by value just to avoid having issues in weird cases as described from Daniel Earwicker make sure you understand the price you paying for it.

If you decide to go with a reference the main reason to use const reference is to make it possible to have implicit upcasting when you need to pass shared pointer to object from class that inherits the class you use in the interface.

Corettacorette answered 11/4, 2013 at 16:30 Comment(0)
P
0

In addition to what litb said, I'd like to point out that it's probably to pass by const reference in the second example, that way you are sure you don't accidentally modify it.

Phelips answered 29/11, 2008 at 17:36 Comment(0)
Z
0
struct A {
  shared_ptr<Message> msg;
  shared_ptr<Message> * ptr_msg;
}
  1. pass by value:

    void set(shared_ptr<Message> msg) {
      this->msg = msg; /// create a new shared_ptr, reference count will be added;
    } /// out of method, new created shared_ptr will be deleted, of course, reference count also be reduced;
    
  2. pass by reference:

    void set(shared_ptr<Message>& msg) {
     this->msg = msg; /// reference count will be added, because reference is just an alias.
     }
    
  3. pass by pointer:

    void set(shared_ptr<Message>* msg) {
      this->ptr_msg = msg; /// reference count will not be added;
    }
    
Zoi answered 23/7, 2013 at 7:23 Comment(0)
V
0

Every code piece must carry some sense. If you pass a shared pointer by value everywhere in the application, this means "I am unsure about what's going on elsewhere, hence I favour raw safety". This is not what I call a good confidence sign to other programmers who could consult the code.

Anyway, even if a function gets a const reference and you are "unsure", you can still create a copy of the shared pointer at the head of the function, to add a strong reference to the pointer. This could also be seen as a hint about the design ("the pointer could be modified elsewhere").

So yes, IMO, the default should be "pass by const reference".

Venerable answered 5/3, 2016 at 7:45 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.