How does a function actually return by value?

I have a class A and two functions, f() and g(), that each return an A by value and differ only in which variable they return:

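#include <iostream>
using namespace std;

class A
{
public:
    A () { cout<<"constructor, "; }
    A (const A& ) { cout<<"copy-constructor, "; }
    // operator= must return a value; the original fell off the end of the function
    A& operator = (const A& ) { cout<<"assignment, "; return *this; }
    ~A () { cout<<"destructor, "; }
};

// f returns its local object y
const A f(A x)
{A y; cout<<"f, "; return y;}

// g returns its by-value parameter x
const A g(A x)
{A y; cout<<"g, "; return x;}

int main()
{
    A a;
    A b = f(a);
    A c = g(a);
}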

Now when I execute the line A b = f(a);, it outputs:

copy-constructor, constructor, f, destructor,

which is fine, assuming that the object y in f() is created directly at the destination, i.e. at the memory location of object b, with no temporaries involved.

But when I execute the line A c = g(a);, it outputs:

copy-constructor, constructor, g, copy-constructor, destructor, destructor,

So the question is: why, in the case of g(), can't the object be created directly at the memory location of c, the way it happened when calling f()? Why is there an additional copy-constructor call (which I presume is due to a temporary) in the second case?

Assail answered 6/6, 2012 at 12:54 Comment(2)

If you want the compiler to perform optimizations then you'll have to compile with optimizations enabled. – Nonessential 6/6, 2012 at 13:2

I don't think it has anything to do with compiler optimizations as I've already tried it. – Assail 6/6, 2012 at 13:8

The difference is that in the g case, you are returning a value that was passed to the function. The standard lists the conditions under which the copy can be elided in 12.8p31, and they do not include eliding the copy of a function parameter.

Basically the problem is that the location of the argument and the returned object are fixed by the calling convention, and the compiler cannot change the calling convention based on the fact that the implementation (that might not even be visible at the place of call) returns the argument.
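To make the difference measurable, here is a minimal sketch that counts copy-constructor calls for both cases. The class Counted and the helper names (return_local, return_param, copies_for_local, copies_for_param, report) are hypothetical and not part of the original code. The exact totals depend on the compiler and language mode, so the sketch only checks lower bounds: returning the parameter always costs at least one more copy than the one needed to pass the argument.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical instrumented class: counts copy-constructor calls.
struct Counted {
    static int copies;
    Counted() {}
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

// Returns a local object: the compiler is allowed to apply NRVO here.
Counted return_local(Counted x) {
    Counted y;
    return y;
}

// Returns the by-value parameter: eliding this copy is not permitted.
Counted return_param(Counted x) {
    Counted y;
    return x;
}

// Number of copy-constructor calls triggered by "Counted b = return_local(a);".
int copies_for_local() {
    Counted a;
    Counted::copies = 0;
    Counted b = return_local(a);
    (void)b;
    return Counted::copies;
}

// Number of copy-constructor calls triggered by "Counted c = return_param(a);".
int copies_for_param() {
    Counted a;
    Counted::copies = 0;
    Counted c = return_param(a);
    (void)c;
    return Counted::copies;
}

void report() {
    int local = copies_for_local();
    int param = copies_for_param();
    std::cout << "returning the local:     " << local << " copies\n";
    std::cout << "returning the parameter: " << param << " copies\n";
    // Only lower bounds are portable: one copy to pass the argument,
    // plus one unavoidable copy when the parameter itself is returned.
    assert(local >= 1);
    assert(param >= 2);
}
```

Calling report() from main prints both counts. With NRVO applied, a typical result is 1 copy for the local and 2 for the parameter, which matches the output in the question.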

I started a short-lived blog some time ago (I expected to have more time...) and I wrote a couple of articles about NRVO and copy elision that might help clarify this (or not, who knows :)):

Value semantics: NRVO

Value semantics: Copy elision

Rudiment answered 6/6, 2012 at 13:55 Comment(6)

Thanks a lot. Your "[Un]defined behavior" solved many of my doubts in certainly a "well defined" manner :) But I've got a few more doubts, if you can address them: 1. In NRVO, when you used "bool which" to decide which of "type x and y" to return, it seems that my compiler is unable to do the elision (I suspect others can't either). – Assail 6/6, 2012 at 16:5

2. In my code, if I bind a reference to a temporary, i.e. "A & b = f(a);", then the lifetime of the object (supposedly a temporary returned by f(a)) extends to the closing brace of main. This seems to contradict two things: 1. As you mentioned in your blog, taking the address of a temporary is illegal, but we're doing it here. 2. How can a temporary last for so long? – Assail 6/6, 2012 at 16:6

3. Is it the case that the lifetimes of a function's local objects and of its arguments are different? When I did something like "A a; A b; b = g(a);", then in the line "b = g(a)" the destructor of the local object was called before the assignment operator, and that of the argument after the assignment. – Assail 6/6, 2012 at 16:7

Sorry, due to the space constraint I couldn't write all three doubts in one comment (hope you'll manage :) ) – Assail 6/6, 2012 at 16:8

1. That is compiler dependent. I would not expect many compilers to pick it up, but it is theoretically possible (have you tried with the highest optimization level?) 2. There is an explicit rule in the standard to enable the life extension when binding a const-reference, and you are not taking the address, only creating a reference (think of a reference as an alias), in this particular case the compiler can remove the reference from the binary and just substitute the reference uses with uses of the original object.... – Abram 6/6, 2012 at 16:58

... the temporary lasts longer than the expression by creating a hidden variable on the stack (_Tmp in the article) and treating it as a local variable. 3. I am not really following what your question is there. After some of the copies are elided you have a few objects: a, b, g_in, g_out, where g_in is the argument and g_out is the returned object. g_in is destroyed right after creating g_out and exiting the function, before g_out is assigned to b; g_out should be destroyed at the end of the full expression, then b is destroyed, then a. – Abram 6/6, 2012 at 17:3
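The lifetime-extension rule mentioned in point 2 can be observed with a small sketch. The class Tracked and the functions make, alive_while_bound and alive_after_statement are hypothetical names introduced only for this illustration. The class counts how many of its objects are currently alive, so the result does not depend on whether the compiler elides copies.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical tracker: counts how many Tracked objects are alive right now.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked&) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

Tracked make() { return Tracked(); }

// Live-object count observed while a const reference is bound to the temporary.
int alive_while_bound() {
    const Tracked& r = make();  // the temporary now lives as long as r does
    (void)r;
    return Tracked::alive;      // evaluated before r goes out of scope
}

// Live-object count after a statement that does not bind the temporary.
int alive_after_statement() {
    make();                     // the temporary dies at the end of this full-expression
    return Tracked::alive;
}

void report() {
    std::cout << "alive while bound to const&: " << alive_while_bound() << "\n";
    std::cout << "alive after plain call:      " << alive_after_statement() << "\n";
    assert(Tracked::alive == 0);  // everything has been destroyed by now
}
```

Note that the sketch binds a const reference, as in the reply above. Binding a non-const reference such as "A & b = f(a);" to a temporary is not accepted by standard-conforming compilers.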

The problem is that in the second case, you're returning one of the parameters. Parameter copying usually occurs at the call site (main in this case), not within the function, so the compiler makes one copy of a for the parameter in main and is then forced to make a second copy when g() returns that parameter.

From http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/

Second, I’ve yet to find a compiler that will elide the copy when a function parameter is returned, as in our implementation of sorted. When you think about how these elisions are done, it makes sense: without some form of inter-procedural optimization, the caller of sorted can’t know that the argument (and not some other object) will eventually be returned, so the compiler must allocate separate space on the stack for the argument and the return value.
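The "separate space" point in the quote can be checked by recording addresses. This is only a sketch: Obj, the two recording globals and the helper functions are hypothetical names, and addresses are stored as integers so they can still be compared after the callee's objects are gone. The parameter and the caller's result are always at different addresses. Whether the local is built directly in the caller's variable depends on the compiler applying NRVO, so that result is printed but not checked.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

struct Obj {
    int value;
    Obj() : value(0) {}
    Obj(const Obj& other) : value(other.value) {}
};

// Hypothetical globals, used only to record addresses seen inside the callees.
std::uintptr_t local_addr = 0;
std::uintptr_t param_addr = 0;

Obj return_local(Obj x) {
    Obj y;
    local_addr = reinterpret_cast<std::uintptr_t>(&y);
    return y;
}

Obj return_param(Obj x) {
    param_addr = reinterpret_cast<std::uintptr_t>(&x);
    return x;
}

// True when the parameter and the caller's result occupy different storage.
bool param_and_result_differ() {
    Obj a;
    Obj c = return_param(a);
    return param_addr != reinterpret_cast<std::uintptr_t>(&c);
}

// True when the callee's local was constructed directly in the caller's variable.
bool local_built_in_place() {
    Obj a;
    Obj b = return_local(a);
    return local_addr == reinterpret_cast<std::uintptr_t>(&b);
}

void report() {
    std::cout << std::boolalpha;
    std::cout << "parameter and result differ: " << param_and_result_differ() << "\n";
    std::cout << "local built in place (NRVO): " << local_built_in_place() << "\n";
    assert(param_and_result_differ());
}
```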

Heliport answered 6/6, 2012 at 13:26 Comment(1)

"I've yet to find a compiler that will elide the copy when a function parameter is returned." No surprises there: it is impossible to have a calling convention that allows for this, and the standard (admittedly after that article was written) explicitly states that this cannot be done by the compiler. – Abram 6/6, 2012 at 14:37

Here's a small modification of your code that will help you understand exactly what's going on:

#include <iostream>
#include <string>

class A{
public:
    A(const char* cname) : name(cname){
        std::cout << "constructing " << cname << std::endl;
    }
    ~A(){
        std::cout << "destructing " << name.c_str() << std::endl;
    }
    A(A const& a){
        if (name.empty()) name = "*tmp copy*";
        std::cout 
            << "creating " << name.c_str() 
            << " by copying " << a.name.c_str() << std::endl;
    }
    A& operator=(A const& a){
        std::cout
            << "assignment ( "
                << name.c_str() << " = " << a.name.c_str()
            << " )"<< std::endl;
        return *this;
    }
    std::string name;
};

Here's the usage of this class:

const A f(A x){
    std::cout 
        << "// renaming " << x.name.c_str() 
        << " to x in f()" << std::endl;
    x.name = "x in f()";
    A y("y in f()");
    return y;
}

const A g(A x){
    std::cout 
        << "// renaming " << x.name.c_str()
        << " to x in f()" << std::endl;
    x.name = "x in g()";
    A y("y in g()");
    return x;
}

int main(){
    A a("a in main()");
    std::cout << "- - - - - - calling f:" << std::endl;
    A b = f(a);
    b.name = "b in main()";
    std::cout << "- - - - - - calling g:" << std::endl;
    A c = g(a);
    c.name = "c in main()";
    std::cout << ">>> leaving the scope:" << std::endl;
    return 0;
}

and here's the output when compiled without any optimization:

constructing a in main()
- - - - - - calling f:
creating *tmp copy* by copying a in main()
// renaming *tmp copy* to x in f()
constructing y in f()
creating *tmp copy* by copying y in f()
destructing y in f()
destructing x in f()
- - - - - - calling g:
creating *tmp copy* by copying a in main()
// renaming *tmp copy* to x in f()
constructing y in g()
creating *tmp copy* by copying x in g()
destructing y in g()
destructing x in g()
>>> leaving the scope:
destructing c in main()
destructing b in main()
destructing a in main()

The output you posted is the output of a program compiled with Named Return Value Optimization (NRVO). In this case the compiler tries to eliminate redundant copy-constructor and destructor calls, which means that when returning an object it will try to do so without creating a redundant copy of it. Here's the output with NRVO enabled:

constructing a in main()
- - - - - - calling f:
creating *tmp copy* by copying a in main()
// renaming *tmp copy* to x in f()
constructing y in f()
destructing x in f()
- - - - - - calling g:
creating *tmp copy* by copying a in main()
// renaming *tmp copy* to x in f()
constructing y in g()
creating *tmp copy* by copying x in g()
destructing y in g()
destructing x in g()
>>> leaving the scope:
destructing c in main()
destructing b in main()
destructing a in main()

In the first case, "*tmp copy* by copying y in f()" is not created since NRVO has done its job. In the second case, though, NRVO can't be applied because another candidate for the return slot has been declared within this function. For more information see: C++ : Avoiding copy with the "return" statement :)

Oubliette answered 6/6, 2012 at 13:37 Comment(2)

Yeah, I know this, and I've done it in my code as well to see exactly what was going on (though I posted a simplified version of the code that emphasizes only my problem). And this code serves no purpose for the question I'm asking. What I asked for was the REASON for what is happening, not the happening itself. Anyway, thanks for showing concern :) – Assail 6/6, 2012 at 14:1

@cirronimbo: Check my answer now, it explains what's going on with NRVO enabled and also explains why I suggested that question to you. – Oubliette 6/6, 2012 at 14:16

The compiler can (almost) optimise the entire g() function call away, in which case your code looks like this:

A a;
A c = a;

as effectively this is what your code is doing. Now, as you pass a as a by-value parameter (i.e. not by reference), the compiler almost has to perform a copy there, and then, since the function returns this parameter by value, it has to perform another copy.

In the case of f(), as it is returning what is effectively a temporary into an uninitialised variable, the compiler can see that it is safe to use b as the storage for the internal variable inside f().
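As a side note for C++11 and later: when the class has a move constructor, that second operation becomes a move rather than a copy, because a by-value parameter named in a return statement is treated as an rvalue. A minimal sketch, assuming a C++11 compiler and the hypothetical names Movable, pass_through and measure. Unlike the code in the question, it uses a non-const return type.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical class that counts copies and moves separately (C++11 or later).
struct Movable {
    static int copies;
    static int moves;
    Movable() {}
    Movable(const Movable&) { ++copies; }
    Movable(Movable&&) { ++moves; }
};
int Movable::copies = 0;
int Movable::moves = 0;

// Returns its by-value parameter. The operation still cannot be elided,
// but it selects the move constructor instead of the copy constructor.
Movable pass_through(Movable x) {
    return x;
}

// Resets the counters and performs one "Movable c = pass_through(a);".
void measure() {
    Movable a;
    Movable::copies = 0;
    Movable::moves = 0;
    Movable c = pass_through(a);
    (void)c;
    std::cout << "copies: " << Movable::copies
              << ", moves: " << Movable::moves << "\n";
}
```

After measure() runs, exactly one copy has been made (passing a into the parameter), and at least one move (returning the parameter). The number of moves can be higher if the compiler does not elide the initialization of c from the returned temporary.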

Nikolos answered 6/6, 2012 at 13:55 Comment(0)
