int vs const int& [closed]
W

7

76

I've noticed that I usually use constant references as return values or arguments. I think the reason is that it works almost the same as using a non-reference in the code. But it definitely takes more space, and function declarations become longer. I'm OK with such code, but I think some people may find it bad programming style.

What do you think? Is it worth writing const int& over int? I think it's optimized by the compiler anyway, so maybe I'm just wasting my time coding it, eh?

Weary answered 16/1, 2011 at 13:27 Comment(3)
How could it be optimized by the compiler? If the function is inlined the compiler might do it, but in the general case where it isn't, it has to return exactly what you said it would (with some exceptions), since you might be calling it from a different translation unit using the specified prototype, so it has to be called with int/const int& (at the assembly level, passing an int to something expecting int& isn't a good idea). Combine that with the fact that const int& is likely slower than int (due to the added dereference and the cheap copy) and you should know what to do.Pigweed
For the optimizer a const T& is just T&, in other words just a constant pointer to a possibly changing T object. The const word is only seen by programmers.Prewar
@6502: Well, and by the compiler's front end, which steps in if the programmer fails to heed the const.Iny
P
172

In C++ it's very common to see what I consider an anti-pattern: using const T& as a smart way of just saying T when dealing with parameters. However, a value and a reference (const or not) are two completely different things, and blindly using references instead of values can lead to subtle bugs.

The reason is that when dealing with references you must consider two issues that are not present with values: lifetime and aliasing.

Just as an example, one place where this anti-pattern is applied is the standard library itself, where std::vector<T>::push_back accepts a const T& instead of a value. This can bite back, for example in code like:

std::vector<T> v;
...
if (v.size())
    v.push_back(v[0]); // Add first element also as last element

This code is a ticking bomb because std::vector::push_back wants a const reference, but doing the push_back may require a reallocation. If that happens, the reference received is no longer valid after the reallocation (lifetime issue) and you enter the realm of Undefined Behavior¹.

Much better from a logical point of view in today's C++ would be to accept a value (i.e. void std::vector<T>::push_back(T x)) and then efficiently move that value into its final place inside the container. The caller can then use std::move if that is deemed important (note however that move construction did not exist in the original C++).
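
A minimal sketch of the by-value design described above, assuming C++11 or later. Bag, add and duplicate_first are hypothetical names chosen for illustration; this is not how std::vector is implemented, only a way to see why a by-value parameter sidesteps the lifetime question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container wrapper, for illustration only.
class Bag {
public:
    // The parameter is an independent object, so it cannot refer to an
    // element of items_ and cannot be invalidated by a reallocation.
    void add(std::string item) {
        items_.push_back(std::move(item)); // move the local copy into place
    }

    const std::string& at(std::size_t i) const { return items_[i]; }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<std::string> items_;
};

// Adds the first element again as the last one.
inline void duplicate_first(Bag& bag) {
    if (bag.size() > 0)
        bag.add(bag.at(0)); // the copy is made before add() starts running
}
```

A caller that no longer needs its own object can write bag.add(std::move(s)) to avoid the copy; a caller that still needs it passes it plainly and gets a copy.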

Aliasing issues are instead a source of subtle problems if const references are used instead of values. I've been bitten for example by code of this kind:

struct P2d
{ 
    double x, y;
    P2d(double x, double y) : x(x), y(y) {}
    P2d& operator+=(const P2d& p) { x+=p.x; y+=p.y; return *this; }
    P2d& operator-=(const P2d& p) { x-=p.x; y-=p.y; return *this; }
};

struct Rect
{
    P2d tl, br;
    Rect(const P2d& tl, const P2d& br) : tl(tl), br(br) {}
    Rect& operator+=(const P2d& p) { tl+=p; br+=p; return *this; }
    Rect& operator-=(const P2d& p) { tl-=p; br-=p; return *this; }
};

At first glance the code seems pretty safe: P2d is a two-dimensional point, Rect is a rectangle, and adding/subtracting a point means translating the rectangle.

If, however, to translate the rectangle back to the origin you write myrect -= myrect.tl; the code will not work, because the translation operator has been defined to accept a reference that (in this case) refers to a member of the same instance.

This means that after updating the top-left corner with tl -= p; the top-left will be (0, 0) as it should, but p will also become (0, 0) at the same time, because p is just a reference to the top-left member. The update of the bottom-right corner therefore does not work: it translates it by (0, 0), which does basically nothing.
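
To make the failure concrete, here is a small sketch that puts the aliasing variant next to a by-value variant. The member function names subtract_by_ref and subtract_by_value are hypothetical, introduced only so both versions can live in one struct; the by-value one is just one possible fix.

```cpp
#include <cassert>

struct P2d {
    double x, y;
    P2d(double x, double y) : x(x), y(y) {}
    P2d& operator-=(const P2d& p) { x -= p.x; y -= p.y; return *this; }
};

struct Rect {
    P2d tl, br;
    Rect(const P2d& tl, const P2d& br) : tl(tl), br(br) {}

    // Same logic as operator-= in the text: p may refer to this->tl.
    Rect& subtract_by_ref(const P2d& p) { tl -= p; br -= p; return *this; }

    // p is a private copy, so changing tl cannot change p.
    Rect& subtract_by_value(P2d p) { tl -= p; br -= p; return *this; }
};
```

With a rectangle from (10, 20) to (30, 50), calling subtract_by_ref(r.tl) leaves the bottom-right corner at (30, 50), while subtract_by_value(r.tl) moves it to (20, 30) as intended.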

Please don't be fooled into thinking that a const reference is like a value because of the word const. That word exists only to give you compile errors if you try to change the referenced object using that reference; it doesn't mean that the referenced object is constant. More specifically, the object referenced by a const ref can change (e.g. because of aliasing) and can even go out of existence while you are using it (lifetime issue).

In const T& the word const expresses a property of the reference, not of the referenced object: it's the property that makes it impossible to use the reference to change the object. Probably readonly would have been a better name, as const has IMO the psychological effect of pushing the idea that the object is going to be constant while you use the reference.
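
A tiny sketch of the "readonly, not constant" point. The function read_after_write is a hypothetical helper written for this illustration: it never writes through x, yet the value seen through x can still change.

```cpp
#include <cassert>

// Hypothetical helper: reads x, modifies y, then reads x again.
// x is declared const int&, so this function cannot write through x,
// but nothing stops y from referring to the same object.
inline int read_after_write(const int& x, int& y) {
    int before = x;
    y += 1;
    int after = x;
    return after - before; // 0 for distinct objects, 1 when x and y alias
}
```

Called as read_after_write(a, b) with two different variables the result is 0; called as read_after_write(a, a) the result is 1, even though the function only ever reads through x.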

You can of course get impressive speedups by using references instead of copying values, especially for big classes. But you should always think about aliasing and lifetime issues when using references, because under the covers they're just pointers to other data. For "native" data types (ints, doubles, pointers), however, references are actually going to be slower than values and there's nothing to gain by using them.

Also, a const reference always means problems for the optimizer: the compiler is forced to be paranoid, and every time any unknown code is executed it must assume that all referenced objects may now have a different value (const on a reference means absolutely NOTHING to the optimizer; the word is there only to help programmers. I'm personally not so sure it's such a big help, but that's another story).


(1) Apparently (https://mcmap.net/q/166631/-is-it-safe-to-push_back-an-element-from-the-same-vector) the standard says that this case is valid, but even with this interpretation (with which I do not agree at all) the problem is still present in general. push_back doesn't care about the identity of the object and so should have taken the argument by value. When you pass a const reference to a function it's YOUR responsibility to ensure that the referenced object stays alive for the full duration of the function. With v.push_back(v[0]) this is simply false if no reservation was done, and IMO (given the push_back signature) it is the caller's fault if that happens. The real logic bug, however, is the push_back interface design (done intentionally, sacrificing logical correctness on the altar of efficiency). I'm not sure if it was because of that defect report, but I saw a few compilers "fixing" the problem in this special case (i.e. push_back checks whether the element being pushed comes from the vector itself).

Prewar answered 16/1, 2011 at 14:27 Comment(14)
Great answer, I never thought push_back() could be so sinister!Aspen
"[const] exists only to give you compile errors if you try to change the referenced object using that reference, but doesn't mean that the referenced object is constant" Couldn't have said it better.Gilliam
I don't get the example with std::vector. Right after reallocation the data will be copied and then v[0] will be OK?Larondalarosa
@cldy: When calling std::vector::push_back that code is passing a reference to the first element of the vector, in other words an address in memory where this element is stored. The push_back code may need to reallocate the vector contents to make room for a new element and after doing that reallocation it will copy construct the new entry using the passed address as source. However if an allocation happened the address will be the one of a now dead object that has been reallocated somewhere else. It is a problem of lifetime... during the execution of push_back the object was destroyed.Prewar
@Prewar Just out of curiosity... Is the push_back problem solved by the method itself (so that it's coded to, for example, internally make an extra copy when it has to realloc), or is it a "feature" users of push_back must know about?Precedential
xanatos: users must know it. When you pass a reference instead of a value it's your responsibility to guarantee that the lifetime of the object being referenced will last enough. Elements of a vector will not last the whole execution of push_back so passing to push_back an element of the same vector is a violation of the contract.Prewar
One thing that might be worth adding is that references can actually result in slower code when the reference needs to be copied (and sometimes even if it doesn't need to be copied.) The canonical article for this is cpp-next.com/archive/2009/08/want-speed-pass-by-value But see also the (very interesting) presentation by Chandler Carruth that nicely shows how passing by reference can severely hurt the compiler's optimizer: youtube.com/watch?v=eR34r7HOU14Levey
@6502: You say that "More specifically the object referenced by a const ref can change (e.g. because of aliasing) and can even get out of existence while you are using it (lifetime issue)." Will you please explain these two things with a simple example? I still don't understand it. ThanksTarragon
@PravasiMeet: an example of a lifetime issue is int foo(const int& x, std::vector<int>& y) { y.pop_back(); return x; } which is UB when called as foo(a.back(), a), because even though x is a const int& the referenced object is destroyed before being accessed. For aliasing, consider just int foo(const int& x, int& y) called with foo(a, a)... if the foo code mutates y, the mutation will also appear in x (despite it being declared const int&). This is exactly what provokes the subtle bug in the Rect class example.Prewar
@NikosC. These are interesting links, but as always, benchmark. You'll notice Carruth mainly emphasized return by value versus an out reference, because passing large objects by constant reference can be much faster. At the end of the day, the common advice in popular books is mostly correct: For small objects, pass by value, but for large objects, pass by constant reference. I did a simple test of a struct with one std::vector passed by reference and value. The first was 133 lines long and the second was 174 lines long of assembly. -O3. Similar results for -O2.Karafuto
@Karafuto The issue is that if you need to store the object, then passing by value and doing std::move instead is going to be the better way of doing it compared to passing by ref and then copy.Levey
@NikosC. That's an interesting setup although it can only be used if you don't need those arguments to be used with their value later on after the function call. It also requires remembering more stuff (to use std::move). For -O2 in the case of that struct, it was only 1 extra instruction, so which is faster will depend on the instructions used. Let me test having 3 std::vectors. Move: 241 instructions. By reference: 170 instructions.Karafuto
@Karafuto Not sure you understood. If your function takes so-called "sink" arguments, it makes no sense to take them by reference and then have to copy them. This results in an unavoidable copy. There is nothing the caller of the function can do to eliminate the copy. You take those args by value instead, and then move the args to their destination. The caller of the function can now eliminate the copy, if they want, with std::move(). If moving is not appropriate because they still need the objects, they simply don't use move and thus get copy behavior, just like with ref args.Levey
@NikosC. Yes... if you can use a move for efficiency, that's more efficient than constructing a copy of a constant reference. This has nothing to do with "passing by value" "for speed". However, even if you can move and have no need to construct a new object, a pass by constant reference will likely be faster or just as fast. You are relying on the optimizer eliminating the move in that situation. Plus, constant references are so common that they are highly optimized wherever possible. The salient point of the video you linked is that a reference out variable is slow.Karafuto
B
16

As Oli says, returning a const T& as opposed to T is a completely different thing, and may break in certain situations (as in his example).

Taking const T& as opposed to plain T as an argument is less likely to break things, but still has several important differences.

  • Taking T instead of const T& requires that T is copy-constructible.
  • Taking T will invoke the copy constructor, which may be expensive (and also the destructor on function exit).
  • Taking T allows you to modify the parameter as a local variable (can be faster than manually copying).
  • Taking const T& could be slower due to misaligned temporaries and the cost of indirection.
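
The third bullet can be sketched like this, assuming C++11 or later. sorted_copy is a hypothetical function name used for illustration; the point is only that a by-value parameter already is the working copy, so no separate local copy is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns a sorted version of its argument.
// The parameter is the caller's copy, so it can be modified freely
// without touching the caller's vector.
inline std::vector<int> sorted_copy(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values; // a by-value parameter is moved out, not copied again
}
```

With a const std::vector<int>& parameter the function would have to create the copy itself before sorting, and a caller passing a temporary could not avoid that copy.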
Boudicca answered 16/1, 2011 at 13:37 Comment(8)
Also, when T is a small type, such as int, the copy could be cheaper than the reference.Ashaashamed
@Peter: Could you expand on the last point? (misaligned temporaries)Apologize
Taking a const&, you'll see directly modifications made on an alias.Bibbye
@Oli: A reference is usually implemented as a pointer, and there's no guarantees about the alignment of what it is pointing to. Depending on the architecture, operating on misaligned data can cause performance issues.Boudicca
@Oli: Even if there are no performance penalties due to misalignment, the mere fact that a reference is implemented as a pointer requires the value to be stored in main memory, which can be between one (L1 cache) and ten (page fault) orders of magnitude slower than passing by register.Chavez
Why would temporaries be unaligned? Unless you stray into undefined behavior, all objects in C++ are by definition aligned.Tobias
re your first bullet, const T& may also require that T is copy-constructible; see here #4733948Chaddie
@Chavez You happen to be wrong about "requires the value to be stored in main memory" when the compiler is able to optimize caller and callee together. Whether that happens depends on all sorts of factors, such as whether the callee is marked inline or whether LTO is in use.Galosh
C
10

If the callee and the caller are defined in separate compilation units, then the compiler cannot optimize away the reference. For example, I compiled the following code:

#include <ctime>
#include <iostream>

int test1(int i);
int test2(const int& i);

int main() {
  int i = std::time(0);
  int j = test1(i);
  int k = test2(i);
  std::cout << j + k << std::endl;
}

with G++ on 64-bit Linux at optimization level 3. The first call needs no access to main memory:

call    time
movl    %eax, %edi     #1
movl    %eax, 12(%rsp) #2
call    _Z5test1i
leaq    12(%rsp), %rdi #3
movl    %eax, %ebx
call    _Z5test2RKi

Line #1 directly uses the return value in eax as the argument for test1 in edi. Lines #2 and #3 store the result in main memory (on the stack) and place its address in the first argument register, because the argument is declared as a reference to int and so it must be possible to, e.g., take its address. Whether something can be calculated entirely in registers or needs to access main memory can make a great difference these days. So, apart from being more to type, const int& can also be slower. The rule of thumb is: pass all data that is at most as large as the word size by value, and everything else by reference to const. Also pass templated arguments by reference to const; since the compiler has access to the definition of the template, it can usually optimize the reference away.
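
The rule of thumb could look like this in practice. All three function names (twice, total_length, is_less) are hypothetical and exist only to show one signature per case; this is a sketch, not a benchmark.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Word-sized argument: pass by value so it can travel in a register.
inline int twice(int n) { return 2 * n; }

// Potentially large argument that is only read: pass by reference to const.
inline std::size_t total_length(const std::vector<std::string>& words) {
    std::size_t total = 0;
    for (const std::string& w : words) total += w.size();
    return total;
}

// Template argument of unknown size: pass by reference to const.
// The definition is visible to the compiler at every call site.
template <typename T>
bool is_less(const T& a, const T& b) { return a < b; }
```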

Chavez answered 16/1, 2011 at 14:22 Comment(1)
Passing by value often also increases cache locality, so ultimately fewer cache lines have to be loaded into the cache.Transparent
A
9

int & and int are not interchangeable! In particular, if you return a reference to a local stack variable, the behaviour is undefined, e.g.:

int &func()
{
    int x = 42;
    return x;
}

You can return a reference to something that won't be destroyed at the end of the function (e.g. a static, or a class member). So this is valid:

int &func()
{
    static int x = 42;
    return x;
}

and to the outside world, has the same effect as returning the int directly (except that you can now modify it, which is why you see const int & a lot).

The advantage of the reference is that no copy is required, which is important if you're dealing with large class objects. However, in many cases, the compiler can optimize that away; see e.g. http://en.wikipedia.org/wiki/Return_value_optimization.
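
For a large object, returning by value can look like the sketch below. make_squares is a hypothetical function written for illustration. Compilers commonly construct the result directly in the caller's storage, and where they don't, C++11 and later move it instead of copying.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: builds a vector and returns it by value.
// No reference to a local escapes, so there is no lifetime problem.
inline std::vector<int> make_squares(int n) {
    std::vector<int> result;
    for (int i = 0; i < n; ++i)
        result.push_back(i * i);
    return result;
}
```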

Apologize answered 16/1, 2011 at 13:30 Comment(2)
Yeah, I know the difference between those two. My question was if it's actually worth using const int& as int should be optimized anyway.Weary
@Valdo: In the case of int, there's certainly no advantage on any sane platform.Apologize
C
4

Instead of "thinking" it's optimized away by the compiler, why don't you get the assembler listing and find out for sure?

junk.c++:

int my_int()
{
    static int v = 5;
    return v;
}

const int& my_int_ref()
{
    static int v = 5;
    return v;
}

Generated assembler output (elided):

_Z6my_intv:
.LFB0:
    .cfi_startproc
    .cfi_personality 0x3,__gxx_personality_v0
    movl    $5, %eax
    ret
    .cfi_endproc

...

_Z10my_int_refv:
.LFB1:
    .cfi_startproc
    .cfi_personality 0x3,__gxx_personality_v0
    movl    $_ZZ10my_int_refvE1v, %eax
    ret

The movl instructions in both are very different. The first moves 5 into EAX (which happens to be the register traditionally used to return values in x86 C code) and the second moves the address of a variable (specifics elided for clarity) into EAX. That means the calling function in the first case can just directly use register operations without hitting memory to use the answer while in the second it has to hit memory through the returned pointer.

So it looks like it's not optimized away.

This is over and above the other answers you've been given here explaining why T and const T& are not interchangeable.

Cretic answered 16/1, 2011 at 13:50 Comment(0)
A
0

int is different from const int&:

  1. const int& is a reference to another integer variable (int B), which means that if we change int B, the value seen through the const int& also changes.

  2. int is a copy of the value of another integer variable (int B), which means that if we change int B, the value of the int does not change.

See the following C++ code:

#include <iostream>
#include <vector>

int main() {
    std::vector<int> a{1, 2, 3};
    int b = a[2];        // a copy: does not change when the vector changes
    const int& c = a[2]; // a reference: its value depends on the vector
    a[2] = 111;
    std::cout << b << '\n'; // prints 3
    std::cout << c << '\n'; // prints 111
}

Aretta answered 15/5, 2019 at 4:42 Comment(0)
S
0

const int & likely still needs to pass a pointer around, which is comparable in size to an int. It is very unlikely to bring any notably better performance.

Passing const references may need some attention, to check that the value will not change unexpectedly (say, because some function you call also has access to it, or even another thread). But this is usually trivially visible, unless the variable has an unusually long life span and a very broad access scope.

Sloat answered 15/7, 2020 at 8:5 Comment(0)
