Which operator (+ vs +=) should be used for performance? (in-place vs not-in-place)

Let's say I have these two snippets of code in Python:

1 --------------------------
import numpy as np
x = np.array([1,2,3,4])
y = x
x = x + np.array([1,1,1,1])
print(y)

2 --------------------------
import numpy as np
x = np.array([1,2,3,4])
y = x
x += np.array([1,1,1,1])
print(y)

I thought the result for y would be the same in both examples, since y points to x and x becomes (2,3,4,5), BUT it wasn't.

The results were (1,2,3,4) for snippet 1 and (2,3,4,5) for snippet 2.

After some research I found out what happens in each example:

#-First example---------------------------------------
x = np.array([1,2,3,4]) # creates x --> [1,2,3,4]
y = x                   # makes y point to x
# until now we have x --> [1,2,3,4]
#                          |
#                          y
x = x + np.array([1,1,1,1])
# however this operation **creates a new array** [2,3,4,5]
# and makes x point to it instead of the first one
# so we have y --> [1,2,3,4] and x --> [2,3,4,5]

#-Second example--------------------------------------
x = np.array([1,2,3,4]) # creates x --> [1,2,3,4]
y = x                   # makes y point to x
# until now the same x --> [1,2,3,4]
#                            |
#                            y
x += np.array([1,1,1,1])
# this operation **modifies the existing array**
# so the result is
# still the same old x --> [2,3,4,5]
#                            |
#                            y
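
A quick way to see this aliasing directly is to compare object identity; here is a minimal sketch along those lines:

import numpy as np

x = np.array([1, 2, 3, 4])
y = x
print(x is y)                    # True: x and y are the same array object

x = x + np.array([1, 1, 1, 1])   # builds a new array and rebinds x to it
print(x is y)                    # False: x now points to a different array
print(np.shares_memory(x, y))    # False: the two arrays share no buffer

x = y                            # point x back at the original array
x += np.array([1, 1, 1, 1])      # modifies that array's buffer in place
print(x is y)                    # True: still the same object
print(y)                         # [2 3 4 5]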

You can find out more about this behavior (not only for this example) at this link: In-place algorithm

My question is: being aware of this behavior, why should I use the in-place approach in terms of performance? (Faster execution time? Less memory allocation? ...)

EDIT: Clarification

The (+, +=) example was just a simple way to explain the in-place idea to those who don't know it, but the question is about the use of in-place algorithms in general, not only in this case.

As another, more complex example: when loading a CSV file (say 10 million rows) into a variable and then sorting the result, is the idea of an in-place algorithm to produce the output in the same memory space that contains the input, by successively transforming that data until the output is produced? This avoids the need for twice the storage (one area for the input and an equal-sized area for the output), i.e. it uses the minimum amount of RAM, hard disk, etc.
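
For instance, with NumPy the difference between sorting in place and sorting into a copy would look roughly like the sketch below (the file name data.csv and its single numeric column are assumptions made up for illustration):

import numpy as np

# hypothetical CSV with ~10 million numeric rows in a single column
values = np.loadtxt("data.csv", delimiter=",")

# out-of-place: np.sort returns a sorted copy, so the input and the output
# both sit in memory at the same time (roughly twice the storage)
sorted_copy = np.sort(values)

# in-place: ndarray.sort() reorders the existing buffer,
# so no second full-size array is allocated
values.sort()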

Detergent answered 15/11, 2017 at 12:31 Comment(11)
I think you should use the operator that semantically makes more sense as that will perform better in terms of development time spent to make the code do what you ask (and keep the code doing what you ask). If you profile and find that you have a bottleneck somewhere, then you can optimize by doing experiments with real data and at that point you can definitively say which way is faster (or if there is no difference at all). My guess is that in most cases, numpy will have a free block of memory so there won't be a huge performance difference, but I could be wrong.Edirne
If you care about this sort of micro-optimization, don't use Python. If the job is so complex/large that a minute difference matters it's probably worth re-coding it in Java or even C.Projectile
I don't understand the question. Since there is a clear semantic difference between + and += for numpy, what does performance matter?Resign
Why the downvotes, guys? They are different operators...Palp
It's not about what gains this might give OP. This is an interesting question in general!Palp
Because += is intended to modify the underlying object, it can be made more efficient (by only modifying parts of the original object, rather than modifying a copy).Garibaldi
But he is asking about performance, knowing they do different things!Palp
@Resign the performance gain comes when I want to modify a variable. The idea is to produce an output in the same memory space that contains the input by successively transforming that data until the output is produced. This avoids the need to use twice the storage: one area for the input and an equal-sized area for the output.Detergent
Well, x and y point to the same array, so if you modify x, you also modify y. To me, this question seems to be a mix of "difference of + and += on lists" and "making a copy of a list".Resign
@Resign that's not true when you don't use +=: if you modify x with x = x + som, the program allocates a new place in memory, stores the result there, and makes x point to the new result, while y still points to the old one.Detergent
@Detergent Yes, exactly as with + vs. += for lists. The point is, with +, you do not modify x; you create a new array and assign that to x.Resign

x = x + 1 vs x += 1

Performance

It seems that you understand the semantic difference between x += 1 and x = x + 1.

For benchmarking, you can use timeit in IPython.

After defining those functions:

import numpy as np
def in_place(n):
    x = np.arange(n)
    x += 1  # modifies x's buffer in place (the scalar 1 is broadcast)

def not_in_place(n):
    x = np.arange(n)
    x = x + 1  # allocates a new array and rebinds x to it

def in_place_no_broadcast(n):
    x = np.arange(n)
    x += np.ones(n, dtype=int)  # in place, but a full-size ones array is built first

You can simply use the %timeit magic to compare their performance:

%timeit in_place(10**7)
20.3 ms ± 81.4 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit not_in_place(10**7)
30.4 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit in_place_no_broadcast(10**7)
35.4 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

not_in_place is 50% slower than in_place.

Note that broadcasting also makes a huge difference: numpy understands x += 1 as adding 1 to every single element of x, without having to create yet another array.
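
One explicit way to write the same broadcast, in-place addition is through the ufunc out= argument; a small sketch:

import numpy as np

x = np.arange(10)

# equivalent to x += 1: the scalar 1 is broadcast and the result is written
# straight into x's own buffer, so no temporary array of size len(x) is built
np.add(x, 1, out=x)
print(x)  # [ 1  2  3  4  5  6  7  8  9 10]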

Warning

in_place should be the preferred function: it's faster and uses less memory. You might run into bugs if you use and mutate this object at different places in your code, though. The typical example would be:

x = np.arange(5)
y = [x, x]
y[0][0] = 10
y
# [array([10,  1,  2,  3,  4]), array([10,  1,  2,  3,  4])]
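
If you want the speed of in-place updates but need to avoid this kind of shared-state surprise, making an explicit copy at the point of aliasing is one option; a minimal sketch:

import numpy as np

x = np.arange(5)
y = [x.copy(), x.copy()]   # independent copies instead of two references to x
y[0][0] = 10
y
# [array([10,  1,  2,  3,  4]), array([0, 1, 2, 3, 4])]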

Sorting

Your understanding of the advantages of in-place sorting is correct. It can make a huge difference in memory requirements when sorting large data sets.

There are other desirable features for a sorting algorithm (stable, acceptable worst-case complexity, ...) and it looks like the standard Python algorithm (Timsort) has many of them.

Timsort is a hybrid algorithm. Some parts of it are in-place and some require extra memory. It will never use more than n/2 extra space, though.
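
The same in-place vs copy distinction exists for plain Python lists, where Timsort is exposed through two interfaces; a minimal sketch:

data = [3, 1, 2]

new_list = sorted(data)   # Timsort into a new list: data is left untouched
data.sort()               # Timsort in place: reorders data itself, no full copy
print(new_list, data)     # [1, 2, 3] [1, 2, 3]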

Tapp answered 15/11, 2017 at 13:10 Comment(3)
Yes, I got it: with y = [x, x] you don't create new arrays, you just make both elements of y point to x, so y[0][0] = 10 also changes x to [10,1,2,3,4]...Detergent
@AnouarZ: Exactly. This behaviour can be more subtly hidden though, e.g. with this bug/feature.Tapp
I was looking into the broadcasting algorithm and the post you linked. It was written there that broadcasting may be a bad idea, so I created another question on that subject in case you have an explanation of how and when that happens.Detergent
