Why is a combination of numpy functions faster than np.mean?

I am wondering what the fastest way to compute a mean in NumPy is. I used the following code to experiment with it:

import numpy as np
import time

n = 10000
p = np.array([1] * 1000000)  # one million ones -> an integer (int64) array

t1 = time.time()
for x in range(n):
    np.divide(np.sum(p), p.size)
t2 = time.time()

print(t2-t1)

3.9222593307495117

t3 = time.time()
for x in range(n):
    np.mean(p)
t4 = time.time()

print(t4-t3)

5.271147012710571

I would have assumed that np.mean would be faster, or at least equivalent in speed; however, it looks like the combination of NumPy functions is faster than np.mean. Why is the combination of NumPy functions faster?

Lammastide answered 18/12, 2022 at 16:3 Comment(1)
Confirmed, I get the same result on my computer. That's a strange result! It boils down to np.sum() being faster than np.mean(). (The np.divide call is trivial since you're just giving it two numbers as inputs, not arrays.) I'd have to inspect the underlying implementation of the two functions to understand why. I find it surprising that under the hood, np.mean() doesn't basically run your alternative code...Decameter

For integer input, by default, numpy.mean computes the sum in float64 dtype. This prevents overflow errors, but it requires a conversion for every element.

Your code with numpy.sum only converts once, after the sum has been computed.
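
A minimal sketch of one way to verify this, using np.mean's dtype argument to force an integer accumulator (this skips the per-element cast, but an integer accumulator can overflow on large sums, which is exactly what the float64 default protects against; absolute timings are machine- and version-dependent):

import numpy as np
import timeit

p = np.array([1] * 1000000)  # int64 array on most platforms

# Default: integer input is accumulated in float64, one cast per element.
t_default = timeit.timeit(lambda: np.mean(p), number=1000)

# Keeping the accumulator integer skips the per-element cast...
t_int = timeit.timeit(lambda: np.mean(p, dtype=p.dtype), number=1000)

# ...and should land close to the manual sum-then-divide version.
t_manual = timeit.timeit(lambda: np.sum(p) / p.size, number=1000)

print(t_default, t_int, t_manual)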

Nailhead answered 18/12, 2022 at 16:41 Comment(7)
This is supported by my experiment of summing random(1000000) instead of [1] * 1000000. In that case, np.mean is actually slightly faster, or really almost exactly the same.Indemnification
Thanks for the answer! However, I am not really sure how to recreate the experiment of Dr. V. Do I have to create an array with numbers that have to be stored in float64 dtype to see the same performance?Lammastide
@KevinvanderGugten: One way would be to set p2 = p.astype('float64') and then try the timings with p2. (Make sure to put the conversion outside the timing code.)Nailhead
Thx! The results are exactly as you guys said.Lammastide
Using ipython timeit and a large float array x, the differences among np.mean(x), np.sum(x)/x.size, x.sum()/x.size and np.add.reduce(x)/x.size are statistically negligible.Bibelot
For integer arrays, np.mean(x) times about the same as np.add.reduce(x, dtype=float)/x.size (see the sketch after these comments).Bibelot
On np 1.26.4 I cannot reproduce this. np.sum(x) / len(x) always seems faster. Using np.arange(20, dtype=np.float64).Towhaired
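
A quick, self-contained sketch of the comparisons described in the comments above (np.add.reduce and its dtype argument are standard NumPy; absolute timings are machine- and version-dependent):

import numpy as np
import timeit

x = np.random.random(1000000)  # float64 input
i = np.arange(1000000)         # int64 input

# For float input, these reductions should time roughly the same:
print(timeit.timeit(lambda: np.mean(x), number=100))
print(timeit.timeit(lambda: np.add.reduce(x) / x.size, number=100))

# For integer input, np.mean should roughly match a float-accumulated reduce:
print(timeit.timeit(lambda: np.mean(i), number=100))
print(timeit.timeit(lambda: np.add.reduce(i, dtype=float) / i.size, number=100))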

The reason the combination of np.sum and np.divide is faster than np.mean in this case is that np.mean is implemented in Python, while np.sum and np.divide are implemented in C, which makes them faster.

np.mean calls np.sum internally and then divides the result by the size of the array, so the combination of np.sum and np.divide is essentially doing the same thing as np.mean, but without the overhead of calling a Python function.

In general, it is usually faster to use NumPy functions that are implemented in C rather than Python, especially for large arrays.

Dactylogram answered 18/12, 2022 at 16:39 Comment(7)
I don't really believe this explanation. The overhead is constant and negligible.Indemnification
@PranavHosangadi github.com/numpy/numpy/blob/master/numpy/core/… | github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/… | github.com/numpy/numpy/blob/master/numpy/core/src/umath/… You can see that np.mean is a Python function, while np.sum and np.divide are C functions. This is why np.sum and np.divide are generally faster than np.mean for large arrays.Dactylogram
@Dr.V Many of Python's built-in functions are written in C, which makes them much faster than a pure python solution.Dactylogram
While np.mean is Python, most of that code is setup code. What about _methods._mean? np.sum also has setup code that ends up delegating to a wrapped add/sum. Calling layers can be a big part of the evaluation time for small arrays, but that becomes negligible when working with large ones.Bibelot
@Bibelot np.mean is both Python and C code. The implementation of the np.mean function in NumPy consists of both Python code and C code. The Python code handles the input parsing, error checking, and dispatching to the appropriate C implementation based on the input data type and other factors. The C code contains the actual implementation of the mean calculation for different data types. The _methods._mean function you mentioned is a part of the Python code for np.mean. Too bad comments are limited; I could go on with this. The best thing to do is just to look up the source on GitHub.Dactylogram
I can easily see the source for np.mean from the docs or ipython ??. _mean takes just a bit more searching. It looks like the core, size-dependent calculation is done with np.add.reduce. I believe x.sum() also uses add.reduce, with a small function-call time delay. np.sum has a larger dispatch delay.Bibelot
You can edit your answer to include code - if you think that's significant.Bibelot
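
Taking up that suggestion, here is a minimal sketch that separates the two explanations: a fixed Python-level call overhead would show up as a roughly constant gap between np.mean and np.sum(...)/size regardless of array size, while a per-element cost (the float64 conversion from the accepted answer) makes the gap grow with the array (timings are machine-dependent):

import numpy as np
import timeit

# A constant call overhead would give a constant gap between the two
# timings; a per-element conversion cost makes the gap grow with n.
for n in (1000, 1000000):
    p = np.ones(n, dtype=np.int64)
    t_mean = timeit.timeit(lambda: np.mean(p), number=1000)
    t_sum = timeit.timeit(lambda: np.sum(p) / p.size, number=1000)
    print(n, t_mean, t_sum, t_mean - t_sum)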
