np.mean() vs np.average() in Python NumPy?
P

5

261

I notice that

In [30]: np.mean([1, 2, 3])
Out[30]: 2.0

In [31]: np.average([1, 2, 3])
Out[31]: 2.0

However, there should be some differences, since after all they are two different functions.

What are the differences between them?

Protolanguage answered 18/11, 2013 at 17:43 Comment(4)
Actually, the documentation doesn't make it immediately clear, as far as I can see. Not saying it is impossible to tell, but I think this question is valid for Stack Overflow all the same.Reagan
numpy.mean : Returns the average of the array elements.Tarsus
@joaquin: "Compute the arithmetic mean along the specified axis." vs "Compute the weighted average along the specified axis."?Brittenybrittingham
@Brittenybrittingham Right. I was just trying to make a funny response to your comment: if I follow your instructions, the first thing I read in the docs for numpy.mean is "numpy.mean : Returns the average of the array elements", which is funny if you are looking for the answer to the OP's question.Tarsus
C
236

np.average takes an optional weights parameter. If it is not supplied, the two functions are equivalent. Take a look at the source code of both:

np.mean:

try:
    mean = a.mean
except AttributeError:
    return _wrapit(a, 'mean', axis, dtype, out)
return mean(axis, dtype, out)

np.average:

...
if weights is None:
    avg = a.mean(axis)
    scl = avg.dtype.type(a.size/avg.size)
else:
    #code that does weighted mean here

if returned: #returned is another optional argument
    scl = np.multiply(avg, 0) + scl
    return avg, scl
else:
    return avg
...
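
To see the unweighted branch from the caller's side, here is a minimal sketch. The array `data` and the variable names are made up for illustration. It assumes a reasonably recent NumPy and only shows that, without weights, np.average gives the same numbers as np.mean, and that returned=True additionally gives back the sum of weights (the number of averaged elements when no weights are passed).

```python
import numpy as np

# Hypothetical sample data, not from the original answer.
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

# Without weights, both calls reduce to the arithmetic mean along the axis.
mean_result = np.mean(data, axis=0)
avg_result = np.average(data, axis=0)

# returned=True also returns the sum of weights; with no weights given,
# that is the number of elements averaged per output entry (2 here).
avg_again, sum_of_weights = np.average(data, axis=0, returned=True)

print(mean_result, avg_result, sum_of_weights)
```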
Chesser answered 18/11, 2013 at 17:51 Comment(3)
Why do they offer two different functions? Seems they should just offer np.average since weights is already optional. Seems unnecessary and only serves to confuse users.Precentor
@Precentor I would rather have them throw a NotImplementedException for "average", to educate users that the arithmetic mean is not identical to "the average".Pyrogenous
@Precentor this and that answers actually tell you why there is a need for these two functions.Ayers
H
48

np.mean always computes an arithmetic mean, and has some additional options for input and output (e.g. what datatypes to use, where to place the result).

np.average can compute a weighted average if the weights parameter is supplied.
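
As a small illustration of the weights parameter (the values and weights below are invented for the example), the weighted average is sum(w * x) / sum(w), so it can differ from the plain mean of the same data:

```python
import numpy as np

# Hypothetical values and weights, chosen so the arithmetic is easy to check.
values = np.array([1.0, 2.0, 3.0])
weights = np.array([3.0, 1.0, 0.0])

plain = np.mean(values)                         # (1 + 2 + 3) / 3
weighted = np.average(values, weights=weights)  # (3*1 + 1*2 + 0*3) / (3 + 1 + 0)

# The same weighted average written out by hand.
by_hand = np.sum(values * weights) / np.sum(weights)

print(plain, weighted, by_hand)
```

A weight of zero removes an element from the result entirely, which is why 3.0 has no effect on the weighted value here.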

Hylan answered 18/11, 2013 at 17:50 Comment(0)
A
33

In some versions of NumPy there is another important difference that you must be aware of:

np.average does not take masks into account, so it computes the average over the whole set of data.

np.mean takes masks into account, so it computes the mean only over the unmasked values.

g = [1,2,3,55,66,77]
f = np.ma.masked_greater(g,5)

np.average(f)
Out: 34.0

np.mean(f)
Out: 2.0
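
If you do not want to depend on how a given NumPy version treats masked input in np.average, a hedged sketch of mask-aware alternatives is below. It reuses g and f from the snippet above and relies only on np.ma.average, the masked array's own mean method, and compressed(), which returns the unmasked values as a plain array.

```python
import numpy as np

g = [1, 2, 3, 55, 66, 77]
f = np.ma.masked_greater(g, 5)  # masks 55, 66 and 77

# Mask-aware average from the masked-array module.
masked_avg = np.ma.average(f)

# The masked array's own mean method also skips masked entries.
masked_mean = f.mean()

# Dropping the masked entries first makes the intent explicit.
plain_mean = np.mean(f.compressed())

print(masked_avg, masked_mean, plain_mean)
```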
Angst answered 5/8, 2016 at 7:40 Comment(3)
Note: np.ma.average works. Also, there is a bug report.Monophyletic
np.average and np.mean both take masks into account. I've tried it and got "Out: 2.0" for both.Lanni
@Lanni The new version probably fixed the bug. Thanks for reporting.Angst
B
13

In addition to the differences already noted, there's another extremely important difference that I just now discovered the hard way: unlike np.mean, np.average doesn't allow the dtype keyword, which is essential for getting correct results in some cases. I have a very large single-precision array that is accessed from an h5 file. If I take the mean along axes 0 and 1, I get wildly incorrect results unless I specify dtype='float64':

>T.shape
(4096, 4096, 720)
>T.dtype
dtype('<f4')

m1 = np.average(T, axis=(0,1))                #  garbage
m2 = np.mean(T, axis=(0,1))                   #  the same garbage
m3 = np.mean(T, axis=(0,1), dtype='float64')  # correct results

Unfortunately, unless you know what to look for, you can't necessarily tell your results are wrong. I will never use np.average again for this reason but will always use np.mean(.., dtype='float64') on any large array. If I want a weighted average, I'll compute it explicitly using the product of the weight vector and the target array and then either np.sum or np.mean, as appropriate (with appropriate precision as well).
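
A scaled-down sketch of this effect follows. The array T_small, its size, and the weights w are hypothetical stand-ins for the much larger h5-backed array. For float32 input, np.mean accumulates in float32 unless dtype is given, and the size of the resulting error can vary with array shape, memory layout, and NumPy version, so the code prints the errors rather than promising specific values. The last lines show one way to do the explicit weighted average described above with a float64 accumulator.

```python
import numpy as np

# Hypothetical stand-in for the large single-precision array (about 8 MB).
n_rows = 1_000_000
T_small = np.full((n_rows, 2), 0.1, dtype=np.float32)

m32 = np.mean(T_small, axis=0)                    # float32 accumulator (default)
m64 = np.mean(T_small, axis=0, dtype=np.float64)  # float64 accumulator

print("float32 accumulator:", m32, "abs error:", np.abs(m32 - 0.1))
print("float64 accumulator:", m64, "abs error:", np.abs(m64 - 0.1))

# Explicit weighted average along axis 0, accumulated in float64.
w = np.linspace(1.0, 2.0, n_rows)  # hypothetical float64 weights, one per row
weighted = np.sum(T_small * w[:, None], axis=0, dtype=np.float64) / np.sum(w)
print("weighted, float64:", weighted)
```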

Bik answered 19/5, 2020 at 17:23 Comment(1)
Very surprising. Do you know why this happens, and can you file a bug report? ThanksOceania
L
4

In your invocation, the two functions are the same.

average can compute a weighted average though.

Doc links: mean and average

Lionel answered 18/11, 2013 at 17:50 Comment(0)
