np.mean() vs np.average() in Python NumPy?

Asked 18/11, 2013 at 17:43 Answered 19/5, 2020 at 17:23

Solved python numpy statistics average mean

261

I notice that

In [30]: np.mean([1, 2, 3])
Out[30]: 2.0

In [31]: np.average([1, 2, 3])
Out[31]: 2.0

However, there should be some differences, since after all they are two different functions.

What are the differences between them?

Protolanguage answered 18/11, 2013 at 17:43 Comment(4)

Actually, the documentation doesn't make it immediately clear, as far as I can see. Not saying it is impossible to tell, but I think this question is valid for Stack Overflow all the same. – Reagan 18/11, 2013 at 17:47

numpy.mean : Returns the average of the array elements. – Tarsus 18/11, 2013 at 17:47

@joaquin: "Compute the arithmetic mean along the specified axis." vs "Compute the weighted average along the specified axis."? – Brittenybrittingham 19/11, 2013 at 0:1

@Brittenybrittingham Right. I was just trying to make a somewhat funny response to your comment: if I follow your instructions, the first thing I read in the docs for numpy.mean is "numpy.mean : Returns the average of the array elements", which is funny if you are looking for the answer to the OP's question. – Tarsus 19/11, 2013 at 16:5

236

np.average takes an optional weights parameter. If it is not supplied, the two functions are equivalent. Take a look at the source code: Mean, Average

np.mean:

try:
    mean = a.mean
except AttributeError:
    return _wrapit(a, 'mean', axis, dtype, out)
return mean(axis, dtype, out)

np.average:

...
if weights is None :
    avg = a.mean(axis)
    scl = avg.dtype.type(a.size/avg.size)
else:
    #code that does weighted mean here

if returned: #returned is another optional argument
    scl = np.multiply(avg, 0) + scl
    return avg, scl
else:
    return avg
...
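As a quick check of the excerpt above, here is a minimal sketch (the sample values are made up for illustration) showing that np.average without weights matches np.mean, and what the optional returned flag adds:

```python
import numpy as np

data = [1, 2, 3]  # illustrative values, not from any real dataset

# Without weights, np.average falls back to the plain arithmetic mean.
assert np.average(data) == np.mean(data)

# returned=True also gives back the sum of the weights;
# with no weights this is the number of elements averaged.
avg, weight_sum = np.average(data, returned=True)
print(avg, weight_sum)
```

With no weights, the second value is simply the element count, which is what the scl line in the excerpt computes.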

Chesser answered 18/11, 2013 at 17:51 Comment(3)

Why do they offer two different functions? Seems they should just offer np.average since weights is already optional. Seems unnecessary and only serves to confuse users. – Precentor 30/11, 2015 at 22:3

@Precentor I would rather have them throw a NotImplementedException for "average", to educate users that the arithmetic mean is not identical to "the average". – Pyrogenous 26/6, 2018 at 11:15

@Precentor this and that answers actually tell you why there is a need for these two functions. – Ayers 8/11, 2023 at 18:14

np.mean always computes an arithmetic mean, and has some additional options for input and output (e.g. what datatypes to use, where to place the result).

np.average can compute a weighted average if the weights parameter is supplied.
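A minimal sketch of both points, using made-up values and weights: np.average with the weights parameter, and np.mean with its dtype and out options.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0])
weights = np.array([3.0, 2.0, 1.0])  # hypothetical weights for illustration

# Weighted average: sum(values * weights) / sum(weights) = 10 / 6
weighted = np.average(values, weights=weights)

# np.mean has no weights argument, but it accepts dtype and out.
grid = np.arange(6, dtype=np.float32).reshape(2, 3)
result = np.empty(3, dtype=np.float64)               # preallocated output buffer
np.mean(grid, axis=0, dtype=np.float64, out=result)  # accumulate in float64

print(weighted, result)
```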

Hylan answered 18/11, 2013 at 17:50 Comment(0)

In some versions of NumPy there is another important difference that you must be aware of:

average does not take masks into account, so it computes the average over the whole set of data.

mean takes masks into account, so it computes the mean only over unmasked values.

import numpy as np

g = [1, 2, 3, 55, 66, 77]
f = np.ma.masked_greater(g, 5)  # mask every value greater than 5

np.average(f)
Out: 34.0

np.mean(f)
Out: 2.0
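Since the behaviour of np.average on masked arrays appears to depend on the NumPy version, one way to avoid relying on it is to use the masked-array routines explicitly. A minimal sketch, reusing the same data as above:

```python
import numpy as np

g = [1, 2, 3, 55, 66, 77]
f = np.ma.masked_greater(g, 5)   # masks 55, 66 and 77

masked_avg = np.ma.average(f)    # averages only the unmasked values 1, 2, 3
masked_mean = f.mean()           # MaskedArray method, also skips masked values

# To average everything regardless of the mask, use the underlying data.
full_avg = np.average(f.data)

print(masked_avg, masked_mean, full_avg)
```

Here the choice between the masked and unmasked result is explicit in the code rather than dependent on which version is installed.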

Angst answered 5/8, 2016 at 7:40 Comment(3)

Note: np.ma.average works. Also, there is a bug report. – Monophyletic 29/3, 2017 at 1:53

np.average and np.mean both take masks into account. I've tried it and got "Out: 2.0" for both. – Lanni 30/6, 2022 at 14:40

@Lanni The new version has probably fixed the bug. Thanks for reporting. – Angst 30/6, 2022 at 16:42

In addition to the differences already noted, there's another extremely important difference that I just now discovered the hard way: unlike np.mean, np.average doesn't allow the dtype keyword, which is essential for getting correct results in some cases. I have a very large single-precision array that is accessed from an h5 file. If I take the mean along axes 0 and 1, I get wildly incorrect results unless I specify dtype='float64':

>>> T.shape
(4096, 4096, 720)
>>> T.dtype
dtype('<f4')

m1 = np.average(T, axis=(0,1))                #  garbage
m2 = np.mean(T, axis=(0,1))                   #  the same garbage
m3 = np.mean(T, axis=(0,1), dtype='float64')  # correct results

Unfortunately, unless you know what to look for, you can't necessarily tell your results are wrong. I will never use np.average again for this reason but will always use np.mean(.., dtype='float64') on any large array. If I want a weighted average, I'll compute it explicitly using the product of the weight vector and the target array and then either np.sum or np.mean, as appropriate (with appropriate precision as well).
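The explicit weighted average described above could look roughly like the following sketch. The helper name weighted_mean_f64 is hypothetical, and it assumes 1-D weights applied along a single axis; it is not how np.average is implemented.

```python
import numpy as np

def weighted_mean_f64(a, weights, axis):
    """Weighted mean along one axis, accumulated in float64.

    Hypothetical helper: assumes `weights` is 1-D with length a.shape[axis].
    """
    a = np.asarray(a)
    w = np.asarray(weights, dtype=np.float64)
    # Reshape the weights so they broadcast along the chosen axis only.
    shape = [1] * a.ndim
    shape[axis] = a.shape[axis]
    w = w.reshape(shape)
    # a * w is promoted to float64 because w is float64,
    # so both the products and the sums use double precision.
    return np.sum(a * w, axis=axis) / np.sum(w, axis=axis)

# Small single-precision example with made-up numbers.
T_small = np.array([[1, 2], [3, 4]], dtype=np.float32)
wm = weighted_mean_f64(T_small, [1, 3], axis=0)
print(wm, wm.dtype)
```

Note that a * w creates a temporary float64 array the size of the input, which may matter for arrays as large as the one in this answer.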

Bik answered 19/5, 2020 at 17:23 Comment(1)

Very surprising. Do you know why this happens, and can you file a bug report? Thanks – Oceania 22/9, 2020 at 13:48

In your invocation, the two functions are the same.

average can compute a weighted average though.

Doc links: mean and average

Lionel answered 18/11, 2013 at 17:50 Comment(0)
