Columnwise sum of array - two methods, two different results
In this example, the column-wise sum of an array pr is computed in two different ways:

(a) take the sum over the first axis using np.sum's axis parameter

(b) slice the array along the second axis and take the sum of each slice

import matplotlib.pyplot as plt
import numpy as np


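m = 100   # number of signals
n = 2000  # samples per signal
x = np.random.random_sample((m, n))

# Magnitude spectra, transposed to shape (n // 2 + 1, m): one column per signal
X = np.abs(np.fft.rfft(x)).T
frq = np.fft.rfftfreq(n)

# Magnitude-weighted mean frequency of each column
total = X.sum(axis=0)
c = frq @ X / total

# Deviation of every frequency bin from that mean, weighted by magnitude
df = frq[:, None] - c
pr = df * X


# (a) reduce over the first axis, (b) sum each column slice separately
a = np.sum(pr, axis=0)
b = np.array([np.sum(pr[:, i]) for i in range(m)])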

fig, ax = plt.subplots(1)
ax.plot(a)
ax.plot(b)
plt.show()

Both methods should return the same result, but in this example they do not. As the plot below shows, a and b take completely different values. In absolute terms, however, the difference is so small that np.allclose(a, b) is True.

Plot of a and b

If you replace pr with some small random values, there is no difference between the two summation methods:

pr = np.random.randn(n, m) / 1e12
a = np.sum(pr, axis=0)
b = np.array([np.sum(pr[:, i]) for i in range(m)])

fig, ax = plt.subplots(1)
ax.plot(a)
ax.plot(b)
plt.show()

Plot of a and b with pr set to random values

The second example suggests that the differences in the first example are not caused by the summation methods themselves. Is this, then, a problem related to floating-point summation? If so, why doesn't the effect occur in the second example?

Why do the column-wise sums differ in the first example, and which one is correct?

Senhorita answered 2/4, 2019 at 12:0 Comment(1)
Does numpy sum guarantee the order of summation? - Diversified

For an explanation of why the results differ, see https://mcmap.net/q/332448/-what-causes-different-in-array-sum-along-axis-for-c-versus-f-ordered-arrays-in-numpy. In short, the slice case uses pairwise summation, while the axis case does not.
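To make the distinction concrete, here is a minimal pure-Python sketch of the two strategies. The helpers naive_sum and pairwise_sum are hypothetical names written for illustration. NumPy's internal implementation is blocked and unrolled and differs in detail, so this only illustrates the idea that the order of additions changes the rounding error.

```python
import math
import numpy as np


def naive_sum(values):
    """Add the terms one after another, left to right."""
    acc = 0.0
    for v in values:
        acc += v
    return acc


def pairwise_sum(values, block=8):
    """Split the input in halves recursively and add the two partial sums.

    Runs shorter than `block` (an arbitrary cutoff chosen for this sketch)
    are summed naively.
    """
    n = len(values)
    if n <= block:
        return naive_sum(values)
    half = n // 2
    return pairwise_sum(values[:half], block) + pairwise_sum(values[half:], block)


rng = np.random.default_rng(0)
col = rng.standard_normal(100_000)

# math.fsum tracks exact partial sums and serves as the reference here
reference = math.fsum(col)
print("naive error:   ", abs(naive_sum(col) - reference))
print("pairwise error:", abs(pairwise_sum(col) - reference))
```

Both results agree with the reference to many digits. They can differ from each other in the last few bits, which is the kind of discrepancy seen between a and b.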

Which one is correct? Well, probably neither, but pairwise summation is expected to be more accurate.
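It may also help to see why only the first example looks so dramatic. Since c is the magnitude-weighted mean of frq for each column, the exact column sum of (frq - c) * X is frq @ X - c * total = 0. The first plot therefore shows pure rounding noise around zero, whereas in the second example the sums are far larger than their rounding errors. The sketch below compares the two cases. The helper compare, the fixed seed, and the explicit np.ascontiguousarray call (to pin down a row-major layout) are assumptions of this sketch, not part of the question's code.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
m, n = 100, 2000

# First example from the question
x = rng.random((m, n))
X = np.abs(np.fft.rfft(x)).T
frq = np.fft.rfftfreq(n)
c = frq @ X / X.sum(axis=0)
# Assumption: force C (row-major) layout so the memory order is explicit
pr_first = np.ascontiguousarray((frq[:, None] - c) * X)

# Second example from the question
pr_second = rng.standard_normal((n, m)) / 1e12


def compare(pr):
    """Return (size of the sums relative to the data, relative a/b mismatch)."""
    a = np.sum(pr, axis=0)
    b = np.array([np.sum(pr[:, i]) for i in range(pr.shape[1])])
    rel_size = np.abs(a).max() / np.abs(pr).max()
    rel_diff = np.abs(a - b).max() / np.abs(a).max()
    return rel_size, rel_diff


rel_size_1, rel_diff_1 = compare(pr_first)
rel_size_2, rel_diff_2 = compare(pr_second)
print("first example: ", rel_size_1, rel_diff_1)
print("second example:", rel_size_2, rel_diff_2)
```

In the first example the sums should be many orders of magnitude smaller than the entries of pr, and the mismatch between a and b is comparable to the sums themselves. In the second example the mismatch is negligible relative to the sums, which is why the two curves overlap in the plot.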

Indeed, we can see that it is fairly close to the exact (within machine precision) result obtained using math.fsum.
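A comparison along these lines can be sketched as follows. The setup repeats the question's first example with a fixed seed; the seed, the explicit np.ascontiguousarray call, and the variable name exact are assumptions of this sketch. The exact numbers will vary with the random input and possibly with the NumPy version.

```python
import math
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
m, n = 100, 2000

x = rng.random((m, n))
X = np.abs(np.fft.rfft(x)).T
frq = np.fft.rfftfreq(n)
c = frq @ X / X.sum(axis=0)
# Assumption: force C (row-major) layout so the memory order is explicit
pr = np.ascontiguousarray((frq[:, None] - c) * X)

a = np.sum(pr, axis=0)                                      # axis reduction
b = np.array([np.sum(pr[:, i]) for i in range(m)])          # per-slice sums
exact = np.array([math.fsum(pr[:, i]) for i in range(m)])   # reference

print("max |a - exact|:", np.abs(a - exact).max())
print("max |b - exact|:", np.abs(b - exact).max())
```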


Stagner answered 2/4, 2019 at 20:40 Comment(0)
