Columnwise sum of array - two methods, two different results

In this example, the column-wise sum of an array pr is computed in two different ways:

(a) take the sum over the first axis using p.sum's axis parameter

(b) slice the array along the the second axis and take the sum of each slice

import matplotlib.pyplot as plt
import numpy as np


m = 100
n = 2000
x = np.random.random_sample((m, n))

X = np.abs(np.fft.rfft(x)).T
frq = np.fft.rfftfreq(n)

total = X.sum(axis=0)
c = frq @ X / total

df = frq[:, None] - c
pr = df * X


a = np.sum(pr, axis=0)
b = [np.sum(pr[:, i]) for i in range(m)]

fig, ax = plt.subplots(1)
ax.plot(a)
ax.plot(b)
plt.show()

Both methods should return the same, but for whatever reason, in this example, they do not. As you can see in the plot below, a and b have totally different values. The difference is, however, so small that np.allclose(a, b) is True.

If you replace pr with some small random values, there is no difference between the two summation methods:

pr = np.random.randn(n, m) / 1e12
a = np.sum(pr, axis=0)
b = np.array([np.sum(pr[:, i]) for i in range(m)])

fig, ax = plt.subplots(1)
ax.plot(a)
ax.plot(b)
plt.show()

The second example indicates that the differences in the sums of the first example are not related to the summation methods. Then, is this a problem relate to floating point value summation? If so, why doesn't such an effect occure in the second example?

Why do the colum-wise sums differ in the first example, and which one is correct?

Recommended topics

Hot tags