The relevant issue is JDK-8214761: Bug in parallel Kahan summation implementation
Since the bug report mentions that DoubleSummaryStatistics is affected as well, we can construct an example that eliminates all other influences:
import java.util.DoubleSummaryStatistics;

public class Main {
    public static void main(String[] args) {
        // Feed the four values directly, without streams or collectors
        DoubleSummaryStatistics s = new DoubleSummaryStatistics();
        s.accept(27.19);
        s.accept(18.97);
        s.accept(6.44);
        s.accept(106.36);
        System.out.println(System.getProperty("java.version") + ": " + s.getAverage());
    }
}
which I used to produce
1.8.0_162: 39.74000000000001
17: 39.74000000000001
(with the release version of Java 17)
and
17.0.2: 39.739999999999995
which matches the version into which the fix was backported.
Generally, the contract of the method does not require the result to match what you get by simply adding the values and dividing by the count. The implementation is free to apply an error correction. It is also important to keep in mind that floating-point addition is not strictly associative, yet we have to treat it as associative to be able to support parallel processing.
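To make the two points above concrete, here is a minimal sketch. It assumes the textbook form of Kahan summation, not the JDK's actual code, and the class and method names (SummationSketch, naiveSum, kahanSum) are made up for illustration. It shows that regrouping an addition changes the result and that a compensated sum can keep small contributions that a plain sum drops:

```java
import java.util.Arrays;

class SummationSketch {
    // Plain left-to-right summation; every addition may round.
    static double naiveSum(double... values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    // Textbook Kahan summation (an assumption for illustration, not the JDK source):
    // c carries the rounding error of the previous addition.
    static double kahanSum(double... values) {
        double sum = 0.0;
        double c = 0.0;
        for (double v : values) {
            double y = v - c;   // apply the pending correction
            double t = sum + y; // low-order bits of y may be lost here
            c = (t - sum) - y;  // recover what was lost (with negated sign)
            sum = t;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Same operands, different grouping, different result
        System.out.println((0.1 + 0.2) + 0.3); // 0.6000000000000001
        System.out.println(0.1 + (0.2 + 0.3)); // 0.6

        // One large value followed by ten values too small to register individually
        double[] values = new double[11];
        Arrays.fill(values, 1e-16);
        values[0] = 1.0;
        System.out.println("naive: " + naiveSum(values)); // naive: 1.0
        System.out.println("kahan: " + kahanSum(values)); // slightly above 1.0
    }
}
```

Since a parallel reduction groups the additions differently from a sequential loop, small deviations of this kind are to be expected whenever the partial results are combined.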
We may even verify that the change is an improvement:
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.DoubleSummaryStatistics;

public class Main {
    public static void main(String[] args) {
        DoubleSummaryStatistics s = new DoubleSummaryStatistics();
        s.accept(27.19);
        s.accept(18.97);
        s.accept(6.44);
        s.accept(106.36);
        double average = s.getAverage();
        System.out.println(System.getProperty("java.version") + ": " + average);
        // Exact decimal reference value for comparison
        BigDecimal d = new BigDecimal("27.19");
        d = d.add(new BigDecimal("18.97"));
        d = d.add(new BigDecimal("6.44"));
        d = d.add(new BigDecimal("106.36"));
        BigDecimal realAverage = d.divide(BigDecimal.valueOf(4), MathContext.UNLIMITED);
        System.out.println("actual: " + realAverage
            + ", error: " + realAverage.subtract(BigDecimal.valueOf(average)).abs());
    }
}
which prints, e.g.
1.8.0_162: 39.74000000000001
actual: 39.74, error: 1E-14
17.0.2: 39.739999999999995
actual: 39.74, error: 5E-15
Note that this is the error of the decimal representations as printed. If you want to know how close the actual double representation is to the correct value, you have to replace BigDecimal.valueOf(average) with new BigDecimal(average). Then the gap between the two errors is a bit smaller; however, the new algorithm is still closer to the correct value in both comparisons.
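As a rough sketch of the two ways of measuring the error, the following compares both for the two averages printed above. The class and method names (ErrorMeasureSketch, printedError, binaryError) are made up for illustration; the only API facts relied on are that BigDecimal.valueOf(double) goes through the shortest decimal string of the double, while new BigDecimal(double) converts its exact binary value:

```java
import java.math.BigDecimal;

class ErrorMeasureSketch {
    // Error of the decimal string that Double.toString would print
    static BigDecimal printedError(double approx, BigDecimal exact) {
        return exact.subtract(BigDecimal.valueOf(approx)).abs();
    }

    // Error of the exact binary value stored in the double
    static BigDecimal binaryError(double approx, BigDecimal exact) {
        return exact.subtract(new BigDecimal(approx)).abs();
    }

    public static void main(String[] args) {
        BigDecimal exact = new BigDecimal("39.74");
        double before = 39.74000000000001;  // average printed before the fix
        double after = 39.739999999999995;  // average printed by 17.0.2
        for (double d : new double[] {before, after}) {
            System.out.println(d + ": printed error " + printedError(d, exact)
                    + ", binary error " + binaryError(d, exact));
        }
    }
}
```

The binary errors come out as long digit strings because every double has an exact, finite decimal expansion; comparing them with compareTo is more practical than reading them by eye.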
average (or streams in general) changed between versions. I would try to reproduce the problem without any other APIs - just with plain double operations. – Savanna