IntStream leads to array elements being wrongly set to 0 (JVM Bug, Java 11)
Asked Answered
S

1

21

In the class P below, the method test seems to return identically false:

import java.util.function.IntPredicate;
import java.util.stream.IntStream;

public class P implements IntPredicate {
    private final static int SIZE = 33;

    @Override
    public boolean test(int seed) {
        int[] state = new int[SIZE];
        state[0] = seed;
        for (int i = 1; i < SIZE; i++) {
            state[i] = state[i - 1];
        }
        return seed != state[SIZE - 1];
    }

    public static void main(String[] args) {
        long count = IntStream.range(0, 0x0010_0000).filter(new P()).count();
        System.out.println(count);
    }
}

Combining class P with IntStream, however, the method test can (wrongly) return true. The code in the main method above results in some positive integer, like 716208. The result changes after every execution.

This unexpected behavior occurs because the int array state[] can be set to zero during the execution. If a testing code, such as

if (seed == 0xf_fff0){
    System.out.println(Arrays.toString(state));
} 

is inserted at the tail of the method test, then the program will output a line like [1048560, 1048560, 1048560, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0].

Question: Why can the int array state[] be set to zero?

I already know how to avoid this behavior: just replacing int[] with ArrayList.

I examined in:

  • windows 10 + and debian 10+ with OpenJDK Runtime Environment (build 15.0.1+9-18) OpenJDK 64-Bit Server VM (build 15.0.1+9-18, mixed mode, sharing)
  • debian 9 + OpenJDK Runtime Environment AdoptOpenJDK (build 13.0.1+9) OpenJDK 64-Bit Server VM AdoptOpenJDK (build 13.0.1+9, mixed mode, sharing)
Sleeve answered 21/12, 2020 at 14:51 Comment(0)
A
24

One can reproduce the problem with an even simpler example, namely:

class Main {
    private final static int SIZE = 33;

    public static boolean test2(int seed) {
        int[] state = new int[SIZE];
        state[0] = seed;
        for (int i = 1; i < SIZE; i++) {
            state[i] = state[i - 1];
        }
        return seed != state[SIZE - 1];
    }

    public static void main(String[] args) {
        long count = IntStream.range(0, 0x0010_0000).filter(Main::test2).count();
        System.out.println(count);

    }
}

The problem is caused by the JVM optimization flag that allows the vectorization (SIMD) of loops (i.e., -XX:+AllowVectorizeOnDemand). Likely arises from applying vectorization on the same array with intersecting ranges (i.e., state[i] = state[i - 1];). A similar issue would be possible to reproduce if the JVM would (for some of the elements of the IntStream.range(0, 0x0010_0000)), optimize the loop:

   for (int i = 1; i < SIZE; i++)
       state[i] = state[i - 1];

into:

    System.arraycopy(state, 0, state, 1, SIZE - 1);

For instance:

class Main {
    private final static int SIZE = 33;

    public static boolean test2(int seed) {
        int[] state = new int[SIZE];
        state[0] = seed;
        System.arraycopy(state, 0, state, 1, SIZE - 1);
        if(seed == 100)
           System.out.println(Arrays.toString(state));
        return seed != state[SIZE - 1];
    }

    public static void main(String[] args) {
        long count = IntStream.range(0, 0x0010_0000).filter(Main::test2).count();
        System.out.println(count);
    }
}

output:

[100, 100, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

NEW UPDATE: 01/01/2021

I have sent an email to one of the Developers involved in the implementation/integration of that flag -XX:+AllowVectorizeOnDemandand received the following reply:

It is known that part of AllowVectorizeOnDemand code is broken.

There was fix (it excluded executing broken code which does incorrect vectorization) which was backported into jdk 11.0.11:

https://hg.openjdk.java.net/jdk-updates/jdk11u-dev/rev/69dbdd271e04

If you can, try build and test latest OpenJDK11u from https://hg.openjdk.java.net/jdk-updates/jdk11u-dev/

From the first link, one can read the following:

@bug 8251994 @summary Test vectorization of Streams$RangeIntSpliterator::forEachRemaining @requires vm.compiler2.enabled & vm.compMode != "Xint"

@run main compiler.vectorization.TestForEachRem test1 @run main compiler.vectorization.TestForEachRem test2 @run main compiler.vectorization.TestForEachRem test3 @run main compiler.vectorization.TestForEachRem test4

From the comments on the JIRA story on that bug, one can read:

I found the cause of the issue. To improve a chance to vectorize a loop, superword tries to hoist loads to the beginning of loop by replacing their memory input with corresponding (same memory slice) loop's memory Phi : http://hg.openjdk.java.net/jdk/jdk/file/8f73aeccb27c/src/hotspot/share/opto/superword.cpp#l471

Originally loads are ordered by corresponding stores on the same memory slice. But when they are hoisted they loose that ordering - nothing enforce the order. In test6 case the ordering is preserved (luckily?) after hoisting only when vector size is 32 bytes (avx2) but they become unordered with 16 (avx=0 or avx1) or 64 (avx512) bytes vectors. (...)

I have simple fix (use original loads ordering indexes) but looking on the code which causing the issue I see that it is bogus/incomplete - it does not help cases listed for JDK-8076284 changes:

https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-April/017645.html

Using unrolling and cloning information to vectorize is interesting idea but as I see it is not complete. Even if pack_parallel() method is able created packs they are all removed by filter_packs() method. And additionally the above cases are vectorized without hoisting loads and pack_parallel - I verified it. That code is useless now and I will put it under flag to not run it. It needs more work to be useful. I reluctant to remove the code because may be in a future we will have time to invest into it.

This might explain why when I was comparing the assembly of the versions with and without the flag -XX:+AllowVectorizeOnDemand, I notice that the version with the flag for the following code:

   for (int i = 1; i < SIZE; i++)
       state[i] = state[i - 1];

(that I extract on a method called hotstop to facilitate looking for it in the assembly), had:

00000001162bacf5: mov    %r8d,0x10(%rsi,%r10,4)
0x00000001162bacfa: mov    %r8d,0x14(%rsi,%r10,4)
0x00000001162bacff: mov    %r8d,0x18(%rsi,%r10,4)
0x00000001162bad04: mov    %r8d,0x1c(%rsi,%r10,4)
0x00000001162bad09: mov    %r8d,0x20(%rsi,%r10,4)
0x00000001162bad0e: mov    %r8d,0x24(%rsi,%r10,4)
0x00000001162bad13: mov    %r8d,0x28(%rsi,%r10,4)
0x00000001162bad18: mov    %r8d,0x2c(%rsi,%r10,4)  ;*iastore {reexecute=0 rethrow=0 return_oop=0}
                                             ; - AAAAAA.Main::hotstop@15 (line 21)

Which looks to me like a loop unrolling, a side from that, the method java.util.stream.Streams$RangeIntSpliterator::forEachRemaining showed up only in the assembly of the version with the flag.

Avila answered 22/12, 2020 at 20:32 Comment(1)
Thank you @Avila for your answer. The disabling option -XX:-AllowVectorizeOnDemand worked correctly and the output was what I expected. Thank you.Sleeve

© 2022 - 2024 — McMap. All rights reserved.