Is Stream.findAny a short-circuit operation?
Asked Answered
D

5

5

Consider this code

Object found = collection.stream()
    .filter( s -> myPredicate1(s))
    .filter( s -> myPredicate2(s))
    .findAny()

Will it process entire stream, and call both myPredicate1 and myPredicate2 for all elements of the collection? Or will as many predicates be called as are needed to actually find the value?

Dibucaine answered 25/5, 2017 at 12:3 Comment(4)
You can find the answer in the docs: docs.oracle.com/javase/8/docs/api/java/util/stream/….Raoul
A more appropriate title might be: "Is filter + findAny still a short-circuit operation?".Pyrogenous
@JornVernee not really, it would work for all kinds of stream operations. As soon as 1st value reaches the findAny() it is returned and the stream ceases to operate.Dibucaine
I was caught by a second filter not firing. An intermediate peek() created a scratch list for use by the second predicate. Why did the scratch list never have more than one value? Building a scratch list using peek() is stateful and non-deterministic: unhelpful!Genital
D
8

Yes it is, as the Stream.findAny() documentation states:

This is a short-circuiting terminal operation.

It's a common misconception that objects in stream are "pushed" towards consuming operation. It's actually the other way around - the consuming operation pulls each element.

For sequential streams only as many predicates will be called as are needed to find matching value. Parallel streams may execute more predicates, but will also stop execution as soon as an element is found.

public class StreamFilterLazyTest {

  static int stI = 0;

  static class T { 

    public T() {
      super();
      this.i = ++stI;
    }

    int i;

    int getI() {
      System.err.println("getI: "+i);
      return i;
    }
  }

  public static void main(String[] args) {
    T[] arr = {new T(), new T(), new T(), new T(), new T(), new T(), new T(), new T(), new T(), new T()};
    Optional<T> found = Arrays.stream(arr).filter(t -> t.getI() == 3).findAny();
    System.out.println("Found: "+found.get().getI());
  }
}

will print:

getI: 1
getI: 2
getI: 3
Found: 3
Dibucaine answered 25/5, 2017 at 12:3 Comment(3)
Well, actually, elements are pushed towards the consuming operation in most cases, but in the case of short-circuiting operations, it happens one-at-a-time, on the stream’s request, via tryAdvance, whereas an operation knowing that it will consume all elements may use forEachRemaining instead. The important point is the laziness of intermediate operations.Malaco
@Malaco they are indeed pushed in terms of API used, there are Consumers used instead of Suppliers. But no operation will be performed unless there is a final consuming operation, hence my note about pulling not pushing. I've tried - but failed - to rephrase it. If you can think of a way to rephrase it so that it is both understandable and technically 100% correct, please do so.Dibucaine
Your description of the terminal operation as 'pulling' is very useful and is a much better way of describing than being 'pushed' from some source.Cumulate
C
4

The javadoc for findAny() states:

"This is a short-circuiting terminal operation."

"The behavior of this operation is explicitly nondeterministic; it is free to select any element in the stream. This is to allow for maximal performance in parallel operations ..."

This means that findAny() on a sequential stream will only "pull" enough elements to find the first one. On a parallel stream, it could pull more than enough, depending on the implementation.

The package javadoc also states:

"Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed."

This means that the filter() predicates only occur when the findAny() terminal pulls them.

In short:

Q: Is filter + findAny still a short-circuit operation?

A: Yes.

Culex answered 25/5, 2017 at 12:24 Comment(0)
O
2

Well it does not matter if sequential or parallel streams are used, they are still going to traverse as many elements as are required to find the first that matches. It might be different if you use findFirst and you have a Stream made of an ordered collection.

findFirst in this case has to preserver the order.

In this case, due to parallelism, the second, then third elements might be processed before the first, but still only the first will be returned.

Osteotomy answered 25/5, 2017 at 12:8 Comment(1)
And some elements may be processed after 3rd is found. It's still "lazy", though.Dibucaine
B
0

Stream#findAny is a short-circuiting terminal operation. it will visit Predicates to matching & short-circuited one by one since Stream#filter return a new stream each time.

Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.

As @Holger mentioned in comment that it can makes filters be short-circuited as below:

if(predicate1.test(value) && predicate2.test(value)){
 ....
}

Test

Iterator<Predicate<Integer>> predicates = Stream.<Predicate<Integer>>of(
        it -> false,
        it -> {
            throw new AssertionError("Can't be short-circuited!");
        }
).iterator();

Predicate<Integer> expectsToBeShortCircuited = it -> predicates.next().test(it);

Stream.of(1).filter(expectsToBeShortCircuited).filter(expectsToBeShortCircuited)
      //                                       |
      //                                       |
      //                      here is short-circuited since the stream is empty now
      .findAny();
Barb answered 26/5, 2017 at 14:9 Comment(1)
When you chain two filter operations, it has the semantic of an && operator, so it makes no sense to compare it with an || operator. If your first predicate is it -> false, you will see that it does work short-circuited, the same way as if(predicate1.test(value) && predicate2.test(value)) would do.Malaco
A
0

You can use peek to verify this

== Sequential ==

Alpha1 Alpha2 Beta1 Beta2 Gamma1 Gamma2 Dolphin1 Fargo1 Fargo2 Found: Fargo Applications: 9

== Parallel ==
Arnold1 Jim1 Loke1 Alpha1 Mustard1 Lenny1 Mustard2 Mark1 Alpha2 Mark2 Beta1 Beta2 Gamma1 Fargo1 Gamma2 Dolphin1 Fargo2 Found: Fargo Applications: 17

YMMV depending on number of cores etc.

Produced by below

package test.test;

import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class Snippet {

    static AtomicInteger predicateApplications;

    public static void main(String arr[]) {
        System.out.println("== Sequential == \n");
        sequential();
        System.out.println(" == Parallel == \n");
        parallel();
    }

    private static void sequential() {
        Stream<String> stream = Stream.of("Alpha", "Beta", "Gamma", "Dolphin", "Fargo", "Mustard", "Lenny", "Mark",
                "Jim", "Arnold", "Loke");
        execute(stream);
    }

    private static void parallel() {
        Stream<String> parallelStream = Stream
                .of("Alpha", "Beta", "Gamma", "Dolphin", "Fargo", "Mustard", "Lenny", "Mark", "Jim", "Arnold", "Loke")
                .parallel();
        execute(parallelStream);
    }

    private static void execute(Stream<String> stream) {
        predicateApplications = new AtomicInteger(0);

        Optional<String> findAny = stream.peek(s -> print(s + "1")).filter(s -> s.contains("a"))
                .peek(s -> print(s + "2")).filter(s -> s.startsWith("F")).findAny();

        String found = findAny.orElse("NONE");
        System.out.println("\nFound: " + found);
        System.out.println("Applications: " + predicateApplications.get());
    }

    private static void print(String s) {
        System.out.print(s + " ");
        predicateApplications.incrementAndGet();
    }

}
Acoustician answered 30/5, 2017 at 10:28 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.