How to use Java 8 streams to find all values preceding a larger value?
Asked Answered
N

7

38

Use Case

Through some coding Katas posted at work, I stumbled on this problem that I'm not sure how to solve.

Using Java 8 Streams, given a list of positive integers, produce a list of integers where the integer preceded a larger value.

[10, 1, 15, 30, 2, 6]

The above input would yield:

[1, 15, 2]

since 1 precedes 15, 15 precedes 30, and 2 precedes 6.

Non-Stream Solution

public List<Integer> findSmallPrecedingValues(final List<Integer> values) {

    List<Integer> result = new ArrayList<Integer>();
    for (int i = 0; i < values.size(); i++) {
        Integer next = (i + 1 < values.size() ? values.get(i + 1) : -1);
        Integer current = values.get(i);
        if (current < next) {
            result.push(current);
        }
    }
    return result;
}

What I've Tried

The problem I have is I can't figure out how to access next in the lambda.

return values.stream().filter(v -> v < next).collect(Collectors.toList());

Question

  • Is it possible to retrieve the next value in a stream?
  • Should I be using map and mapping to a Pair in order to access next?
Napolitano answered 7/5, 2015 at 0:8 Comment(1)
Take a look at https://mcmap.net/q/55720/-collect-successive-pairs-from-a-streamMoriahmoriarty
I
36

Using IntStream.range:

static List<Integer> findSmallPrecedingValues(List<Integer> values) {
    return IntStream.range(0, values.size() - 1)
        .filter(i -> values.get(i) < values.get(i + 1))
        .mapToObj(values::get)
        .collect(Collectors.toList());
}

It's certainly nicer than an imperative solution with a large loop, but still a bit meh as far as the goal of "using a stream" in an idiomatic way.

Is it possible to retrieve the next value in a stream?

Nope, not really. The best cite I know of for that is in the java.util.stream package description:

The elements of a stream are only visited once during the life of a stream. Like an Iterator, a new stream must be generated to revisit the same elements of the source.

(Retrieving elements besides the current element being operated on would imply they could be visited more than once.)

We could also technically do it in a couple other ways:

  • Statefully (very meh).
  • Using a stream's iterator is technically still using the stream.
Ivon answered 7/5, 2015 at 0:26 Comment(3)
I don't think this is "meh" at all. I think this is the best way to do it. Many people think of streams as processing values in left-to-right order; this leads to sequential, state-mutation thinking. It's also prone to boundary errors. You can see this in the conventional looping approach posted by the OP. By contrast, a stream over list indexes as you've done follows a vector- or array-based style of programming. A computation is done for each index i, and you can easily prove the result is correct for all i. There is no mutation or order dependency, thus it parallelizes well. +1Declared
@StuartMarks It's meh in the sense that we aren't streaming over the elements. But you're right, that probably has more to do with the way I'm thinking about it. My "ideal" solution would also be non-capturing.Ivon
@StuartMarks Lol, they're called "streams" because they're supposed to embody the concept of "processing values in left-to-right order." The point is you use just a small amount of memory and don't hold all of the data in memory at once. (Like, when you "stream" a song.) Of course, Java 8 morphed the definition entirely to make it be "a stateless process that's easy to parallelize," which is why you find nothing ironic about "using streams" to index into a big List.Generalissimo
W
13

That's not a pure Java8, but recently I've published a small library called StreamEx which has a method exactly for this task:

// Find all numbers where the integer preceded a larger value.
Collection<Integer> numbers = Arrays.asList(10, 1, 15, 30, 2, 6);
List<Integer> res = StreamEx.of(numbers).pairMap((a, b) -> a < b ? a : null)
    .nonNull().toList();
assertEquals(Arrays.asList(1, 15, 2), res);

The pairMap operation internally implemented using custom spliterator. As a result you have quite clean code which does not depend on whether the source is List or anything else. Of course it works fine with parallel stream as well.

Committed a testcase for this task.

Woad answered 7/5, 2015 at 1:44 Comment(2)
That's good. Maybe you could add this as an answer to this question: #20470510Generalissimo
Why not. Wrote a new answer.Woad
V
10

It's not a one-liner (it's a two-liner), but this works:

List<Integer> result = new ArrayList<>();
values.stream().reduce((a,b) -> {if (a < b) result.add(a); return b;});

Rather than solving it by "looking at the next element", this solves it by "looking at the previous element, which reduce() give you for free. I have bent its intended usage by injecting a code fragment that populates the list based on the comparison of previous and current elements, then returns the current so the next iteration will see it as its previous element.


Some test code:

List<Integer> result = new ArrayList<>();
IntStream.of(10, 1, 15, 30, 2, 6).reduce((a,b) -> {if (a < b) result.add(a); return b;});
System.out.println(result);

Output:

[1, 15, 2]
Velour answered 7/5, 2015 at 1:43 Comment(6)
Now you just converted the stream to the sequential (which actually doesn't change anything as Collection.stream() already returns a sequential stream). I mean that you cannot take the advantage of parallel computations using this approach.Woad
Even when using a sequential stream, the streams library can still do a tree reduction instead of the linear reduction you're relying on here.Songful
@JeffreyBosboom I'm tempted to integrate some code that does a tree reduction in the sequential case, just to break everybody's non-associative reducer functions. Sorely tempted.Declared
@JeffreyBosboom, as far as I know currently tree reduction is not used in JDK for sequential streams (including Java 9 trunk code). But you're right, you cannot rely on this. My StreamEx library has a foldLeft method which works strictly left-to-right even for parallel streams.Woad
@tag that's my understanding - as a matter of practicality, the reduction is sequential, even for parallel streamsVelour
@Velour Is it sequantial even for parallel streams? I took your solution added parallel() to it and the result because the associativity is broken is always different...Schulze
O
6

The accepted answer works fine if either the stream is sequential or parallel but can suffer if the underlying List is not random access, due to multiple calls to get.

If your stream is sequential, you might roll this collector:

public static Collector<Integer, ?, List<Integer>> collectPrecedingValues() {
    int[] holder = {Integer.MAX_VALUE};
    return Collector.of(ArrayList::new,
            (l, elem) -> {
                if (holder[0] < elem) l.add(holder[0]);
                holder[0] = elem;
            },
            (l1, l2) -> {
                throw new UnsupportedOperationException("Don't run in parallel");
            });
}

and a usage:

List<Integer> precedingValues = list.stream().collect(collectPrecedingValues());

Nevertheless you could also implement a collector so thats works for sequential and parallel streams. The only thing is that you need to apply a final transformation, but here you have control over the List implementation so you won't suffer from the get performance.

The idea is to generate first a list of pairs (represented by a int[] array of size 2) which contains the values in the stream sliced by a window of size two with a gap of one. When we need to merge two lists, we check the emptiness and merge the gap of the last element of the first list with the first element of the second list. Then we apply a final transformation to filter only desired values and map them to have the desired output.

It might not be as simple as the accepted answer, but well it can be an alternative solution.

public static Collector<Integer, ?, List<Integer>> collectPrecedingValues() {
    return Collectors.collectingAndThen(
            Collector.of(() -> new ArrayList<int[]>(),
                    (l, elem) -> {
                        if (l.isEmpty()) l.add(new int[]{Integer.MAX_VALUE, elem});
                        else l.add(new int[]{l.get(l.size() - 1)[1], elem});
                    },
                    (l1, l2) -> {
                        if (l1.isEmpty()) return l2;
                        if (l2.isEmpty()) return l1;
                        l2.get(0)[0] = l1.get(l1.size() - 1)[1];
                        l1.addAll(l2);
                        return l1;
                    }), l -> l.stream().filter(arr -> arr[0] < arr[1]).map(arr -> arr[0]).collect(Collectors.toList()));
}

You can then wrap these two collectors in a utility collector method, check if the stream is parallel with isParallel an then decide which collector to return.

Odellodella answered 7/5, 2015 at 7:44 Comment(6)
When I looked up your profile, I had exactly 6666 profile views ... that must mean something ... I don't know if this answer (the code) is correct, I can't understand/follow your code, but by your explanation, which I do understand, I know it is the best. I was trying for two days the same, but my mind blew up. +1Devour
Actually your parallel version does not have any benefits. The actual algorithm which is specified in the last line will be executed sequentially only after everything else is completed. Even if you replace the stream() with parallelStream() there, I believe, this solution will be quite slow.Woad
Just profiled all the supplied versions on randomly created input for 10k, 100k and 1M numbers using 4 CPU box. Your parallel version is the slowest among all (2-3x slower than your sequential version depending on input size). Here's the gist with benchmark core and results (requires JMH and StreamEx library for StreamEx tests)Woad
@TagirValeev This is not surprising because you need to re-filter all the input at once. Your library is likely more performant because you implemented your own spliterator (which requires also more code than my 15 lines solution), so I wouldn't call this a fair comparison :-), although if you need performance this is probably the way to go. I'm not surprised that the parallel version is slower because of the collectingAndThen operation. I really don't have the time to test now, but compared to the actual answer I'm curious to see the results if the input was a LinkedList.Odellodella
Surely the accepted answer will perform poorly for lists not supporting random access and will not work at all for any other stream source (for example, BufferedReader, concatenation of streams, etc.). Using my library you have two advantages: it works nicely with any stream source and parallelization really improves the speed. And the disadvantage: you have to use more code. If fastest and flexible solution were also the short one I would not write this library at all :-)Woad
@TagirValeev Would be interesting to put those benchmarks somewhere more visible on this Q&A. Not because we are the winners :*(, clearly java.util.stream is the winner.Ivon
E
4

If you're willing to use a third party library and don't need parallelism, then jOOλ offers SQL-style window functions as follows

System.out.println(
Seq.of(10, 1, 15, 30, 2, 6)
   .window()
   .filter(w -> w.lead().isPresent() && w.value() < w.lead().get())
   .map(w -> w.value())
   .toList()
);

Yielding

[1, 15, 2]

The lead() function accesses the next value in traversal order from the window.

Disclaimer: I work for the company behind jOOλ

Exemplum answered 6/1, 2016 at 23:14 Comment(1)
You can use Window::value instead of w -> w.value().Infinitude
D
2

You can achieve that by using a bounded queue to store elements which flows through the stream (which is basing on the idea which I described in detail here: Is it possible to get next element in the Stream?

Belows example first defines instance of BoundedQueue class which will store elements going through the stream (if you don't like idea of extending the LinkedList, refer to link mentioned above for alternative and more generic approach). Later you just examine the two subsequent elements - thanks to the helper class:

public class Kata {
  public static void main(String[] args) {
    List<Integer> input = new ArrayList<Integer>(asList(10, 1, 15, 30, 2, 6));

    class BoundedQueue<T> extends LinkedList<T> {
      public BoundedQueue<T> save(T curElem) {
        if (size() == 2) { // we need to know only two subsequent elements
          pollLast(); // remove last to keep only requested number of elements
        }

        offerFirst(curElem);
        return this;
      }

      public T getPrevious() {
        return (size() < 2) ? null : getLast();
      }

      public T getCurrent() {
        return (size() == 0) ? null : getFirst();
      }
    }

    BoundedQueue<Integer> streamHistory = new BoundedQueue<Integer>();

    final List<Integer> answer = input.stream()
      .map(i -> streamHistory.save(i))
      .filter(e -> e.getPrevious() != null)
      .filter(e -> e.getCurrent() > e.getPrevious())
      .map(e -> e.getPrevious())
      .collect(Collectors.toList());

    answer.forEach(System.out::println);
  }
}
Dejong answered 8/11, 2016 at 12:6 Comment(0)
D
-2
List<Integer> listOne= List.of(12,13,67,56,90,62);

listOne.stream().filter(val->val.intValue()>50).collect(Collectors.toList()).forEach(System.out::println);
Doornail answered 6/10, 2023 at 7:44 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.