Java 8 Stream mixing two elements
Asked Answered
S

7

9

I have a many objects of type Slot in an array list.

Slot class is as shown below-

Slot{
   int start;
   int end;
}

let the list of type List<Slot> be called slots. The slots are sorted based on start time. End time of one slot may be equal to start time of next slot, but they would never overlap.

Is there any possible way in which I can iterate over this list using Java 8 streams, and combine two slots if end time of one matches start time of next and output them into an ArrayList?

Skilling answered 24/9, 2015 at 21:43 Comment(4)
@Pshemo, "remember previous element" is not really an operation that works well for Streams. This problem would almost certainly be better solved using a straightforward loop.Antefix
@LouisWasserman Woops you are right. I missed "Java 8 streams" part (brain fart possibly caused by lack of proper tags in question - fixed now).Francoisefrancolin
You can probably do it using the reduce() method with U being another List<Slot>, but it's be a bit more convoluted that just doing it in a for loop, unless parallel processing is required.Dextrad
Is Stream API a requirement ? Have you take a look at Guava Ranges ?Grimm
G
5

Such scenario is perfectly supported by my free StreamEx library which enhances standard Stream API. There's an intervalMap intermediate operation which is capable to collapse several adjacent stream elements to the single element. Here's complete example:

// Slot class and sample data are taken from @Andreas answer
List<Slot> slots = Arrays.asList(new Slot(3, 5), new Slot(5, 7), 
                new Slot(8, 10), new Slot(10, 11), new Slot(11, 13));

List<Slot> result = StreamEx.of(slots)
        .intervalMap((s1, s2) -> s1.end == s2.start,
                     (s1, s2) -> new Slot(s1.start, s2.end))
        .toList();
System.out.println(result);
// Output: [3-7, 8-13]

The intervalMap method takes two parameters. The first is a BiPredicate accepting two adjacent elements from the input stream and returns true if they must be merged (here the condition is s1.end == s2.start). The second parameter is a BiFunction which takes the first and the last elements from the merged series and produces the resulting element.

Note that if you have, for example 100 adjacent slots which should be combined into one, this solution does not create 100 intermediate objects (like in @Misha's answer, which is nevertheless very interesting), it tracks first and last slot in the series immediately forgetting about intermediate onces. Of course this solution is parallel friendly. If you have many thousands of input slots, using .parallel() may improve the performance.

Note that current implementation will recreate the Slot even if it's not merged with anything. In this case the BinaryOperator receives the same Slot parameter twice. If you want to optimize this case, you can make additional check like s1 == s2 ? s1 : ...:

List<Slot> result = StreamEx.of(slots)
        .intervalMap((s1, s2) -> s1.end == s2.start,
                     (s1, s2) -> s1 == s2 ? s1 : new Slot(s1.start, s2.end))
        .toList();
Gales answered 25/9, 2015 at 2:17 Comment(1)
You shamed me into a proper impl...Guiltless
E
5

Since these types of questions come up a lot, I thought it might be an interesting exercise to write a collector that would group adjacent elements by a predicate.

Assuming we can add combining logic to the Slot class

boolean canCombine(Slot other) {
    return this.end == other.start;
}

Slot combine(Slot other) {
    if (!canCombine(other)) {
        throw new IllegalArgumentException();
    }
    return new Slot(this.start, other.end);
}

the groupingAdjacent collector can then be used as follows:

List<Slot> combined = slots.stream()
    .collect(groupingAdjacent(
        Slot::canCombine,         // test to determine if two adjacent elements go together
        reducing(Slot::combine),  // collector to use for combining the adjacent elements
        mapping(Optional::get, toList())  // collector to group up combined elements
    ));

Alternatively, second parameter can be collectingAndThen(reducing(Slot::combine), Optional::get) and the third argument be toList()

Here's the source for groupingAdjacent. It can handle null elements and is parallel-friendly. With a bit more hassle, a similar thing can be done with a Spliterator.

public static <T, AI, I, AO, R> Collector<T, ?, R> groupingAdjacent(
        BiPredicate<? super T, ? super T> keepTogether,
        Collector<? super T, AI, ? extends I> inner,
        Collector<I, AO, R> outer
) {
    AI EMPTY = (AI) new Object();

    // Container to accumulate adjacent possibly null elements.  Adj can be in one of 3 states:
    // - Before first element: curGrp == EMPTY
    // - After first element but before first group boundary: firstGrp == EMPTY, curGrp != EMPTY
    // - After at least one group boundary: firstGrp != EMPTY, curGrp != EMPTY
    class Adj {

        T first, last;     // first and last elements added to this container
        AI firstGrp = EMPTY, curGrp = EMPTY;
        AO acc = outer.supplier().get();  // accumlator for completed groups

        void add(T t) {
            if (curGrp == EMPTY) /* first element */ {
                first = t;
                curGrp = inner.supplier().get();
            } else if (!keepTogether.test(last, t)) /* group boundary */ {
                addGroup(curGrp);
                curGrp = inner.supplier().get();
            }
            inner.accumulator().accept(curGrp, last = t);
        }

        void addGroup(AI group) /* group can be EMPTY, in which case this should do nothing */ {
            if (firstGrp == EMPTY) {
                firstGrp = group;
            } else if (group != EMPTY) {
                outer.accumulator().accept(acc, inner.finisher().apply(group));
            }
        }

        Adj merge(Adj other) {
            if (other.curGrp == EMPTY) /* other is empty */ {
                return this;
            } else if (this.curGrp == EMPTY) /* this is empty */ {
                return other;
            } else if (!keepTogether.test(last, other.first)) /* boundary between this and other*/ {
                addGroup(this.curGrp);
                addGroup(other.firstGrp);
            } else if (other.firstGrp == EMPTY) /* other container is single-group. prepend this.curGrp to other.curGrp*/ {
                other.curGrp = inner.combiner().apply(this.curGrp, other.curGrp);
            } else /* other Adj contains a boundary.  this.curGrp+other.firstGrp form a complete group. */ {
                addGroup(inner.combiner().apply(this.curGrp, other.firstGrp));
            }
            this.acc = outer.combiner().apply(this.acc, other.acc);
            this.curGrp = other.curGrp;
            this.last = other.last;
            return this;
        }

        R finish() {
            AO combined = outer.supplier().get();
            if (curGrp != EMPTY) {
                addGroup(curGrp);
                assert firstGrp != EMPTY;
                outer.accumulator().accept(combined, inner.finisher().apply(firstGrp));
            }
            return outer.finisher().apply(outer.combiner().apply(combined, acc));
        }
    }
    return Collector.of(Adj::new, Adj::add, Adj::merge, Adj::finish);
}
Ebneter answered 25/9, 2015 at 1:31 Comment(0)
G
5

Such scenario is perfectly supported by my free StreamEx library which enhances standard Stream API. There's an intervalMap intermediate operation which is capable to collapse several adjacent stream elements to the single element. Here's complete example:

// Slot class and sample data are taken from @Andreas answer
List<Slot> slots = Arrays.asList(new Slot(3, 5), new Slot(5, 7), 
                new Slot(8, 10), new Slot(10, 11), new Slot(11, 13));

List<Slot> result = StreamEx.of(slots)
        .intervalMap((s1, s2) -> s1.end == s2.start,
                     (s1, s2) -> new Slot(s1.start, s2.end))
        .toList();
System.out.println(result);
// Output: [3-7, 8-13]

The intervalMap method takes two parameters. The first is a BiPredicate accepting two adjacent elements from the input stream and returns true if they must be merged (here the condition is s1.end == s2.start). The second parameter is a BiFunction which takes the first and the last elements from the merged series and produces the resulting element.

Note that if you have, for example 100 adjacent slots which should be combined into one, this solution does not create 100 intermediate objects (like in @Misha's answer, which is nevertheless very interesting), it tracks first and last slot in the series immediately forgetting about intermediate onces. Of course this solution is parallel friendly. If you have many thousands of input slots, using .parallel() may improve the performance.

Note that current implementation will recreate the Slot even if it's not merged with anything. In this case the BinaryOperator receives the same Slot parameter twice. If you want to optimize this case, you can make additional check like s1 == s2 ? s1 : ...:

List<Slot> result = StreamEx.of(slots)
        .intervalMap((s1, s2) -> s1.end == s2.start,
                     (s1, s2) -> s1 == s2 ? s1 : new Slot(s1.start, s2.end))
        .toList();
Gales answered 25/9, 2015 at 2:17 Comment(1)
You shamed me into a proper impl...Guiltless
D
4

You can do it using the reduce() method with U being another List<Slot>, but it's a lot more convoluted than just doing it in a for loop, unless parallel processing is required.

See end of answer for test setup.

Here is the for loop implementation:

List<Slot> mixed = new ArrayList<>();
Slot last = null;
for (Slot slot : slots)
    if (last == null || last.end != slot.start)
        mixed.add(last = slot);
    else
        mixed.set(mixed.size() - 1, last = new Slot(last.start, slot.end));

Output

[3-5, 5-7, 8-10, 10-11, 11-13]
[3-7, 8-13]

Here is the stream reduce implementation:

List<Slot> mixed = slots.stream().reduce((List<Slot>)null, (list, slot) -> {
    System.out.println("accumulator.apply(" + list + ", " + slot + ")");
    if (list == null) {
        List<Slot> newList = new ArrayList<>();
        newList.add(slot);
        return newList;
    }
    Slot last = list.get(list.size() - 1);
    if (last.end != slot.start)
        list.add(slot);
    else
        list.set(list.size() - 1, new Slot(last.start, slot.end));
    return list;
}, (list1, list2) -> {
    System.out.println("combiner.apply(" + list1 + ", " + list2 + ")");
    if (list1 == null)
        return list2;
    if (list2 == null)
        return list1;
    Slot lastOf1 = list1.get(list1.size() - 1);
    Slot firstOf2 = list2.get(0);
    if (lastOf1.end != firstOf2.start)
        list1.addAll(list2);
    else {
        list1.set(list1.size() - 1, new Slot(lastOf1.start, firstOf2.end));
        list1.addAll(list2.subList(1, list2.size()));
    }
    return list1;
});

Output

accumulator.apply(null, 3-5)
accumulator.apply([3-5], 5-7)
accumulator.apply([3-7], 8-10)
accumulator.apply([3-7, 8-10], 10-11)
accumulator.apply([3-7, 8-11], 11-13)
[3-5, 5-7, 8-10, 10-11, 11-13]
[3-7, 8-13]

Changing it for parallel (multi-threaded) processing:

List<Slot> mixed = slots.stream().parallel().reduce(...

Output

accumulator.apply(null, 8-10)
accumulator.apply(null, 3-5)
accumulator.apply(null, 10-11)
accumulator.apply(null, 11-13)
combiner.apply([10-11], [11-13])
accumulator.apply(null, 5-7)
combiner.apply([3-5], [5-7])
combiner.apply([8-10], [10-13])
combiner.apply([3-7], [8-13])
[3-5, 5-7, 8-10, 10-11, 11-13]
[3-7, 8-13]

Caveat

If slots is an empty list, the for loop version results in an empty list, and the streams version results is a null value.


Test Setup

All the above code used the following Slot class:

class Slot {
    int start;
    int end;
    Slot(int start, int end) {
        this.start = start;
        this.end = end;
    }
    @Override
    public String toString() {
        return this.start + "-" + this.end;
    }
}

The slots variable was defined as:

List<Slot> slots = Arrays.asList(new Slot(3, 5), new Slot(5, 7), new Slot(8, 10), new Slot(10, 11), new Slot(11, 13));

Both slots and the result mixed are printed using:

System.out.println(slots);
System.out.println(mixed);
Dextrad answered 24/9, 2015 at 22:45 Comment(2)
I haven't analyzed it, but according to @Guiltless it should be much simpler.Aedile
@Aedile You are right, Bohemian's version is much smaller, but it doesn't work for parallel processing. More than half my code is to support parallel processing.Dextrad
G
3

It's a two-liner:

List<Slot> condensed = new LinkedList<>();
slots.stream().reduce((a,b) -> {if (a.end == b.start) return new Slot(a.start, b.end); 
  condensed.add(a); return b;}).ifPresent(condensed::add);

If the fields of slot are not visible, you will have to change a.end to a.getEnd(), etc


Some test code with some edge cases:

List<List<Slot>> tests = Arrays.asList(
        Arrays.asList(new Slot(3, 5), new Slot(5, 7), new Slot(8, 10), new Slot(10, 11), new Slot(11, 13)),
        Arrays.asList(new Slot(3, 5), new Slot(5, 7), new Slot(8, 10), new Slot(10, 11), new Slot(12, 13)),
        Arrays.asList(new Slot(3, 5), new Slot(5, 7)),
        Collections.emptyList());
for (List<Slot> slots : tests) {
    List<Slot> condensed = new LinkedList<>();
    slots.stream().reduce((a, b) -> {if (a.end == b.start) return new Slot(a.start, b.end);
        condensed.add(a); return b;}).ifPresent(condensed::add);
    System.out.println(condensed);
}

Output:

[3-7, 8-13]
[3-7, 8-11, 12-13]
[3-7]
[]
Guiltless answered 25/9, 2015 at 3:57 Comment(3)
As always it should be noted that such solution violates the contract of the reduce method and is not parallel-friendly. Such solution would be more credible were foldLeft(accumulator) implemented in Stream API...Gales
@Tagir in this case maintaining order is critical, so it's never going to be parallelizable. But the "violation" trade off seems a good deal to simplify the task down to a single relatively trivial lambda. I like your solution though - I hadn't noticed intervalMap() before - nice!Guiltless
Actually all other solutions (not only mine) are parallel-friendly. Parallel stream still can preserve order, you just need to be able to split the task into independent parts (like first half of list, then second half), process them independently, then join the results together. My solution, though the most flexible as it works as intermediate operation, actually has really big piece of code under the hood. Without third-party libraries Misha and Holger solutions are fine, though they terminate the stream. Your one probably wins in "best abuse of the rules" category :-)Gales
G
3

A clean (parallel safe) solution that doesn't require any new methods:

List<Slot> condensed = slots.stream().collect(LinkedList::new,
  (l, s) -> l.add(l.isEmpty() || l.getLast().end != s.start ?
    s : new Slot(l.removeLast().start, s.end)),
  (l, l2) -> {if (!l.isEmpty() && !l2.isEmpty() && l.getLast().end == l2.getFirst().start) {
    l.add(new Slot(l.removeLast().start, l2.removeFirst().end));} l.addAll(l2);});

This uses the more appropriate list implementation LinkedList to simplify removing and accessing the last element of the list when merging Slots.


List<List<Slot>> tests = Arrays.asList(
            Arrays.asList(new Slot(3, 5), new Slot(5, 7), new Slot(8, 10), new Slot(10, 11), new Slot(11, 13)),
            Arrays.asList(new Slot(3, 5), new Slot(5, 7), new Slot(8, 10), new Slot(10, 11), new Slot(12, 13)),
            Arrays.asList(new Slot(3, 5), new Slot(5, 7)),
            Collections.emptyList());
for (List<Slot> slots : tests) {
    List<Slot> condensed = slots.stream().collect(LinkedList::new,
      (l, s) -> l.add(l.isEmpty() || l.getLast().end != s.start ?
        s : new Slot(l.removeLast().start, s.end)),
      (l, l2) -> {if (!l.isEmpty() && !l2.isEmpty() && l.getLast().end == l2.getFirst().start) {
        l.add(new Slot(l.removeLast().start, l2.removeFirst().end));} l.addAll(l2);});
    System.out.println(condensed);
}

Output:

[[3, 7], [8, 13]]
[[3, 7], [8, 11], [12, 13]]
[[3, 7]]
[]
Guiltless answered 25/9, 2015 at 15:10 Comment(0)
C
2

If you add the following method to your Slot class

public boolean join(Slot s) {
    if(s.start != end)
        return false;
    end = s.end;
    return true;
}

you can perform the entire operation using the standard API the following way

List<Slot> result = slots.stream().collect(ArrayList::new,
    (l, s)-> { if(l.isEmpty() || !l.get(l.size()-1).join(s)) l.add(s); },
    (l1, l2)-> l1.addAll(
        l1.isEmpty()||l2.isEmpty()||!l1.get(l1.size()-1).join(l2.get(0))?
        l2: l2.subList(1, l2.size()))
);

This obeys the contract of the API (unlike abusing reduce) and will therefore works seamlessly with parallel streams (though you need really large source lists to benefit from parallel execution).


However, the solution above uses in-place joining of Slots which is only acceptable if you don’t need the source list/items anymore. Otherwise, or if you use immutable Slot instances only, you have to create new Slot instance representing joint slots.

One possible solution looks like

BiConsumer<List<Slot>,Slot> joinWithList=(l,s) -> {
    if(!l.isEmpty()) {
        Slot old=l.get(l.size()-1);
        if(old.end==s.start) {
            l.set(l.size()-1, new Slot(old.start, s.end));
            return;
        }
    }
    l.add(s);
};
List<Slot> result = slots.stream().collect(ArrayList::new, joinWithList,
    (l1, l2)-> {
        if(!l2.isEmpty()) {
            joinWithList.accept(l1, l2.get(0));
            l1.addAll(l2.subList(1, l2.size()));
        }
    }
);
Crossways answered 25/9, 2015 at 10:22 Comment(0)
T
0

Here's a library function that can be used to help make neighbor-combining stream operations:

/**
 * Returns a {@link Collector} that returns a list which can combine elements of
 * the incoming stream into merged elements.  (Call stream() again on the list to
 * continue stream processing.)
 * @param testNeighbors Predicate that tests the preceding and next elements, returning
 *  true if they should be combined
 * @param combineNeighbors Operator that combines two neighboring items which should
 *  be combined
 * @param <R> type of the elements
 */
public static <R> Collector<R,LinkedList<R>,LinkedList<R>> combineNeighborsCollector(
        BiPredicate<R,R> testNeighbors,
        BinaryOperator<R> combineNeighbors) {
    return Collector.of(
        LinkedList::new,
        (LinkedList<R> list1, R next) -> {
            // Can't combine if list is empty.
            if (! list1.isEmpty() && testNeighbors.test(list1.getLast(), next)) {
                R lCombined = combineNeighbors.apply(list1.getLast(), next);
                list1.removeLast();
                list1.add(lCombined);
            } else {
                list1.add(next);
            }
        },
        (LinkedList<R> list1, LinkedList<R> list2) -> {
            // Can't combine if either list is empty.
            while (! list1.isEmpty() && ! list2.isEmpty()
                    && testNeighbors.test(list1.getLast(), list2.getFirst())) {
                R last = list1.getLast();
                R next = list2.getFirst();
                list1.removeLast();
                list2.removeFirst();
                list1.add(combineNeighbors.apply(last, next));
            }
            list1.addAll(list2);
            return list1;
        });
}

Testing it:

record Slot(int start, int end) {
    @Override public String toString() { return "[%d,%d]".formatted(start,end); }
}

public static final Collector<Slot, LinkedList<Slot>, LinkedList<Slot>>
SLOT_COMBINING_COLLECTOR = 
    combineNeighborsCollector(
        (Slot slot1, Slot slot2) -> slot1.end == slot2.start, // Test neighbors.
        (Slot slot1, Slot slot2) -> new Slot(slot1.start, slot2.end)); // Combine neighbors.

public static void main(String[] args) {
    System.out.println(Stream.of(
            new Slot(1, 3), new Slot(4, 7), new Slot(7, 10), new Slot(10,12), new Slot(20, 25))
        .collect(SLOT_COMBINING_COLLECTOR)
        .stream()
        .map(Objects::toString)
        .collect(Collectors.joining(", ")));
}

prints: [1,3], [4,12], [20,25]

Testate answered 27/10, 2023 at 23:40 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.