Splitting List into sublists along elements
Asked Answered
E

14

70

I have this list (List<String>):

["a", "b", null, "c", null, "d", "e"]

And I'd like something like this:

[["a", "b"], ["c"], ["d", "e"]]

In other words I want to split my list in sublists using the null value as separator, in order to obtain a list of lists (List<List<String>>). I'm looking for a Java 8 solution. I've tried with Collectors.partitioningBy but I'm not sure it is what I'm looking for. Thanks!

Electronegative answered 17/3, 2015 at 9:52 Comment(13)
Certainly not an answer, but in Python, you'd do it like this: [list(v) for k, v in groupby(lst, key=lambda x: x is not None) if k]. There is Collectors.groupingBy, but to be honest I don't understand how it works...Cowardly
Python is teasing me more and more every day :) Collectors.groupingBy maps the list by a property: for example grouping ["a", "b", "aa", "aaa"] by String::length would produce {1:["a", "b"], 2:["aa"], 3:["aaa"]}... I don't know if it could be used in my caseElectronegative
Are you supposed to do it in Java 8? Because every solutions I get are poor adaptations of some Java 7 solutions stream().forEach instead of Java 7 forEach.Frolick
For example, you can create a List<List<String>> and populate it in a list.stream().forEach() but it suppose that you get the elements in a sequence and it cannot be called in parallel. In brief, you lose every pros of Streams without having the clarity of Java 7.Frolick
@Electronegative Yes, it seems Collectors.groupingBy works quite differently than Python's itertools.groupby and is more like Collectors.partitioningBy, just with more than two groups, i.e. it would put all the non-null strings in one bucket.Cowardly
I already solved this problem in a naive Java way, I'm trying to learn more about Java 8 streamsElectronegative
The problems comes form the fact that the sequence is highly important here but Streams are designed to work on independent pieces of data. This is why I think that there is no solution that will not be equivalent to a Java 7's foreach. My solution converts the problem of "Split List" into a problem of "Split String" for which Java natively implemented a solution.Frolick
I hope this goes without saying but, for production code, please keep it simple and just use for-loops. This is a great training exercise question. But it worries me that some of the solutions posted below could one day end up in production code (they got a high-score on Stackoverflow so they must be the right way to do it!). Some poor sap will tear their hair out trying to understand this!Glomerulonephritis
ArnaudDenoyelle I don't agree with you, Alexis C. showed you that it is possible (I'm testing his solution) @Mr Spoon yes, this is just a training exerciseElectronegative
@Electronegative I do not say that it is not possible. I am saying that it is equivalent to a foreach loop, in more complex. I had the same idea but when I realized that, I did not post the solution.Frolick
Possible duplicate of #28363823Kisangani
See also Java 8 Stream with batch processingAfferent
@DavidLavender are you saying we should never use streams and only use for-loops? What? Programming has many more paradigms than imperative. Even the Java language designers claim streams is the objectively better solution.Atheroma
C
32

The only solution I come up with for the moment is by implementing your own custom collector.

Before reading the solution, I want to add a few notes about this. I took this question more as a programming exercise, I'm not sure if it can be done with a parallel stream.

So you have to be aware that it'll silently break if the pipeline is run in parallel.

This is not a desirable behavior and should be avoided. This is why I throw an exception in the combiner part (instead of (l1, l2) -> {l1.addAll(l2); return l1;}), as it's used in parallel when combining the two lists, so that you have an exception instead of a wrong result.

Also this is not very efficient due to list copying (although it uses a native method to copy the underlying array).

So here's the collector implementation:

private static Collector<String, List<List<String>>, List<List<String>>> splitBySeparator(Predicate<String> sep) {
    final List<String> current = new ArrayList<>();
    return Collector.of(() -> new ArrayList<List<String>>(),
        (l, elem) -> {
            if (sep.test(elem)) {
                l.add(new ArrayList<>(current));
                current.clear();
            }
            else {
                current.add(elem);
            }
        },
        (l1, l2) -> {
            throw new RuntimeException("Should not run this in parallel");
        },
        l -> {
            if (current.size() != 0) {
                l.add(current);
                return l;
            }
        );
}

and how to use it:

List<List<String>> ll = list.stream().collect(splitBySeparator(Objects::isNull));

Output:

[[a, b], [c], [d, e]]


As the answer of Joop Eggen is out, it appears that it can be done in parallel (give him credit for that!). With that it reduces the custom collector implementation to:
private static Collector<String, List<List<String>>, List<List<String>>> splitBySeparator(Predicate<String> sep) {
    return Collector.of(() -> new ArrayList<List<String>>(Arrays.asList(new ArrayList<>())),
                        (l, elem) -> {if(sep.test(elem)){l.add(new ArrayList<>());} else l.get(l.size()-1).add(elem);},
                        (l1, l2) -> {l1.get(l1.size() - 1).addAll(l2.remove(0)); l1.addAll(l2); return l1;});
}

which let the paragraph about parallelism a bit obsolete, however I let it as it can be a good reminder.


Note that the Stream API is not always a substitute. There are tasks that are easier and more suitable using the streams and there are tasks that are not. In your case, you could also create a utility method for that:

private static <T> List<List<T>> splitBySeparator(List<T> list, Predicate<? super T> predicate) {
    final List<List<T>> finalList = new ArrayList<>();
    int fromIndex = 0;
    int toIndex = 0;
    for(T elem : list) {
        if(predicate.test(elem)) {
            finalList.add(list.subList(fromIndex, toIndex));
            fromIndex = toIndex + 1;
        }
        toIndex++;
    }
    if(fromIndex != toIndex) {
        finalList.add(list.subList(fromIndex, toIndex));
    }
    return finalList;
}

and call it like List<List<String>> list = splitBySeparator(originalList, Objects::isNull);.

It can be improved for checking edge-cases.

Chigetai answered 17/3, 2015 at 10:30 Comment(1)
Plus one for documenting limitations and for taking a Predicate as a parameter.Inflect
I
82

Although there are several answers already, and an accepted answer, there are still a couple points missing from this topic. First, the consensus seems to be that solving this problem using streams is merely an exercise, and that the conventional for-loop approach is preferable. Second, the answers given thus far have overlooked an approach using array or vector-style techniques that I think improves the streams solution considerably.

First, here's a conventional solution, for purposes of discussion and analysis:

static List<List<String>> splitConventional(List<String> input) {
    List<List<String>> result = new ArrayList<>();
    int prev = 0;

    for (int cur = 0; cur < input.size(); cur++) {
        if (input.get(cur) == null) {
            result.add(input.subList(prev, cur));
            prev = cur + 1;
        }
    }
    result.add(input.subList(prev, input.size()));

    return result;
}

This is mostly straightforward but there's a bit of subtlety. One point is that a pending sublist from prev to cur is always open. When we encounter null we close it, add it to the result list, and advance prev. After the loop we close the sublist unconditionally.

Another observation is that this is a loop over indexes, not over the values themselves, thus we use an arithmetic for-loop instead of the enhanced "for-each" loop. But it suggests that we can stream using the indexes to generate subranges instead of streaming over values and putting the logic into the collector (as was done by Joop Eggen's proposed solution).

Once we've realized that, we can see that each position of null in the input is the delimiter for a sublist: it's the right end of the sublist to the left, and it (plus one) is the left end of the sublist to the right. If we can handle the edge cases, it leads to an approach where we find the indexes at which null elements occur, map them to sublists, and collect the sublists.

The resulting code is as follows:

static List<List<String>> splitStream(List<String> input) {
    int[] indexes = Stream.of(IntStream.of(-1),
                              IntStream.range(0, input.size())
                                       .filter(i -> input.get(i) == null),
                              IntStream.of(input.size()))
                          .flatMapToInt(s -> s)
                          .toArray();

    return IntStream.range(0, indexes.length-1)
                    .mapToObj(i -> input.subList(indexes[i]+1, indexes[i+1]))
                    .collect(toList());
}

Getting the indexes at which null occurs is pretty easy. The stumbling block is adding -1 at the left and size at the right end. I've opted to use Stream.of to do the appending and then flatMapToInt to flatten them out. (I tried several other approaches but this one seemed like the cleanest.)

It's a bit more convenient to use arrays for the indexes here. First, the notation for accessing an array is nicer than for a List: indexes[i] vs. indexes.get(i). Second, using an array avoids boxing.

At this point, each index value in the array (except for the last) is one less than the beginning position of a sublist. The index to its immediate right is the end of the sublist. We simply stream over the array and map each pair of indexes into a sublist and collect the output.

Discussion

The streams approach is slightly shorter than the for-loop version, but it's denser. The for-loop version is familiar, because we do this stuff in Java all the time, but if you're not already aware of what this loop is supposed to be doing, it's not obvious. You might have to simulate a few loop executions before you figure out what prev is doing and why the open sublist has to be closed after the end of the loop. (I initially forgot to have it, but I caught this in testing.)

The streams approach is, I think, easier to conceptualize what's going on: get a list (or an array) that indicates the boundaries between sublists. That's an easy streams two-liner. The difficulty, as I mentioned above, is finding a way to tack the edge values onto the ends. If there were a better syntax for doing this, e.g.,

    // Java plus pidgin Scala
    int[] indexes =
        [-1] ++ IntStream.range(0, input.size())
                         .filter(i -> input.get(i) == null) ++ [input.size()];

it would make things a lot less cluttered. (What we really need is array or list comprehension.) Once you have the indexes, it's a simple matter to map them into actual sublists and collect them into the result list.

And of course this is safe when run in parallel.

UPDATE 2016-02-06

Here's a nicer way to create the array of sublist indexes. It's based on the same principles, but it adjusts the index range and adds some conditions to the filter to avoid having to concatenate and flatmap the indexes.

static List<List<String>> splitStream(List<String> input) {
    int sz = input.size();
    int[] indexes =
        IntStream.rangeClosed(-1, sz)
                 .filter(i -> i == -1 || i == sz || input.get(i) == null)
                 .toArray();

    return IntStream.range(0, indexes.length-1)
                    .mapToObj(i -> input.subList(indexes[i]+1, indexes[i+1]))
                    .collect(toList());
}

UPDATE 2016-11-23

I co-presented a talk with Brian Goetz at Devoxx Antwerp 2016, "Thinking In Parallel" (video) that featured this problem and my solutions. The problem presented there is a slight variation that splits on "#" instead of null, but it's otherwise the same. In the talk, I mentioned that I had a bunch of unit tests for this problem. I've appended them below, as a standalone program, along with my loop and streams implementations. An interesting exercise for readers is to run solutions proposed in other answers against the test cases I've provided here, and to see which ones fail and why. (The other solutions will have to be adapted to split based on a predicate instead of splitting on null.)

import java.util.*;
import java.util.function.*;
import java.util.stream.*;

import static java.util.Arrays.asList;

public class ListSplitting {
    static final Map<List<String>, List<List<String>>> TESTCASES = new LinkedHashMap<>();
    static {
        TESTCASES.put(asList(),
                  asList(asList()));
        TESTCASES.put(asList("a", "b", "c"),
                  asList(asList("a", "b", "c")));
        TESTCASES.put(asList("a", "b", "#", "c", "#", "d", "e"),
                  asList(asList("a", "b"), asList("c"), asList("d", "e")));
        TESTCASES.put(asList("#"),
                  asList(asList(), asList()));
        TESTCASES.put(asList("#", "a", "b"),
                  asList(asList(), asList("a", "b")));
        TESTCASES.put(asList("a", "b", "#"),
                  asList(asList("a", "b"), asList()));
        TESTCASES.put(asList("#"),
                  asList(asList(), asList()));
        TESTCASES.put(asList("a", "#", "b"),
                  asList(asList("a"), asList("b")));
        TESTCASES.put(asList("a", "#", "#", "b"),
                  asList(asList("a"), asList(), asList("b")));
        TESTCASES.put(asList("a", "#", "#", "#", "b"),
                  asList(asList("a"), asList(), asList(), asList("b")));
    }

    static final Predicate<String> TESTPRED = "#"::equals;

    static void testAll(BiFunction<List<String>, Predicate<String>, List<List<String>>> f) {
        TESTCASES.forEach((input, expected) -> {
            List<List<String>> actual = f.apply(input, TESTPRED);
            System.out.println(input + " => " + expected);
            if (!expected.equals(actual)) {
                System.out.println("  ERROR: actual was " + actual);
            }
        });
    }

    static <T> List<List<T>> splitStream(List<T> input, Predicate<? super T> pred) {
        int[] edges = IntStream.range(-1, input.size()+1)
                               .filter(i -> i == -1 || i == input.size() ||
                                       pred.test(input.get(i)))
                               .toArray();

        return IntStream.range(0, edges.length-1)
                        .mapToObj(k -> input.subList(edges[k]+1, edges[k+1]))
                        .collect(Collectors.toList());
    }

    static <T> List<List<T>> splitLoop(List<T> input, Predicate<? super T> pred) {
        List<List<T>> result = new ArrayList<>();
        int start = 0;

        for (int cur = 0; cur < input.size(); cur++) {
            if (pred.test(input.get(cur))) {
                result.add(input.subList(start, cur));
                start = cur + 1;
            }
        }
        result.add(input.subList(start, input.size()));

        return result;
    }

    public static void main(String[] args) {
        System.out.println("===== Loop =====");
        testAll(ListSplitting::splitLoop);
        System.out.println("===== Stream =====");
        testAll(ListSplitting::splitStream);
    }
}
Inflect answered 17/3, 2015 at 22:36 Comment(2)
For those who are interested in more insight: a lecture by the answerer talking about this question.Jubilee
@Jubilee Thanks for posting the link to the talk! This has prompted me to update the answer with the unit tests that I had mentioned in the talk.Inflect
C
32

The only solution I come up with for the moment is by implementing your own custom collector.

Before reading the solution, I want to add a few notes about this. I took this question more as a programming exercise, I'm not sure if it can be done with a parallel stream.

So you have to be aware that it'll silently break if the pipeline is run in parallel.

This is not a desirable behavior and should be avoided. This is why I throw an exception in the combiner part (instead of (l1, l2) -> {l1.addAll(l2); return l1;}), as it's used in parallel when combining the two lists, so that you have an exception instead of a wrong result.

Also this is not very efficient due to list copying (although it uses a native method to copy the underlying array).

So here's the collector implementation:

private static Collector<String, List<List<String>>, List<List<String>>> splitBySeparator(Predicate<String> sep) {
    final List<String> current = new ArrayList<>();
    return Collector.of(() -> new ArrayList<List<String>>(),
        (l, elem) -> {
            if (sep.test(elem)) {
                l.add(new ArrayList<>(current));
                current.clear();
            }
            else {
                current.add(elem);
            }
        },
        (l1, l2) -> {
            throw new RuntimeException("Should not run this in parallel");
        },
        l -> {
            if (current.size() != 0) {
                l.add(current);
                return l;
            }
        );
}

and how to use it:

List<List<String>> ll = list.stream().collect(splitBySeparator(Objects::isNull));

Output:

[[a, b], [c], [d, e]]


As the answer of Joop Eggen is out, it appears that it can be done in parallel (give him credit for that!). With that it reduces the custom collector implementation to:
private static Collector<String, List<List<String>>, List<List<String>>> splitBySeparator(Predicate<String> sep) {
    return Collector.of(() -> new ArrayList<List<String>>(Arrays.asList(new ArrayList<>())),
                        (l, elem) -> {if(sep.test(elem)){l.add(new ArrayList<>());} else l.get(l.size()-1).add(elem);},
                        (l1, l2) -> {l1.get(l1.size() - 1).addAll(l2.remove(0)); l1.addAll(l2); return l1;});
}

which let the paragraph about parallelism a bit obsolete, however I let it as it can be a good reminder.


Note that the Stream API is not always a substitute. There are tasks that are easier and more suitable using the streams and there are tasks that are not. In your case, you could also create a utility method for that:

private static <T> List<List<T>> splitBySeparator(List<T> list, Predicate<? super T> predicate) {
    final List<List<T>> finalList = new ArrayList<>();
    int fromIndex = 0;
    int toIndex = 0;
    for(T elem : list) {
        if(predicate.test(elem)) {
            finalList.add(list.subList(fromIndex, toIndex));
            fromIndex = toIndex + 1;
        }
        toIndex++;
    }
    if(fromIndex != toIndex) {
        finalList.add(list.subList(fromIndex, toIndex));
    }
    return finalList;
}

and call it like List<List<String>> list = splitBySeparator(originalList, Objects::isNull);.

It can be improved for checking edge-cases.

Chigetai answered 17/3, 2015 at 10:30 Comment(1)
Plus one for documenting limitations and for taking a Predicate as a parameter.Inflect
P
24

The solution is to use Stream.collect. To create a Collector using its builder pattern is already given as solution. The alternative is the other overloaded collect being a tiny bit more primitive.

    List<String> strings = Arrays.asList("a", "b", null, "c", null, "d", "e");
    List<List<String>> groups = strings.stream()
            .collect(() -> {
                List<List<String>> list = new ArrayList<>();
                list.add(new ArrayList<>());
                return list;
            },
            (list, s) -> {
                if (s == null) {
                    list.add(new ArrayList<>());
                } else {
                    list.get(list.size() - 1).add(s);
                }
            },
            (list1, list2) -> {
                // Simple merging of partial sublists would
                // introduce a false level-break at the beginning.
                list1.get(list1.size() - 1).addAll(list2.remove(0));
                list1.addAll(list2);
            });

As one sees, I make a list of string lists, where there always is at least one last (empty) string list.

  • The first function creates a starting list of string lists. It specifies the result (typed) object.
  • The second function is called to process each element. It is an action on the partial result and an element.
  • The third is not really used, it comes into play on parallelising the processing, when partial results must be combined.

A solution with an accumulator:

As @StuartMarks points out, the combiner does not fullfill the contract for parallelism.

Due to the comment of @ArnaudDenoyelle a version using reduce.

    List<List<String>> groups = strings.stream()
            .reduce(new ArrayList<List<String>>(),
                    (list, s) -> {
                        if (list.isEmpty()) {
                            list.add(new ArrayList<>());
                        }
                        if (s == null) {
                            list.add(new ArrayList<>());
                        } else {
                            list.get(list.size() - 1).add(s);
                        }
                        return list;
                    },
                    (list1, list2) -> {
                            list1.addAll(list2);
                            return list1;
                    });
  • The first parameter is the accumulated object.
  • The second function accumulates.
  • The third is the aforementioned combiner.
Paternal answered 17/3, 2015 at 11:55 Comment(10)
Using an accumulator seems to be the most appropriate way. I upvoted.Frolick
@ArnaudDenoyelle yes, looking more functional. Why not give an answer yourself? My answer wanted to allow new Stream users to actively use its functionality for programming.Paternal
I'm giving an upvote because it seems to work in parallel! Good job! You coud replace the supplier part with () -> new ArrayList<List<String>>(Arrays.asList(new ArrayList<>()))Chigetai
I added a new version of the custom collector implementation with a reference to your answer. Hope you don't mind, let me know if any :-)Chigetai
@ArnaudDenoyelle nice work, tempted me to add an accumulating version, by reduce. Somehow this "new-in-java" Stream API is nicely perverse.Paternal
@JoopEggen Actually, I meant that your version is nice because it uses an accumulator (the 2nd parameter of Stream.collect, as suggests its name) but I also like the 2nd solution :)Frolick
Nice work on the collector, especially the combiner. +1. But the version using reduce doesn't work in parallel, as the first arg is not an identity -- it gets mutated as the reduction progresses.Inflect
@Holger I think the first solution is correct. The convention is that the list to the left of the split (the rightmost element of the left-hand list) always represents an open sublist, so it's always proper to merge into it. If a split occurs immediately to the right of a null, that null would have caused the accumulator to append an empty list to the left-hand list. Thus the break is preserved.Inflect
list.get(list.size() - 1).addAll(list2.remove(0)); should be: list1.get(list1.size() - 1).addAll(list2.remove(0));Eccentric
@ReutSharabani thanks for pointing that out, I corrected the answer.Paternal
F
8

Please do not vote. I do not have enough place to explain this in comments.

This is a solution with a Stream and a foreach but this is strictly equivalent to Alexis's solution or a foreach loop (and less clear, and I could not get rid of the copy constructor) :

List<List<String>> result = new ArrayList<>();
final List<String> current = new ArrayList<>();
list.stream().forEach(s -> {
      if (s == null) {
        result.add(new ArrayList<>(current));
        current.clear();
      } else {
        current.add(s);
      }
    }
);
result.add(current);

System.out.println(result);

I understand that you want to find a more elegant solution with Java 8 but I truly think that it has not been designed for this case. And as said by Mr spoon, highly prefer the naive way in this case.

Frolick answered 17/3, 2015 at 11:16 Comment(0)
A
5

Although the answer of Marks Stuart is concise, intuitive and parallel safe (and the best), I want to share another interesting solution that doesn't need the start/end boundaries trick.

If we look at the problem domain and think about parallelism, we can easy solve this with a divide-and-conquer strategy. Instead of thinking about the problem as a serial list we have to traverse, we can look at the problem as a composition of the same basic problem: splitting a list at a null value. We can intuitively see quite easily that we can recursively break down the problem with the following recursive strategy:

split(L) :
  - if (no null value found) -> return just the simple list
  - else -> cut L around 'null' naming the resulting sublists L1 and L2
            return split(L1) + split(L2)

In this case, we first search any null value and the moment find one, we immediately cut the list and invoke a recursive call on the sublists. If we don't find null (the base case), we are finished with this branch and just return the list. Concatenating all the results will return the List we are searching for.

A picture is worth a thousand words:

enter image description here

The algorithm is simple and complete: we don't need any special tricks to handle the edge cases of the start/end of the list. We don't need any special tricks to handle edge cases such as empty lists, or lists with only null values. Or lists ending with null or starting with null.

A simple naive implementation of this strategy looks as follows:

public List<List<String>> split(List<String> input) {

    OptionalInt index = IntStream.range(0, input.size())
                                 .filter(i -> input.get(i) == null)
                                 .findAny();

    if (!index.isPresent())
        return asList(input);

    List<String> firstHalf  = input.subList(0, index.getAsInt());
    List<String> secondHalf = input.subList(index.getAsInt()+1, input.size());

    return asList(firstHalf, secondHalf).stream()
                 .map(this::split)
                 .flatMap(List::stream)
                 .collect(toList());

}

We first search for the index of any null value in the list. If we don't find one, we return the list. If we find one, we split the list in 2 sublists, stream over them and recursively call the split method again. The resulting lists of the sub-problem are then extracted and combined for the return value.

Remark that the 2 streams can easily be made parallel() and the algorithm will still work because of the functional decomposition of the problem.

Although the code is already pretty concise, it can always be adapted in numerous ways. For the sake of an example, instead of checking the optional value in the base case, we could take advantage of the orElse method on the OptionalInt to return the end-index of the list, enabling us to re-use the second stream and additionally filter out empty lists:

public List<List<String>> split(List<String> input) {

    int index =  IntStream.range(0, input.size())
                          .filter(i -> input.get(i) == null)
                          .findAny().orElse(input.size());

    return asList(input.subList(0, index), input.subList(index+1, input.size())).stream()
                 .map(this::split)
                 .flatMap(List::stream)
                 .filter(list -> !list.isEmpty())
                 .collect(toList());
}

The example is only given to indicate the mere simplicity, adaptability and elegance of a recursive approach. Indeed, this version would introduce a small performance penalty and fail if the input was empty (and as such might need an extra empty check).

In this case, recursion might probably not be the best solution (Stuart Marks algorithm to find indexes is only O(N) and mapping/splitting lists has a significant cost), but it expresses the solution with a simple, intuitive parallelizable algorithm without any side effects.

I won't digg deeper into the complexity and advantages/disadvantages or use cases with stop criteria and/or partial result availability. I just felt the need to share this solution strategy, since the other approaches were merely iterative or using an overly complex solution algorithm that was not parallelizable.

Ageold answered 13/11, 2016 at 11:47 Comment(0)
E
4

Here's another approach, which uses a grouping function, which makes use of list indices for grouping.

Here I'm grouping the element by the first index following that element, with value null. So, in your example, "a" and "b" would be mapped to 2. Also, I'm mapping null value to -1 index, which should be removed later on.

List<String> list = Arrays.asList("a", "b", null, "c", null, "d", "e");

Function<String, Integer> indexGroupingFunc = (str) -> {
             if (str == null) {
                 return -1;
             }
             int index = list.indexOf(str) + 1;
             while (index < list.size() && list.get(index) != null) {
                 index++;
             }
             return index;
         };

Map<Integer, List<String>> grouped = list.stream()
               .collect(Collectors.groupingBy(indexGroupingFunc));

grouped.remove(-1);  // Remove null elements grouped under -1
System.out.println(grouped.values()); // [[a, b], [c], [d, e]]

You can also avoid getting the first index of null element every time, by caching the current min index in an AtomicInteger. The updated Function would be like:

AtomicInteger currentMinIndex = new AtomicInteger(-1);

Function<String, Integer> indexGroupingFunc = (str) -> {
        if (str == null) {
            return -1;
        }
        int index = names.indexOf(str) + 1;

        if (currentMinIndex.get() > index) {
            return currentMinIndex.get();
        } else {
            while (index < names.size() && names.get(index) != null) {
              index++;
            }
            currentMinIndex.set(index);
            return index;
        }
    };
Ethicize answered 17/3, 2015 at 11:7 Comment(0)
C
3

This is a very interesting problem. I came up with a one line solution. It might not very performant but it works.

List<String> list = Arrays.asList("a", "b", null, "c", null, "d", "e");
Collection<List<String>> cl = IntStream.range(0, list.size())
    .filter(i -> list.get(i) != null).boxed()
    .collect(Collectors.groupingBy(
        i -> IntStream.range(0, i).filter(j -> list.get(j) == null).count(),
        Collectors.mapping(i -> list.get(i), Collectors.toList()))
    ).values();

It is a similar idea that @Rohit Jain came up with. I'm grouping the space between the null values. If you really want a List<List<String>> you may append:

List<List<String>> ll = cl.stream().collect(Collectors.toList());
Congratulation answered 17/3, 2015 at 13:7 Comment(2)
1 line with 300 columns ;)Bitartrate
I think you are not aware that lambda are not incompatible with well-formatted code.Bitartrate
R
3

Well, after a bit of work U have come up with a one-line stream-based solution. It ultimately uses reduce() to do the grouping, which seemed the natural choice, but it was a bit ugly getting the strings into the List<List<String>> required by reduce:

List<List<String>> result = list.stream()
  .map(Arrays::asList)
  .map(x -> new LinkedList<String>(x))
  .map(Arrays::asList)
  .map(x -> new LinkedList<List<String>>(x))
  .reduce( (a, b) -> {
    if (b.getFirst().get(0) == null) 
      a.add(new LinkedList<String>());
    else
      a.getLast().addAll(b.getFirst());
    return a;}).get();

It is however 1 line!

When run with input from the question,

System.out.println(result);

Produces:

[[a, b], [c], [d, e]]
Revelatory answered 18/3, 2015 at 14:23 Comment(1)
Hmmm... the reducer doesn't appear to be associative. Clever way to get each string into a LinkedList<List<String>>. I bet that was a puzzle.Inflect
S
1

Here is code by abacus-common

List<String> list = N.asList(null, null, "a", "b", null, "c", null, null, "d", "e");
Stream.of(list).splitIntoList(null, (e, any) -> e == null, null).filter(e -> e.get(0) != null).forEach(N::println);

Declaration: I'm the developer of abacus-common.

Skipton answered 24/11, 2016 at 17:6 Comment(0)
M
1

Group by different token whenever you find a null (or separator). I used here a different integer (used atomic just as holder)

Then remap the generated map to transform it into a list of lists.

AtomicInteger i = new AtomicInteger();
List<List<String>> x = Stream.of("A", "B", null, "C", "D", "E", null, "H", "K")
      .collect(Collectors.groupingBy(s -> s == null ? i.incrementAndGet() : i.get()))
      .entrySet().stream().map(e -> e.getValue().stream().filter(v -> v != null).collect(Collectors.toList()))
      .collect(Collectors.toList());

System.out.println(x);
Mesdemoiselles answered 19/1, 2017 at 13:57 Comment(1)
Nice one Shadi !Contrasty
P
0

In my StreamEx library there's a groupRuns method which can help you to solve this:

List<String> input = Arrays.asList("a", "b", null, "c", null, "d", "e");
List<List<String>> result = StreamEx.of(input)
        .groupRuns((a, b) -> a != null && b != null)
        .remove(list -> list.get(0) == null).toList();

The groupRuns method takes a BiPredicate which for the pair of adjacent elements returns true if they should be grouped. After that we remove groups containing nulls and collect the rest to the List.

This solution is parallel-friendly: you may use it for parallel stream as well. Also it works nice with any stream source (not only random access lists like in some other solutions) and it's somewhat better than collector-based solutions as here you can use any terminal operation you want without intermediate memory waste.

Pinzler answered 15/7, 2015 at 10:20 Comment(2)
that's a very neat method. why remove instead of a filter there?Lenoralenore
@Eugene, that's a matter of preference. Use .filter(list -> list.get(0) != null) if you like it better.Pinzler
D
0

With String one can do:

String s = ....;
String[] parts = s.split("sth");

If all sequential collections (as the String is a sequence of chars) had this abstraction this could be doable for them too:

List<T> l = ...
List<List<T>> parts = l.split(condition) (possibly with several overloaded variants)

If we restrict the original problem to List of Strings (and imposing some restrictions on it's elements contents) we could hack it like this:

String als = Arrays.toString(new String[]{"a", "b", null, "c", null, "d", "e"});
String[] sa = als.substring(1, als.length() - 1).split("null, ");
List<List<String>> res = Stream.of(sa).map(s -> Arrays.asList(s.split(", "))).collect(Collectors.toList());

(please don't take it seriously though :))

Otherwise, plain old recursion also works:

List<List<String>> part(List<String> input, List<List<String>> acc, List<String> cur, int i) {
    if (i == input.size()) return acc;
    if (input.get(i) != null) {
        cur.add(input.get(i));
    } else if (!cur.isEmpty()) {
        acc.add(cur);
        cur = new ArrayList<>();
    }
    return part(input, acc, cur, i + 1);
}

(note in this case null has to be appended to the input list)

part(input, new ArrayList<>(), new ArrayList<>(), 0)
Dispersal answered 8/12, 2016 at 19:2 Comment(0)
N
0

I was watching the video on Thinking in Parallel by Stuart. So decided to solve it before seeing his response in the video. Will update the solution with time. for now

Arrays.asList(IntStream.range(0, abc.size()-1).
filter(index -> abc.get(index).equals("#") ).
map(index -> (index)).toArray()).
stream().forEach( index -> {for (int i = 0; i < index.length; i++) {
                    if(sublist.size()==0){
                        sublist.add(new ArrayList<String>(abc.subList(0, index[i])));
                    }else{

                    sublist.add(new ArrayList<String>(abc.subList(index[i]-1, index[i])));
                    }
                }
    sublist.add(new ArrayList<String>(abc.subList(index[index.length-1]+1, abc.size())));
});
Neman answered 4/7, 2017 at 20:45 Comment(0)
P
0

Actually I found a pretty general solution to this one, basically using takeWhile method stream API

So input given:

[a, b, null, c, null, d, e]

Apply the split function:

[[a, b], [null], [c], [null], [d, e]]

Just filter it with each sublist not containing null:

[a, b], [c], [d, e]]

The below code iterates a list with a increase delta(this allows to skip to next (right) index of the list). To each representative it maps into the sublist when the takeWhile method as given me some output. And it's done, now you have either right matchings and wrong matchings so just filter the output to the desired predicate, in your case x -> !x.contains(null) or better appliedSplitList.stream().filter(l -> l.stream().allMatch(Objects::nonNull)).collect(Collectors.toList()) , and you're welcome.

private int delta;

public <X> List<List<X>> split (List<X> list, Predicate<X> pred) {
    return Stream.iterate(0, d -> d < list.size(), i -> i + delta)
        .map (k -> 
            Optional.of(subList(k, list, pred))
                .filter(h-> !h.isEmpty())
                .orElse(subList(k, list, pred.negate()))
        )
        .peek(t -> delta = t.size())
        .collect(Collectors.toList());
}

private <X> List<X> subList(final int offset, final List<X> list, final, Predicate<X> pred) {
    return list.stream().skip(offset).takeWhile(pred).collect(Collectors.toList());
}

Priam answered 30/12, 2023 at 23:44 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.