Cumulative Sum using Java 8 stream API
Asked Answered
N

6

21

I have a List of Integer say list1, and I want to get another list list2 which will contain the cumulative sum up until the current index from start. How can I do this using Stream API java 8 ?

List<Integer> list1 = new ArrayList<>();
list1.addAll(Arrays.asList(1, 2, 3, 4));
List<Integer> list2 = new ArrayList<>();
// initialization
list2.add(list1.get(0));
for(int i=1;i<list1.size();i++) {
// increment step
    list2.add(list2.get(i-1) + list1.get(i));
}

How can I change above imperative style code into declarative one ?

list2 should be [1, 3, 6, 10]
Nardone answered 20/3, 2019 at 16:32 Comment(4)
You'll notice from the answers that any solution using streams is going to be inefficient. Streams aren't really intended for this use case. (In general, streams aren't intended to replace all imperative code.)Muscle
@LouisWasserman I agree. But in this case I do not care about efficiency (probably should have mentioned it in the question). I wanted to write the same thing with something other than above mentioned imperative style. I could not do it myself, as I was stuck because the solution is kind of mixture of map and reduce operationNardone
Man, the lack of tuples really hurts here.Trelliswork
An extremely similar question was asked yesterday: #55230761Guimar
R
21

Streams are not suited for this kind of task, as there is state involved (the cumulative partial sum). Instead, you could use Arrays.parallelPrefix:

Integer[] arr = list1.toArray(Integer[]::new);

Arrays.parallelPrefix(arr, Integer::sum);

List<Integer> list2 = Arrays.asList(arr);

This first copies list1 to an array by using Collection.toArray, which is available since JDK 11. If you are not on Java 11 yet, you could replace the first line with the traditional toArray call:

Integer[] arr = list1.toArray(new Integer[0]);

This solution doesn't use streams, yet it's declarative, because Arrays.parallelPrefix receives the cumulative operation as an argument (Integer::sum in this case).

Time complexity is O(N), though there might be some non-minor constant costs involved associated with setting up the infrastructure needed for parallel processing. However, according to the docs:

Parallel prefix computation is usually more efficient than sequential loops for large arrays

So it seems it's worth giving this approach a try.

Also, it's worth mentioning that this approach works because Integer::sum is an associative operation. This is a requirement.

Rhineland answered 20/3, 2019 at 16:54 Comment(4)
Never heard of parallel prefix. thank you very much for teaching me a new concept. :). I will definitely give a try.Nardone
@Nardone Here's more reading about the subject.Rhineland
Don’t make your life unnecessary hard. Use list1.toArray(new Integer[0]). As elaborated in this great article, using a zero size is even more efficient in the commonly used Hotspot JVM. That’s one of the reasons why JDK 11’s new method just does the same: “The default implementation calls the generator function with zero and then passes the resulting array to toArray(T[]).Hydrofoil
@Nardone also worth reading: How does newly introduced Arrays.parallelPrefix(…) in Java 8 work?Hydrofoil
M
6

For every index: iterate from zero to that index, get each element, and get the sum
Box the ints to Integers
Collect to a list

IntStream.range(0, list1.size())
    .map(i -> IntStream.rangeClosed(0, i).map(list1::get).sum())
    .boxed()
    .collect(Collectors.toList());

You're adding every number together every time, rather than reusing the previous cumulative result, but streams do not lend themselves to looking at results from previous iterations.

You could write your own collector but at this point, honestly why are you even bothering with streams?

list1.stream()
    .collect(
        Collector.of(
            ArrayList::new,
            (a, b) -> a.add(a.isEmpty() ? b : b + a.get(a.size() - 1)),
            (a, b) -> { throw new UnsupportedOperationException(); }
        )
    );
Massage answered 20/3, 2019 at 16:41 Comment(3)
That's O(n^2) !! (as opposed to the O(n) of the OP)Gump
@Gump Yes.Massage
@Massage thanks for your elaborate response. I have learnt that I can write my own collector. I know this is overkill. But good to know that there are other ways.Nardone
S
6

An O(n) (works only sequentially) solution would be the following, but I don't find it very elegant. I guess it is a matter of taste

AtomicInteger ai = new AtomicInteger();
List<Integer> collect = list1.stream()
                             .map(ai::addAndGet)
                             .collect(Collectors.toList());
System.out.println(collect); // [1, 3, 6, 10]
Spurrier answered 20/3, 2019 at 16:49 Comment(4)
This solution only works as long as the stream is sequential. Parallelising it would break this completely.Toler
@ben In parallel stream you couldn't have cumulative sum right?Illene
Ruslan's answer allows for parallel streaming. The only stateful operation is over the original list itself, which we can probably assume with relative safety is read-only. Each item in the list in his solution could query over the list independently of the other stream elements.Toler
@VenkataraghavanYanamandram with other solutions yes, but they are O(n^2)Spurrier
C
3

You can use sublist to sum up until the current index from start:

List<Integer> list = IntStream.range(0, list1.size())
        .mapToObj(i -> list1.subList(0, i + 1).stream().mapToInt(Integer::intValue).sum())
        .collect(Collectors.toList());
Corkscrew answered 20/3, 2019 at 16:43 Comment(1)
Highly inefficient. O(N^2) vs O(N) of other answers.Laurinelaurita
T
3

You can just use Stream.collect() for that:

List<Integer> list1 = Arrays.asList(1, 2, 3, 4);
List<Integer> list2 = list1.stream()
        .collect(ArrayList::new, (sums, number) -> {
            if (sums.isEmpty()) {
                sums.add(number);
            } else {
                sums.add(sums.get(sums.size() - 1) + number);
            }
        }, (sums1, sums2) -> {
            if (!sums1.isEmpty()) {
                int sum = sums1.get(sums1.size() - 1);
                sums2.replaceAll(num -> sum + num);
            }
            sums1.addAll(sums2);
        });

This solution also works for parallel streams. Use list1.parallelStream() or list1.stream().parallel() instead of list1.stream().

The result in both cases is: [1, 3, 6, 10]

Talavera answered 20/3, 2019 at 17:54 Comment(3)
This should be the accepted answer as it is correct regardless whether the stream is parallel or not and O(n) in any case.Kwangtung
Here's my take at making this stateless: https://mcmap.net/q/659652/-is-it-possible-to-calculate-values-in-a-stream-based-on-themselvesKwangtung
@Kwangtung This is not O(n). Notice that in combiner sums2.replaceAll(num -> sum + num); line is O(n) instead of O(1). In worst case this will be O(n^2) which will occur when parallel processing is done for each individual element in the array.Laurinelaurita
A
1

The JEP 461: Stream Gatherers Java 22 preview language feature adds built-in support for this sort of cumulative accumulation:

// [1, 3, 6, 10]
List<Integer> incrementalSums = Stream.of(1, 2, 3, 4)
        .gather(Gatherers.scan(() -> 0, Integer::sum))
        .toList();

This uses the new Stream.gather method with the new built-in Gatherers.scan gatherer to convert the initial stream to a stream of sums.

Javadocs

Gatherer:

An intermediate operation that transforms a stream of input elements into a stream of output elements, optionally applying a final action when the end of the upstream is reached. […]

[…]

There are many examples of gathering operations, including but not limited to: grouping elements into batches (windowing functions); de-duplicating consecutively similar elements; incremental accumulation functions (prefix scan); incremental reordering functions, etc. The class Gatherers provides implementations of common gathering operations.

Stream.gather:

Returns a stream consisting of the results of applying the given gatherer to the elements of this stream.

Gatherers.scan

Returns a Gatherer that performs a Prefix Scan -- an incremental accumulation -- using the provided functions. Starting with an initial value obtained from the Supplier, each subsequent value is obtained by applying the BiFunction to the current value and the next input element, after which the resulting value is produced downstream.

Example:

// will contain: ["1", "12", "123", "1234", "12345", "123456", "1234567", "12345678", "123456789"]
List<String> numberStrings =
    Stream.of(1,2,3,4,5,6,7,8,9)
          .gather(
              Gatherers.scan(() -> "", (string, number) -> string + number)
           )
          .toList();
Achelous answered 22/12, 2023 at 5:31 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.