How to find the cumulative sum of numbers in a list?

Asked 8/4, 2013 at 21:15 Answered 3/1, 2023 at 9:21

135

time_interval = [4, 6, 12]

I want to sum up the numbers like [4, 4+6, 4+6+12] in order to get the list t = [4, 10, 22].

I tried the following:

t1 = time_interval[0]
t2 = time_interval[1] + t1
t3 = time_interval[2] + t2
print(t1, t2, t3)  # -> 4 10 22

Hughes answered 8/4, 2013 at 21:15 Comment(1)

See also https://mcmap.net/q/168668/-elegant-pythonic-cumsum-duplicate – Polik 20/8, 2016 at 14:59

170

If you're doing much numerical work with arrays like this, I'd suggest numpy, which comes with a cumulative sum function cumsum:

import numpy as np

a = [4,6,12]

np.cumsum(a)
#array([4, 10, 22])

Numpy is often faster than pure python for this kind of thing, see in comparison to @Ashwini's accumu:

In [136]: timeit list(accumu(range(1000)))
10000 loops, best of 3: 161 us per loop

In [137]: timeit list(accumu(xrange(1000)))
10000 loops, best of 3: 147 us per loop

In [138]: timeit np.cumsum(np.arange(1000))
100000 loops, best of 3: 10.1 us per loop

But of course if it's the only place you'll use numpy, it might not be worth having a dependence on it.

Silverfish answered 8/4, 2013 at 21:31 Comment(10)

This should have a np.cumsun case that starts with a list, to take into account the conversion time. – Polik 20/8, 2016 at 14:56

Good point @hpaulj, for those starting from (or aiming for) a list I would not recommend numpy. – Silverfish 23/8, 2016 at 19:9

I don't think numpy is fastest #15889631 – Mispleading 16/9, 2016 at 15:18

Agreed, as I mentioned above. Avoiding reactions like yours and @hpaulj's is why I tried to limit its scope in the very first and last lines of my answer :-/ – Silverfish 16/9, 2016 at 15:54

@Silverfish is there any reason that the np.cumsum timing has 100k loops whereas the other two have only 10k loops? I don't think it will make a material difference, just wondering. – Bellyache 7/6, 2018 at 18:9

@alex: Using timeit, "if -n is not given, a suitable number of loops is calculated by trying successive powers of 10 until the total time is at least 0.2 seconds." If you expect it to make a difference, you can supply -n 1000 to make them all equivalent. – Silverfish 13/6, 2018 at 18:23

I would suggest np.cumsum for np.array lists and itertools.accumulate otherwise. This answer is still the best one, but it's outdated. – Lowgrade 17/1, 2020 at 23:26

Please add run complexity is it implemented in O(n) or O(n^2)? – Amortize 16/6, 2020 at 10:40

Why is this answer not on the top? There seems to be a problem with the ranking system. – Rhizotomy 3/9, 2021 at 6:6

In this answer the np.cumsum version does not do the conversion back to a list. This operation may change the ranking – Dotdotage 27/4, 2022 at 15:19

126

In Python 2 you can define your own generator function like this:

def accumu(lis):
    total = 0
    for x in lis:
        total += x
        yield total

In [4]: list(accumu([4,6,12]))
Out[4]: [4, 10, 22]

And in Python 3.2+ you can use itertools.accumulate():

In [1]: lis = [4,6,12]

In [2]: from itertools import accumulate

In [3]: list(accumulate(lis))
Out[3]: [4, 10, 22]

Brit answered 8/4, 2013 at 21:19 Comment(3)

PEP 572 -- Assignment Expressions (expected for Python 3.8) shows an interesting alternative total = 0; partial_sums = [total := total + v for v in values]. I would still expect accumulate to be faster. – Extenuate 15/7, 2018 at 16:46

@StevenRumbalski Man, I personally think that's the worst PEP ever. Bad enough... – Brit 15/7, 2018 at 23:10

@AshwiniChaudhary: This example may let the PEP look worse than it is. It has some quite readable and very useful applications. For instance, if (value := some_dict.get(key)) is not None: ... which avoids the notorious double lookup. Or also in list comprehensions with an if filter like [func_result for x in xs if satisfies_property(func_result := f(x))] which also avoids the notorious double evaluation of the function f(x). – Avignon 25/6, 2022 at 20:35

Try the itertools.accumulate() function.

import itertools  

list(itertools.accumulate([1,2,3,4,5]))
# [1, 3, 6, 10, 15]

Disparagement answered 13/12, 2018 at 3:51 Comment(1)

You don't need to to pass operator.add as the default operation is addition anyway. – Hogle 23/3, 2019 at 18:18

I did a bench-mark of the top two answers with Python 3.4 and I found itertools.accumulate is faster than numpy.cumsum under many circumstances, often much faster. However, as you can see from the comments, this may not always be the case, and it's difficult to exhaustively explore all options. (Feel free to add a comment or edit this post if you have further benchmark results of interest.)

Some timings...

For short lists accumulate is about 4 times faster:

from timeit import timeit

def sum1(l):
    from itertools import accumulate
    return list(accumulate(l))

def sum2(l):
    from numpy import cumsum
    return list(cumsum(l))

l = [1, 2, 3, 4, 5]

timeit(lambda: sum1(l), number=100000)
# 0.4243644131347537
timeit(lambda: sum2(l), number=100000)
# 1.7077815784141421

For longer lists accumulate is about 3 times faster:

l = [1, 2, 3, 4, 5]*1000
timeit(lambda: sum1(l), number=100000)
# 19.174508565105498
timeit(lambda: sum2(l), number=100000)
# 61.871223849244416

If the numpy array is not cast to list, accumulate is still about 2 times faster:

from timeit import timeit

def sum1(l):
    from itertools import accumulate
    return list(accumulate(l))

def sum2(l):
    from numpy import cumsum
    return cumsum(l)

l = [1, 2, 3, 4, 5]*1000

print(timeit(lambda: sum1(l), number=100000))
# 19.18597290944308
print(timeit(lambda: sum2(l), number=100000))
# 37.759664884768426

If you put the imports outside of the two functions and still return a numpy array, accumulate is still nearly 2 times faster:

from timeit import timeit
from itertools import accumulate
from numpy import cumsum

def sum1(l):
    return list(accumulate(l))

def sum2(l):
    return cumsum(l)

l = [1, 2, 3, 4, 5]*1000

timeit(lambda: sum1(l), number=100000)
# 19.042188624851406
timeit(lambda: sum2(l), number=100000)
# 35.17324400227517

Mispleading answered 16/9, 2016 at 15:12 Comment(6)

You wouldn't expect an airplane to be faster than the train to travel across town, especially including ticket purchase and security screening. Likewise you wouldn't use numpy to process a list of five items, especially if you are unwilling to accept an array in return. If the list in question is really so short, then their running time would be inconsequential---dependencies and legibility would surely dominate. But wide usage of a list of uniform numerical data type of significant length would be silly; for that, a numpy array would be appropriate, and usually faster. – Silverfish 16/9, 2016 at 16:7

@Silverfish well I don't just find this for short lists and the OP's question asks for a list as output rather than a numpy array. Perhaps you can edit your answer to be clearer on when each use is appropriate :) – Mispleading 16/9, 2016 at 16:23

@Silverfish In fact I've edited my answer with a much more detailed comparison. Under no circumstances, do I find numpy to be faster, unless I've overlooked something? – Mispleading 16/9, 2016 at 18:12

Oh my, yes indeed :) I wouldn't say you've overlooked something, but the comparison is hard to make in isolation without considering your inputs and outputs. Most of the time in your sum2 function is probably in converting l into an array. Try timing a = np.array(l) and np.cumsum(a) separately. Then try a = np.tile(np.arange(1, 6), 1000) vs l = [1,2,3,4,5]*1000. In a program conducting other numerical processes (like the creation or loading of l in the first place) your working data would probably be in an array already, and the creation would be a constant cost. – Silverfish 16/9, 2016 at 18:39

@Silverfish I got the same idea as you and therefore I did time the a = np.array(l). For the sum2 without the transformation to list, and with a numpy array as input, sum2 is 5 times faster thank sum1 in my computer in case of the long list/array. – Selden 24/3, 2017 at 15:47

Please add run time complexity for each implementation – Amortize 16/6, 2020 at 10:42

Behold:

a = [4, 6, 12]
reduce(lambda c, x: c + [c[-1] + x], a, [0])[1:]

Will output (as expected):

[4, 10, 22]

Labyrinthodont answered 9/10, 2015 at 9:43 Comment(2)

Not efficient. The total expense of performing c + [c[-1] + x] over and over adds up to a total runtime quadratic in the input length. – Syverson 22/5, 2017 at 16:36

reduce is good for a one-off cumulative sum, but if you're doing a lot of calls to your cumsum function a generator will be useful to "preprocess" your cumulative_sum values and access them in O(1) for each subsequent call. – Retrorse 1/7, 2017 at 14:6

Assignment expressions from PEP 572 (new in Python 3.8) offer yet another way to solve this:

time_interval = [4, 6, 12]

total_time = 0
cum_time = [total_time := total_time + t for t in time_interval]

Extenuate answered 15/7, 2018 at 16:54 Comment(0)

You can calculate the cumulative sum list in linear time with a simple for loop:

def csum(lst):
    s = lst.copy()
    for i in range(1, len(s)):
        s[i] += s[i-1]
    return s

time_interval = [4, 6, 12]
print(csum(time_interval))  # [4, 10, 22]

The standard library's itertools.accumulate may be a faster alternative (since it's implemented in C):

from itertools import accumulate
time_interval = [4, 6, 12]
print(list(accumulate(time_interval)))  # [4, 10, 22]

Hogle answered 23/3, 2019 at 16:59 Comment(0)

Since python 3.8 it's possible to use Assignment expressions, so things like this became easier to implement

nums = list(range(1, 10))
print(f'array: {nums}')

v = 0
cumsum = [v := v + n for n in nums]
print(f'cumsum: {cumsum}')

produces

array: [1, 2, 3, 4, 5, 6, 7, 8, 9]
cumsum: [1, 3, 6, 10, 15, 21, 28, 36, 45]

The same technique can be applied to find the cum product, mean, etc.

p = 1
cumprod = [p := p * n for n in nums]
print(f'cumprod: {cumprod}')

s = 0
c = 0
cumavg = [(s := s + n) / (c := c + 1) for n in nums]
print(f'cumavg: {cumavg}')

results in

cumprod: [1, 2, 6, 24, 120, 720, 5040, 40320, 362880]
cumavg: [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]

Queston answered 16/8, 2021 at 10:18 Comment(0)

If You want a pythonic way without numpy working in 2.7 this would be my way of doing it

l = [1,2,3,4]
_d={-1:0}
cumsum=[_d.setdefault(idx, _d[idx-1]+item) for idx,item in enumerate(l)]

now let's try it and test it against all other implementations

import timeit, sys
L=list(range(10000))
if sys.version_info >= (3, 0):
    reduce = functools.reduce
    xrange = range


def sum1(l):
    cumsum=[]
    total = 0
    for v in l:
        total += v
        cumsum.append(total)
    return cumsum


def sum2(l):
    import numpy as np
    return list(np.cumsum(l))

def sum3(l):
    return [sum(l[:i+1]) for i in xrange(len(l))]

def sum4(l):
    return reduce(lambda c, x: c + [c[-1] + x], l, [0])[1:]

def this_implementation(l):
    _d={-1:0}
    return [_d.setdefault(idx, _d[idx-1]+item) for idx,item in enumerate(l)]


# sanity check
sum1(L)==sum2(L)==sum3(L)==sum4(L)==this_implementation(L)
>>> True    

# PERFORMANCE TEST
timeit.timeit('sum1(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.001018061637878418

timeit.timeit('sum2(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.000829620361328125

timeit.timeit('sum3(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.4606760001182556 

timeit.timeit('sum4(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.18932826995849608

timeit.timeit('this_implementation(L)','from __main__ import sum1,sum2,sum3,sum4,this_implementation,L', number=100)/100.
>>> 0.002348129749298096

Chanson answered 16/11, 2017 at 15:38 Comment(0)

There could be many answers for this depending on the length of the list and the performance. One very simple way which I can think without thinking of the performance is this:

a = [1, 2, 3, 4]
a = [sum(a[0:x]) for x in range(1, len(a)+1)]
print(a)

[1, 3, 6, 10]

This is by using list comprehension and this may work fairly well it is just that here I am adding over the subarray many times, you could possibly improvise on this and make it simple!

Cheers to your endeavor!

Cahier answered 6/5, 2020 at 13:47 Comment(1)

This approach is O(n²), so it only makes sense to use it for tiny lists. – Hogle 11/9, 2021 at 18:40

First, you want a running list of subsequences:

subseqs = (seq[:i] for i in range(1, len(seq)+1))

Then you just call sum on each subsequence:

sums = [sum(subseq) for subseq in subseqs]

(This isn't the most efficient way to do it, because you're adding all of the prefixes repeatedly. But that probably won't matter for most use cases, and it's easier to understand if you don't have to think of the running totals.)

If you're using Python 3.2 or newer, you can use itertools.accumulate to do it for you:

sums = itertools.accumulate(seq)

And if you're using 3.1 or earlier, you can just copy the "equivalent to" source straight out of the docs (except for changing next(it) to it.next() for 2.5 and earlier).

Heikeheil answered 8/4, 2013 at 21:19 Comment(4)

This runs in quadratic time (maybe that doesn't matter for the OP, but worth mentioning). – Jackquelinejackrabbit 8/4, 2013 at 21:24

First, when N=3, who cares about quadratic time? And I don't think it's overcomplicated. It's two very simple steps, each transforming one iterator into another, directly translating the English-language description. (The fact that he's using an uncommon way of defining series, where the 0-length prefix isn't counted, does make it a bit more complicated… but that's inherent in the problem, and I thought it was better to put that in the range than to hack around it by doing [1:] at the end, or to ignore it.) – Heikeheil 8/4, 2013 at 21:26

Presumably the OP's actual problem isn't to get the partial sums of [4,6,12] since, as he wrote in the question, he already knows what that is! – Jackquelinejackrabbit 8/4, 2013 at 21:27

@ChrisTaylor: He explicitly said that he already knows how to write this, but wants "an easier way to write it". – Heikeheil 8/4, 2013 at 21:35

values = [4, 6, 12]
total  = 0
sums   = []

for v in values:
  total = total + v
  sums.append(total)

print 'Values: ', values
print 'Sums:   ', sums

Running this code gives

Values: [4, 6, 12]
Sums:   [4, 10, 22]

Jackquelinejackrabbit answered 8/4, 2013 at 21:21 Comment(0)

Try this:

result = []
acc = 0
for i in time_interval:
    acc += i
    result.append(acc)

Sola answered 8/4, 2013 at 21:21 Comment(0)

lst = [4, 6, 12]

[sum(lst[:i+1]) for i in xrange(len(lst))]

If you are looking for a more efficient solution (bigger lists?) a generator could be a good call (or just use numpy if you really care about performance).

def gen(lst):
    acu = 0
    for num in lst:
        yield num + acu
        acu += num

print list(gen([4, 6, 12]))

Subminiaturize answered 8/4, 2013 at 21:24 Comment(0)

l = [1,-1,3]
cum_list = l

def sum_list(input_list):
    index = 1
    for i in input_list[1:]:
        cum_list[index] = i + input_list[index-1]
        index = index + 1 
    return cum_list

print(sum_list(l))

Suint answered 8/8, 2019 at 17:32 Comment(0)

In Python3, To find the cumulative sum of a list where the ith element is the sum of the first i+1 elements from the original list, you may do:

a = [4 , 6 , 12]
b = []
for i in range(0,len(a)):
    b.append(sum(a[:i+1]))
print(b)

OR you may use list comprehension:

b = [sum(a[:x+1]) for x in range(0,len(a))]

Output

[4,10,22]

Life answered 30/8, 2019 at 10:10 Comment(1)

This looks right but can drop a link to the documentation, without that I cannot upvote. – Xe 16/6, 2020 at 18:3

In [42]: a = [4, 6, 12]

In [43]: [sum(a[:i+1]) for i in xrange(len(a))]
Out[43]: [4, 10, 22]

This is slighlty faster than the generator method above by @Ashwini for small lists

In [48]: %timeit list(accumu([4,6,12]))
  100000 loops, best of 3: 2.63 us per loop

In [49]: %timeit [sum(a[:i+1]) for i in xrange(len(a))]
  100000 loops, best of 3: 2.46 us per loop

For larger lists, the generator is the way to go for sure. . .

In [50]: a = range(1000)

In [51]: %timeit [sum(a[:i+1]) for i in xrange(len(a))]
  100 loops, best of 3: 6.04 ms per loop

In [52]: %timeit list(accumu(a))
  10000 loops, best of 3: 162 us per loop

Rembert answered 8/4, 2013 at 21:27 Comment(2)

You're timing for just 3 item list, try for 10^4 items. – Brit 8/4, 2013 at 21:32

True, for larger lists the generator is far faster! – Rembert 8/4, 2013 at 21:33

Somewhat hacky, but seems to work:

def cumulative_sum(l):
  y = [0]
  def inc(n):
    y[0] += n
    return y[0]
  return [inc(x) for x in l]

I did think that the inner function would be able to modify the y declared in the outer lexical scope, but that didn't work, so we play some nasty hacks with structure modification instead. It is probably more elegant to use a generator.

Piecemeal answered 30/12, 2013 at 9:52 Comment(0)

Without having to use Numpy, you can loop directly over the array and accumulate the sum along the way. For example:

a=range(10)
i=1
while((i>0) & (i<10)):
    a[i]=a[i-1]+a[i]
    i=i+1
print a

Results in:

[0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

Sestet answered 11/2, 2014 at 15:26 Comment(0)

A pure python oneliner for cumulative sum:

cumsum = lambda X: X[:1] + cumsum([X[0]+X[1]] + X[2:]) if X[1:] else X

This is a recursive version inspired by recursive cumulative sums. Some explanations:

The first term X[:1] is a list containing the previous element and is almost the same as [X[0]] (which would complain for empty lists).
The recursive cumsum call in the second term processes the current element [1] and remaining list whose length will be reduced by one.
if X[1:] is shorter for if len(X)>1.

Test:

cumsum([4,6,12])
#[4, 10, 22]

cumsum([])
#[]

And simular for cumulative product:

cumprod = lambda X: X[:1] + cumprod([X[0]*X[1]] + X[2:]) if X[1:] else X

Test:

cumprod([4,6,12])
#[4, 24, 288]

Frannie answered 23/8, 2018 at 1:26 Comment(0)

Here's another fun solution. This takes advantage of the locals() dict of a comprehension, i.e. local variables generated inside the list comprehension scope:

>>> [locals().setdefault(i, (elem + locals().get(i-1, 0))) for i, elem 
     in enumerate(time_interval)]
[4, 10, 22]

Here's what the locals() looks for each iteration:

>>> [[locals().setdefault(i, (elem + locals().get(i-1, 0))), locals().copy()][1] 
     for i, elem in enumerate(time_interval)]
[{'.0': <enumerate at 0x21f21f7fc80>, 'i': 0, 'elem': 4, 0: 4},
 {'.0': <enumerate at 0x21f21f7fc80>, 'i': 1, 'elem': 6, 0: 4, 1: 10},
 {'.0': <enumerate at 0x21f21f7fc80>, 'i': 2, 'elem': 12, 0: 4, 1: 10, 2: 22}]

Performance is not terrible for small lists:

>>> %timeit list(accumulate([4, 6, 12]))
387 ns ± 7.53 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

>>> %timeit np.cumsum([4, 6, 12])
5.31 µs ± 67.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

>>> %timeit [locals().setdefault(i, (e + locals().get(i-1,0))) for i,e in enumerate(time_interval)]
1.57 µs ± 12 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

And obviously falls flat for larger lists.

>>> l = list(range(1_000_000))
>>> %timeit list(accumulate(l))
95.1 ms ± 5.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> %timeit np.cumsum(l)
79.3 ms ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> %timeit np.cumsum(l).tolist()
120 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> %timeit [locals().setdefault(i, (e + locals().get(i-1, 0))) for i, e in enumerate(l)]
660 ms ± 5.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Even though the method is ugly and not practical, it sure is fun.

Stoush answered 9/8, 2020 at 18:17 Comment(0)

I think the below code is the easiest:

a=[1,1,2,1,2]
b=[a[0]]+[sum(a[0:i]) for i in range(2,len(a)+1)]

Telegony answered 11/2, 2022 at 13:44 Comment(0)

    def cumulative_sum(list):
      l = []
        for i in range(len(list)):
         new_l = sum(list[:i+1])
         l.append(new_l)
        return l

    time_interval = [4, 6, 12]
    print(cumulative_sum(time_interval)

Maybe a more beginner-friendly solution.

Condyloma answered 8/12, 2022 at 21:20 Comment(0)

So you need to make a list of cumulative sums. You can do it by using for loop and .append() method

time_interval = [4, 6, 12]

cumulative_sum = []
new_sum = 0
for i in time_interval:
    new_sum += i
    cumulative_sum.append(new_sum)
print(cumulative_sum)

or, using numpy module

import numpy

time_interval = [4, 6, 12]
c_sum = numpy.cumsum(time_interval)
print(c_sum.tolist())

Beeswax answered 3/1, 2023 at 9:21 Comment(0)

-3

This would be Haskell-style:

def wrand(vtlg):

    def helpf(lalt,lneu): 

        if not lalt==[]:
            return helpf(lalt[1::],[lalt[0]+lneu[0]]+lneu)
        else:
            lneu.reverse()
            return lneu[1:]        

    return helpf(vtlg,[0])

Cosmogony answered 14/5, 2016 at 4:34 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags