Calculate cumulative sum from last non-zero entry in python
D

2

5

I have a numeric series like [0,0,0,0,1,1,1,0,0,1,1,0]. I would like to calculate the cumulative sum since the last zero entry, i.e. the cumsum is reset to zero whenever a zero entry occurs.

input: [0,0,0,0,1,1,1,0,0,1,1,0]
output: [0,0,0,0,1,2,3,0,0,1,2,0]

Is there a built-in Python function that can do this? Or a better way to calculate it without a loop?

Discuss answered 12/6, 2019 at 22:10 Comment(2)
Can you use numpy? - Vondavonni
Yes. Do numpy and pandas have any related functions? - Discuss
C
9

You can do it with itertools.accumulate. Besides the iterable passed as the first argument, it accepts an optional second argument: a two-argument function whose first argument is the accumulated result so far and whose second argument is the current value from the iterable. A fairly simple lambda in that position keeps a running total and resets it to zero whenever the current value is zero.

from itertools import accumulate

nums = [0,0,0,0,1,1,1,0,0,1,1,0]

result = accumulate(nums, lambda acc, n: acc + n if n else 0)
print(list(result))
# [0, 0, 0, 0, 1, 2, 3, 0, 0, 1, 2, 0]
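For readers less familiar with accumulate, the lambda above behaves roughly like the explicit loop below. This is only an illustrative sketch: the name running_sum_with_reset is made up for this example and is not part of itertools.

```python
from itertools import accumulate

def running_sum_with_reset(values):
    """Running total that drops back to 0 whenever a 0 is seen.

    Hypothetical helper for illustration only; not part of itertools.
    """
    total = 0
    out = []
    for n in values:
        # Same rule as the lambda: add non-zero values, reset on zero.
        total = total + n if n else 0
        out.append(total)
    return out

nums = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
loop_result = running_sum_with_reset(nums)
print(loop_result)

# The loop and the accumulate one-liner should agree on this input.
assert loop_result == list(accumulate(nums, lambda acc, n: acc + n if n else 0))
```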
Centum answered 12/6, 2019 at 22:18 Comment(2)
Thanks for your quick response. Perfect! By the way, can the accumulate function be used on a dataframe? Say I have a matrix where each column is a numeric series, and I would like to calculate the cumsum for each column. - Discuss
@Discuss Sure, itertools.accumulate will accept any iterable, so there are lots of ways to use it with a dataframe, series, etc. - Centum
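As a rough sketch of the column-wise case asked about in the comments, assuming the matrix is a plain list of rows: transpose with zip, run accumulate on each column, then transpose back. The names cumsum_reset_columns and matrix are made up for this example.

```python
from itertools import accumulate

def cumsum_reset_columns(rows):
    """Apply the resetting running sum to every column of a list of rows.

    Hypothetical helper for illustration only.
    """
    columns = zip(*rows)  # transpose: iterate over columns instead of rows
    summed = [
        list(accumulate(col, lambda acc, n: acc + n if n else 0))
        for col in columns
    ]
    # Transpose back so the result has the same layout as the input.
    return [list(row) for row in zip(*summed)]

matrix = [
    [0, 1],
    [1, 1],
    [1, 0],
    [0, 1],
    [1, 1],
]
columns_result = cumsum_reset_columns(matrix)
print(columns_result)
```

Each column of a dataframe is itself an iterable, so the same lambda can be fed one column at a time in a similar way.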
L
1

We can do this in numpy with two passes of np.cumsum. First we calculate the cumsum of the array:

import numpy as np

a = np.array([0,0,0,0,1,1,1,0,0,1,1,0])
c = np.cumsum(a)

This gives us:

>>> c
array([0, 0, 0, 0, 1, 2, 3, 3, 3, 4, 5, 5])

Next we take the values of c at the positions where a is 0, prepend a 0, and calculate the difference between each of those values and its predecessor:

corr = np.diff(np.hstack(((0,), c[a == 0])))

This is the correction we need to apply to those elements:

>>> corr
array([0, 0, 0, 0, 3, 0, 2])
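To make the correction step easier to follow, this small sketch prints the intermediate arrays for the example input. The variable names at_zeros and padded are introduced here only for illustration.

```python
import numpy as np

a = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0])
c = np.cumsum(a)

# Running total observed at each position where a is 0.
at_zeros = c[a == 0]
# Prepend a 0 so the first zero position is compared against "nothing summed yet".
padded = np.hstack(((0,), at_zeros))
# Amount accumulated since the previous zero position; this is what must be undone.
corr = np.diff(padded)

print(at_zeros)
print(padded)
print(corr)
```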

We can then make a copy of a (or do this in place) and subtract the correction at the positions where a is 0:

a2 = a.copy()
a2[a == 0] -= corr

This gives us:

>>> a2
array([ 0,  0,  0,  0,  1,  1,  1, -3,  0,  1,  1, -2])

Now we can calculate the cumulative sum of a2, which resets to 0 at every 0, since each correction cancels exactly what was accumulated since the previous zero:

>>> a2.cumsum()
array([0, 0, 0, 0, 1, 2, 3, 0, 0, 1, 2, 0])

Or as a function:

import numpy as np

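def cumsumreset(iterable, reset=0):
    a = np.array(iterable)
    c = a.cumsum()
    a2 = a.copy()
    # positions at which the running sum must drop back to zero
    mask = a == reset  # named mask to avoid shadowing the built-in filter
    # subtract what was accumulated since the previous reset position
    a2[mask] -= np.diff(np.hstack(((0,), c[mask])))
    return a2.cumsum()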

This then gives us:

>>> cumsumreset([0,0,0,0,1,1,1,0,0,1,1,0])
array([0, 0, 0, 0, 1, 2, 3, 0, 0, 1, 2, 0])
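As a quick sanity check, the numpy function can be compared against the itertools.accumulate one-liner from the other answer on random input. This is only a sketch: the test data, the seed, and the all_match flag are arbitrary choices made for this example, and the function is repeated so the snippet runs on its own.

```python
import random
from itertools import accumulate

import numpy as np

def cumsumreset(iterable, reset=0):
    # Same approach as the function above, repeated to keep this snippet self-contained.
    a = np.array(iterable)
    c = a.cumsum()
    a2 = a.copy()
    mask = a == reset
    a2[mask] -= np.diff(np.hstack(((0,), c[mask])))
    return a2.cumsum()

random.seed(0)  # arbitrary seed, only to make the check repeatable
for _ in range(100):
    # Small non-negative integers with plenty of zeros to trigger resets.
    data = [random.choice([0, 0, 1, 2, 3]) for _ in range(random.randint(1, 30))]
    expected = list(accumulate(data, lambda acc, n: acc + n if n else 0))
    assert cumsumreset(data).tolist() == expected

all_match = True  # reached only if every comparison above passed
print(all_match)
```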
Lesleelesley answered 12/6, 2019 at 23:6 Comment(0)
