You could use groupby + cumcount + mask here:
m = df.colA.isnull()
df['Sequence'] = df.groupby(m.cumsum()).cumcount().sub(1).mask(m, 0)
Or, use clip (clip_lower is deprecated in newer pandas versions) in the last step, and you don't have to pre-cache m:
df['Sequence'] = df.groupby(df.colA.isnull().cumsum()).cumcount().sub(1).clip(lower=0)
df
colA Sequence
0 NaN 0
1 True 0
2 True 1
3 True 2
4 True 3
5 NaN 0
6 True 0
7 NaN 0
8 NaN 0
9 True 0
10 True 1
11 True 2
12 True 3
13 True 4
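To see why the first approach works, here is a runnable sketch that rebuilds the sample frame from the output above (the colA values are reconstructed from that table, not taken from your original data) and walks through the intermediate steps:

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the output table above.
df = pd.DataFrame({'colA': [np.nan, True, True, True, True, np.nan,
                            True, np.nan, np.nan, True, True, True,
                            True, True]})

# m flags the NaN rows; its cumsum assigns every run starting at a NaN
# its own group id (1, 1, 1, 1, 1, 2, 2, 3, 4, 4, ...).
m = df['colA'].isnull()
groups = m.cumsum()

# cumcount numbers rows 0, 1, 2, ... within each run; sub(1) shifts the
# count so the first True after a NaN gets 0, and mask(m, 0) resets the
# NaN rows themselves (which sub(1) left at -1) back to 0.
df['Sequence'] = df.groupby(groups).cumcount().sub(1).mask(m, 0)

print(df['Sequence'].tolist())
# [0, 0, 1, 2, 3, 0, 0, 0, 0, 0, 1, 2, 3, 4]
```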
Timings
df = pd.concat([df] * 10000, ignore_index=True)
# Timing the alternatives in this answer
%%timeit
m = df.colA.isnull()
df.groupby(m.cumsum()).cumcount().sub(1).mask(m, 0)
23.3 ms ± 1.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
df.groupby(df.colA.isnull().cumsum()).cumcount().sub(1).clip(lower=0)
24.1 ms ± 1.93 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# @user2314737's solution
%%timeit
df.groupby((df['colA'] != df['colA'].shift(1)).cumsum()).cumcount()
29.8 ms ± 345 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# @jezrael's solution
%%timeit
a = df['colA'].isnull()
b = a.cumsum()
(b-b.where(~a).add(1).ffill().fillna(0).astype(int)).clip(lower=0)
11.5 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Note, your mileage may vary, depending on the data.