Pandas groupby agg - how to get counts?
C

4

10

I am trying to get the sum, mean and count of a metric:

df.groupby(['id', 'pushid']).agg({"sess_length": [ np.sum, np.mean, np.count]})

But I get "module 'numpy' has no attribute 'count'". I have tried different ways of expressing the count function but can't get it to work. How do I get an aggregate record count together with the other metrics?

Constraint answered 9/4, 2019 at 18:30 Comment(3)
Do you just want len? I'm not sure what you mean by different ways of expressing the count function; numpy certainly doesn't have np.count, as you've seen. What is this function expected to do? - Douglasdouglashome
You can use np.size. - Simoneaux
@Simoneaux size will count NaN as a row; count will exclude NaN. - Hammerhead
T
11

You can use strings instead of the functions, like so:

import pandas as pd

df = pd.DataFrame(
    {"id": list("ccdef"), "pushid": list("aabbc"),
     "sess_length": [10, 20, 30, 40, 50]}
)

# String names select pandas' built-in aggregations; "count" skips NaN values.
df.groupby(["id", "pushid"]).agg({"sess_length": ["sum", "mean", "count"]})

Which outputs:

           sess_length
                   sum mean count
 id pushid
 c  a               30   15     2
 d  b               30   30     1
 e  b               40   40     1
 f  c               50   50     1
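
The result above has two-level column labels. If flat column names are easier to work with, named aggregation may be an option. This is only a sketch, assuming pandas 0.25 or later; the output names total_length, mean_length and n_rows are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {"id": list("ccdef"), "pushid": list("aabbc"),
     "sess_length": [10, 20, 30, 40, 50]}
)

# Named aggregation: each keyword becomes an output column,
# defined as (source column, aggregation name).
# The keyword names here are hypothetical, chosen for this example.
result = df.groupby(["id", "pushid"]).agg(
    total_length=("sess_length", "sum"),
    mean_length=("sess_length", "mean"),
    n_rows=("sess_length", "count"),
)

print(result)
```

Individual values can then be read with a plain column name, for example result.loc[("c", "a"), "n_rows"].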
Tiddlywinks answered 9/4, 2019 at 18:46 Comment(0)
E
3

Just use np.size in place of np.count.

Not sure why the answer needs to be 30 chars long, when the answer is straightforward
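
As the comment under the question points out, size and count differ when the column contains missing values. A small sketch of that difference, using made-up data (not the asker's) and the string names "size" and "count" rather than the numpy functions:

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "a" has one missing sess_length value.
df = pd.DataFrame(
    {"id": ["a", "a", "b"],
     "sess_length": [10.0, np.nan, 30.0]}
)

# "size" counts every row in the group; "count" counts only non-NaN values.
result = df.groupby("id")["sess_length"].agg(["size", "count"])

print(result)
```

Which one is appropriate depends on whether rows with a missing metric should be included in the record count.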

Extreme answered 23/3, 2023 at 12:59 Comment(1)
I was looking for exactly that. np.count_nonzero didn't look suitable. - Atropos
F
1

This might work:

df.groupby(['id', 'pushid']).agg({"sess_length": [np.sum, np.mean, np.size]})
Funderburk answered 28/10, 2020 at 18:50 Comment(1)
Is there a benefit to this syntax over the use of ['sum', 'mean', 'count'], as described in the accepted answer from last year? If so, it'd be useful to edit your answer to include that. - Dwyer
D
0

I think you mean:

df.groupby(['id', 'pushid']).agg({"sess_length": [ 'sum', 'count','mean']})

As mentioned in the pandas documentation, you can pass string arguments such as 'sum' and 'count'. To be honest, this is the preferable way of doing these aggregations.

Dogcatcher answered 9/4, 2019 at 18:46 Comment(0)
