Count NaNs when using value_counts() on a dataframe
T

2

25

I want to count the number of occurrences of each pair of values across two columns of a DataFrame:

No  Name
1   A
1   A
5   T
9   V
NaN M
5   T
1   A

I expected df[["No", "Name"]].value_counts() to give:

No  Name Count
1   A     3
5   T     2
9   V     1
NaN M     1

But I am missing the row containing NaN.

Is there a way to include NaNs in value_counts()?

Trometer answered 27/6, 2021 at 19:54 Comment(0)
H
10

You can use groupby with dropna=False:

df.groupby(['No', 'Name'], dropna=False, as_index=False).size()

Output:

    No Name  size
0  1.0    A     3
1  5.0    T     2
2  9.0    V     1
3  NaN    M     1
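
For anyone who wants to reproduce this, here is a minimal sketch. The DataFrame is rebuilt by hand from the table in the question, so the construction is an assumption about the original data (in particular, that "No" is a numeric column holding a real NaN). As far as I know, the dropna argument of groupby needs pandas 1.1.0 or later.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (an assumption about the original frame)
df = pd.DataFrame({
    "No": [1, 1, 5, 9, np.nan, 5, 1],
    "Name": ["A", "A", "T", "V", "M", "T", "A"],
})

# dropna=False keeps groups whose key contains NaN;
# as_index=False returns a flat DataFrame with a "size" column
counts = df.groupby(["No", "Name"], dropna=False, as_index=False).size()
print(counts)
```

Because of the NaN, the "No" column is stored as float, which is why the keys print as 1.0, 5.0 and 9.0.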

P.S. Interestingly enough, the pd.Series.value_counts method also supports the dropna argument, but at the time of writing the pd.DataFrame.value_counts method did not.


Update: As pointed out in the other answer, DataFrame.value_counts now also supports dropna=False. This was introduced in pandas v1.3.0, which was released after my original answer was posted.

Hewitt answered 27/6, 2021 at 19:57 Comment(4)
However, this doesn't give me a DataFrame. I want a DataFrame with three columns, as mentioned above. - Trometer
@Trometer You're right, sorry, just add as_index=False. I've updated my answer. - Hewitt
@Trometer Great! Could you please accept the answer then? - Hewitt
Of course! I have to wait about 2 minutes before I can do that. I think there is a waiting period before an answer can be accepted. - Trometer
J
38

You can still use value_counts() but with dropna=False rather than True (the default value), as follows:

df[["No", "Name"]].value_counts(dropna=False)

The result is a Series indexed by (No, Name), and it now includes the NaN row:

No   Name
1.0  A       3
5.0  T       2
9.0  V       1
NaN  M       1
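
If you need the three-column DataFrame from the question rather than a Series, something like the following sketch should work. The sample data is rebuilt by hand from the question, and the column name "size" is just an illustrative choice; dropna on DataFrame.value_counts requires pandas 1.3.0 or later.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (an assumption about the original frame)
df = pd.DataFrame({
    "No": [1, 1, 5, 9, np.nan, 5, 1],
    "Name": ["A", "A", "T", "V", "M", "T", "A"],
})

# Counts per (No, Name) pair, sorted by count in descending order, NaN kept
vc = df[["No", "Name"]].value_counts(dropna=False)

# Move the index levels back into columns; "size" is an arbitrary column name
result = vc.reset_index(name="size")
print(result)
```

Unlike the groupby approach, value_counts sorts by count rather than by key, so the row order can differ when the data changes.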
Jaggery answered 28/5, 2022 at 14:56 Comment(0)
