Pandas Value Counts With Constraint For More Than One Occurrence

I am working with the Wine Review Data from Kaggle. I am able to return the number of occurrences of each variety using value_counts().

However, I am trying to find a quick way to limit the results to varieties and their counts where there is more than one occurrence.

I tried df.loc[df['variety'].value_counts()>1].value_counts() and df['variety'].loc[df['variety'].value_counts()>1].value_counts(), but both return errors.

The results can be turned into a DataFrame and the constraint added there, but something tells me there is a much more elegant way to achieve this.
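For reference, here is a minimal sketch of the two-step route, assuming a small made-up frame in place of the Kaggle data (the rows below are hypothetical). The attempts above most likely fail because the Series returned by value_counts() is indexed by variety name, not by row label, so a boolean mask built from it cannot be aligned with the rows of df. Applying the mask to the counts themselves avoids that mismatch.

```python
import pandas as pd

# Hypothetical stand-in for the wine review data; only the column name
# 'variety' comes from the question.
df = pd.DataFrame(
    {"variety": ["Pinot Noir", "Riesling", "Pinot Noir",
                 "Malbec", "Riesling", "Pinot Noir"]}
)

# Step 1: count occurrences. The result is a Series indexed by variety.
counts = df["variety"].value_counts()

# Step 2: filter the counts with a mask that shares the same index.
repeated = counts[counts > 1]

print(repeated)
```

The intermediate variable is what the one-liner with a callable removes; the result should be the same either way.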

Pilpul answered 9/5, 2018 at 16:50 Comment(2)
Try df['variety'].value_counts().loc[lambda x : x>1] - Weitzel
@Wen That did the trick. Do you have a link to a resource on using lambda in this way? Or, more generally: can you use a lambda expression with loc as a constraint on the results any time you are using an aggregate function? - Pilpul

@Wen answered this in the comments.

df['variety'].value_counts().loc[lambda x : x>1] 
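
A short sketch of why this works, using made-up data (the values and the 'points' numbers below are hypothetical). When .loc is given a callable, pandas calls it with the object being indexed and uses whatever it returns as the indexer, so the lambda receives the counts Series and returns a boolean mask over it. That also suggests the pattern is not specific to value_counts(): it should work on any Series or DataFrame produced earlier in a method chain, including the result of a groupby aggregation.

```python
import pandas as pd

varieties = pd.Series(
    ["Pinot Noir", "Riesling", "Pinot Noir", "Malbec", "Riesling", "Pinot Noir"],
    name="variety",
)

counts = varieties.value_counts()

# The callable receives `counts` itself, so there is no need to name
# the intermediate result before filtering it.
chained = counts.loc[lambda x: x > 1]

# Equivalent explicit form with a precomputed boolean mask.
explicit = counts.loc[counts > 1]
assert chained.equals(explicit)

# The same idea applied after a groupby aggregation.
df = pd.DataFrame({"variety": varieties, "points": [90, 88, 92, 85, 87, 91]})
mean_points = df.groupby("variety")["points"].mean().loc[lambda s: s > 88]
```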
Pilpul answered 9/5, 2018 at 18:23 Comment(3)
This answer could benefit from a little bit of explanation. - Gravedigger
I think this answers the question. The author was looking for a quick one-liner to filter out the unique values and see only varieties that occur more than once. This does the trick nicely! Thanks - Anemone
How do you group this set and create a value like "others"? - Lovellalovelock
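
On the "others" follow-up: one possible reading is that varieties occurring only once should be lumped under a single label before counting. The sketch below assumes that reading; the data and the label "Other" are made up for illustration.

```python
import pandas as pd

varieties = pd.Series(
    ["Pinot Noir", "Riesling", "Pinot Noir", "Malbec",
     "Riesling", "Pinot Noir", "Gamay"],
    name="variety",
)

counts = varieties.value_counts()

# Look up each row's variety count, keep the original label where the
# count is above 1, and replace everything else with "Other".
lumped = varieties.where(varieties.map(counts) > 1, "Other")

lumped_counts = lumped.value_counts()
print(lumped_counts)
```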
