I would like to call df.drop_duplicates() on a subset of columns, but skip rows where a column has a specific value.
For example:
            v1 v2   v3
ID
148  8751704.0  G  dog
123  9082007.0  G  dog
123  9082007.0  G  dog
123  9082007.0  G  cat
I would like to drop duplicates on [ID, v1], but ignore rows where v3 is equal to cat, so something like this:
full_df.drop_duplicates([ID, v1], inplace=True, conditional=exclude v3 = cat)
Hope that makes sense
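
To make the intent concrete, here is a minimal sketch of one way this could be done with boolean masks, since drop_duplicates() has no conditional argument. It assumes that ID is the index of the frame (as the printed layout suggests), that "ignore" means rows with v3 == "cat" are always kept and never count as the first occurrence for other rows, and that the sample frame below (rebuilt by hand from the example) matches the real data. The names flat, is_cat, dup_non_cat, keep and result are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the example above; ID is assumed to be the index.
full_df = pd.DataFrame(
    {
        "v1": [8751704.0, 9082007.0, 9082007.0, 9082007.0],
        "v2": ["G", "G", "G", "G"],
        "v3": ["dog", "dog", "dog", "cat"],
    },
    index=pd.Index([148, 123, 123, 123], name="ID"),
)

# Move ID out of the index so it can be used in `subset`
# and so every row gets a unique positional label.
flat = full_df.reset_index()

# Rows that must never be dropped.
is_cat = flat["v3"].eq("cat")

# Flag duplicates of [ID, v1] among the non-cat rows only.
dup_non_cat = flat[~is_cat].duplicated(subset=["ID", "v1"], keep="first")

# Keep every cat row plus the first occurrence of each non-cat [ID, v1] pair.
keep = is_cat | ~dup_non_cat.reindex(flat.index, fill_value=False)

result = flat[keep].set_index("ID")
print(result)
```

On the sample data this keeps the 148 row, the first 123/dog row and the 123/cat row. If ID is an ordinary column rather than the index, the reset_index() and set_index("ID") steps can be dropped.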
KeyError: ('ID', 'v1')
– Davao