I have a DataFrame with a column that has some bad data with various negative values. I would like to replace values < 0 with the mean of the group that they are in.
For missing values (NaN), I would do:
data = df.groupby(['GroupID']).column
data.transform(lambda x: x.fillna(x.mean()))
But how can I do this operation on a condition like x < 0?
Thanks!
.transform(lambda x: x.where(x>=0).fillna(x[x>=0].mean()))
but didn't like the repetition of the condition. Your approach bypasses that nicely. The pattern seems common enough that I wonder if pandas should grow a built-in way to support it. – Flashcube
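
One way to write the where/fillna pattern from this thread with the condition stated only once is sketched below. It is a minimal illustration, not necessarily the approach the comment refers to. The sample data and the `column_fixed` name are made up for the example; `GroupID` and `column` follow the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "GroupID": ["a", "a", "a", "b", "b", "b"],
    "column": [1.0, -5.0, 3.0, 10.0, 20.0, -1.0],
})

# Turn negative values into NaN; this is the only place the condition appears.
masked = df["column"].where(df["column"] >= 0)

# mean skips NaN by default, so each group mean uses only the valid values.
group_means = masked.groupby(df["GroupID"]).transform("mean")

# Fill the masked positions with the mean of their own group.
df["column_fixed"] = masked.fillna(group_means)
print(df)
```

Note that any NaN already present in the column is filled with the group mean as well, since it is indistinguishable from a masked negative at the fillna step.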