I have a pandas DataFrame that I want to downsize (remove all rows whose value is under some threshold x).
The mask is mask = df[my_column] > 50
I would typically just use df = df[mask], but I want to avoid making a copy every time, particularly because this gets error-prone inside functions (the reassignment only takes effect within the function's scope).
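To illustrate the scoping problem, here is a minimal sketch. The helper name keep_large, the column name "value", and the sample numbers are made up for the example; only the df = df[mask] pattern comes from my actual code.

```python
import pandas as pd

def keep_large(df, column, threshold):
    # Hypothetical helper: this assignment rebinds the local name `df` only,
    # so the DataFrame owned by the caller is left untouched.
    df = df[df[column] > threshold]
    return df

original = pd.DataFrame({"value": [10, 60, 40, 90]})
filtered = keep_large(original, "value", 50)

rows_in_original = len(original)  # the caller's frame still has every row
rows_in_filtered = len(filtered)  # only the returned frame is smaller
```

Unless the caller remembers to capture the return value, the filtering is silently lost.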
What is the best way to subset a dataset inplace?
I was thinking of something along the lines of
df.drop(df.loc[~mask].index, inplace=True)
Is there a better way to do this, or any situation where this won't work at all?
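For reference, a sketch of how I understand the drop-based approach would behave, assuming mask marks the rows to keep (so the inverted mask selects the rows to drop). The helper name drop_below_inplace and the sample frames are hypothetical. One case where it seems to go wrong is a non-unique index, because drop removes rows by label rather than by position.

```python
import pandas as pd

def drop_below_inplace(df, column, threshold):
    # Hypothetical helper: mutates the DataFrame object passed by the caller.
    keep = df[column] > threshold
    # Drop the labels of the rows that fail the condition.
    df.drop(df.index[~keep], inplace=True)

# Unique index: only the rows at or below the threshold are removed.
unique = pd.DataFrame({"value": [10, 60, 40, 90]})
drop_below_inplace(unique, "value", 50)

# Duplicate index labels: label "a" belongs to both 10 and 60, so dropping
# "a" also removes the row with 60 that should have been kept.
dupes = pd.DataFrame({"value": [10, 60, 40, 90]}, index=["a", "a", "b", "c"])
drop_below_inplace(dupes, "value", 50)
```

As far as I can tell, inplace=True changes the object the caller sees, but pandas may still build the reduced data internally, so I would not assume it saves memory.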
view = df.loc[df[my_column] > 50]? – Marrowbone
df = df[mask]? this will eventually recover the memory for the dropped rows? – Marrowbone
mask itself is a boolean index – Marrowbone
df.drop(df.loc[~mask].index, inplace=True) seems to work, but I expect there might be a better solution (mine will probably fail on multi-level indexes, etc.) – Huffman