I have a data frame with categories and values. I need to find the value in each category closest to a value. I think I'm close but I can't really get the right output when applying the results of argsort to the original dataframe.
For example, if the input was defined in the code below the output should have only (a, 1, True)
, (b, 2, True)
, (c, 2, True)
and all other isClosest Values
should be False.
If multiple values are closest then it should be the first value listed marked.
Here is the code I have which works but I can't get it to reapply to the dataframe correctly. I would love some pointers.
df = pd.DataFrame()
df['category'] = ['a', 'b', 'b', 'b', 'c', 'a', 'b', 'c', 'c', 'a']
df['values'] = [1, 2, 3, 4, 5, 4, 3, 2, 1, 0]
df['isClosest'] = False
uniqueCategories = df['category'].unique()
for c in uniqueCategories:
filteredCategories = df[df['category']==c]
sortargs = (filteredCategories['value']-2.0).abs().argsort()
#how to use sortargs so that we set column in df isClosest=True if its the closest value in each category to 2.0?