Pandas min() of selected row and columns
Asked Answered
Y

2

20

I am trying to create a column which contains only the minimum of the one row and a few columns, for example:

    A0      A1      A2      B0      B1      B2      C0      C1
0   0.84    0.47    0.55    0.46    0.76    0.42    0.24    0.75
1   0.43    0.47    0.93    0.39    0.58    0.83    0.35    0.39
2   0.12    0.17    0.35    0.00    0.19    0.22    0.93    0.73
3   0.95    0.56    0.84    0.74    0.52    0.51    0.28    0.03
4   0.73    0.19    0.88    0.51    0.73    0.69    0.74    0.61
5   0.18    0.46    0.62    0.84    0.68    0.17    0.02    0.53
6   0.38    0.55    0.80    0.87    0.01    0.88    0.56    0.72

Here I am trying to create a column which contains the minimum for each row of columns B0, B1, B2.

The output would look like this:

    A0      A1      A2      B0      B1      B2      C0      C1      Minimum
0   0.84    0.47    0.55    0.46    0.76    0.42    0.24    0.75    0.42
1   0.43    0.47    0.93    0.39    0.58    0.83    0.35    0.39    0.39
2   0.12    0.17    0.35    0.00    0.19    0.22    0.93    0.73    0.00
3   0.95    0.56    0.84    0.74    0.52    0.51    0.28    0.03    0.51
4   0.73    0.19    0.88    0.51    0.73    0.69    0.74    0.61    0.51
5   0.18    0.46    0.62    0.84    0.68    0.17    0.02    0.53    0.17
6   0.38    0.55    0.80    0.87    0.01    0.88    0.56    0.72    0.01

Here is part of the code, but it is not doing what I want it to do:

for i in range(0,2):
    df['Minimum'] = df.loc[0,'B'+str(i)].min()
Yves answered 25/8, 2014 at 5:40 Comment(0)
H
32

This is a one-liner, you just need to use the axis argument for min to tell it to work across the columns rather than down:

df['Minimum'] = df.loc[:, ['B0', 'B1', 'B2']].min(axis=1)

If you need to use this solution for different numbers of columns, you can use a for loop or list comprehension to construct the list of columns:

n_columns = 2
cols_to_use = ['B' + str(i) for i in range(n_columns)]
df['Minimum'] = df.loc[:, cols_to_use].min(axis=1)
Homans answered 25/8, 2014 at 6:14 Comment(6)
Thanks. But what if for different cases the number of columns change, which is why I have a for loop, where actually, I am passing a variable in the range, like so: for i in range(0,total): I tried what you suggested, but it gives me all zeros.Yves
I tried it with the for loop, but it returns the last element from the row/column, and not the minimum.Yves
@Yves I tried it with your example DataFrame... and it worked!Magner
@yash.trojan.25: See my edit for how to use this with variable numbers of columns. I'm not sure how you were getting "all zeros" but I did test this solution with your example data and got your expected output.Homans
Awesome! That worked perfectly! Exactly what I wanted. Thanks.Yves
@Homans can you explain how the : slice works here. It allows for using keys as without the keys aren't found in the index. Any pointer to documentation appreciated.Insufferable
I
3

For my tasks a universal and flexible approach is the following example:

df['Minimum'] = df[['B0', 'B1', 'B2']].apply(lambda x: min(x[0],x[1],x[2]), axis=1)

The target column 'Minimum' is assigned the result of the lambda function based on the selected DF columns['B0', 'B1', 'B2']. Access elements in a function through the function alias and his new Index(if count of elements is more then one). Be sure to specify axis=1, which indicates line-by-line calculations. This is very convenient when you need to make complex calculations. However, I assume that such a solution may be inferior in speed.

As for the selection of columns, in addition to the 'for' method, I can suggest using a filter like this:

calls_to_use = list(filter(lambda f:'B' in f, df.columns))

literally, a filter is applied to the list of DF columns through a lambda function that checks for the occurrence of the letter 'B'.

after that the first example can be written as follows:

calls_to_use = list(filter(lambda f:'B' in f, df.columns))    
df['Minimum'] = df[calls_to_use].apply(lambda x: min(x), axis=1)

although after pre-selecting the columns, it would be preferable:

df['Minimum'] = df[calls_to_use].min(axis=1)
Inshrine answered 21/3, 2019 at 10:42 Comment(1)
While this code may solve the question, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion.Virgilio

© 2022 - 2024 — McMap. All rights reserved.