How to use Pandas stylers for coloring an entire row based on a given column?

Asked 24/4, 2017 at 19:50 Answered 19/5, 2022 at 10:34

I've been trying to print out a Pandas dataframe to html and have specific entire rows highlighted if the value of one specific column's value for that row is over a threshold. I've looked through the Pandas Styler Slicing and tried to vary the highlight_max function for such a use, but seem to be failing miserably; if I try, say, to replace the is_max with a check for whether a given row's value is above said threshold (e.g., something like

is_x = df['column_name'] >= threshold

), it isn't apparent how to properly pass such a thing or what to return.

I've also tried to simply define it elsewhere using df.loc, but that hasn't worked too well either.

Another concern also came up: If I drop that column (currently the criterion) afterwards, will the styling still hold? I am wondering if a df.loc would prevent such a thing from being a problem.

Thousand answered 24/4, 2017 at 19:50 Comment(1)

Alright, I think I have your solution for highlighting an entire row based on one or more columns exceeding a value. – Manservant 25/4, 2017 at 3:25

This solution allows for you to pass a column label or a list of column labels to highlight the entire row if that value in the column(s) exceeds the threshold.

import pandas as pd
import numpy as np

np.random.seed(24)
df = pd.DataFrame({'A': np.linspace(1, 10, 10)})

df = pd.concat([df, pd.DataFrame(np.random.randn(10, 4), columns=list('BCDE'))],
               axis=1)
df.iloc[0, 2] = np.nan

def highlight_greaterthan(s, threshold, column):
    is_max = pd.Series(data=False, index=s.index)
    is_max[column] = s.loc[column] >= threshold
    return ['background-color: yellow' if is_max.any() else '' for v in is_max]


df.style.apply(highlight_greaterthan, threshold=1.0, column=['C', 'B'], axis=1)

Output:

Or for one column

df.style.apply(highlight_greaterthan, threshold=1.0, column='E', axis=1)

Manservant answered 25/4, 2017 at 3:23 Comment(4)

use instead ['background-color: yellow' if is_max.any() else ''] * len(is_max) to evaluate the is_max.any() condition only once. – Glace 28/3, 2022 at 17:27

returning 'background-color: default' in case no change is needed is a more verbose option than returning '' – Sharilyn 2/2, 2023 at 20:24

Using this solution, will the colors be saved in the .xlsx file when you save the DataFrame using to_excel(...)? docs – Bootblack 13/2, 2023 at 12:15

@MarteValerioFalcone Easy enough to test. Yes, it did save the colors to excel. – Manservant 16/2, 2023 at 14:10

Here is a simpler approach:

Assume you have a 100 x 10 dataframe, df. Also assume you want to highlight all the rows corresponding to a column, say "duration", greater than 5.

You first need to define a function that highlights the cells. The real trick is that you need to return a row, not a single cell. For example:

def highlight(s):
    if s.duration > 5:
        return ['background-color: yellow'] * len(s)
    else:
        return ['background-color: white'] * len(s)

**Note that the return part should be a list of 10 (corresponding to the number of columns). This is the key part.

Now you can apply this to the dataframe style as:
```
df.style.apply(highlight, axis=1)
```

Vitkun answered 17/1, 2018 at 17:7 Comment(1)

Instead of hard coding 10, you could just use len(s) to account for any width of dataframe – Arleenarlen 27/9, 2019 at 22:36

Assume you have the following dataframe and you want to highlight the rows where id is greater than 3 to red

   id char       date
0   0    s 2022-01-01
1   1    t 2022-02-01
2   2    y 2022-03-01
3   3    l 2022-04-01
4   4    e 2022-05-01
5   5    r 2022-06-01

You can try Styler.set_properties with pandas.IndexSlice

# Subset your original dataframe with condition
df_ = df[df['id'].gt(3)]

# Pass the subset dataframe index and column to pd.IndexSlice
slice_ = pd.IndexSlice[df_.index, df_.columns]

s = df.style.set_properties(**{'background-color': 'red'}, subset=slice_)
s.to_html('test.html')

You can also try Styler.apply with axis=None which passes the whole dataframe.

def styler(df):
    color = 'background-color: {}'.format
    mask = pd.concat([df['id'].gt(3)] * df.shape[1], axis=1)
    style = np.where(mask, color('red'), color('green'))
    return style

s = df.style.apply(styler, axis=None)

Midterm answered 19/5, 2022 at 10:34 Comment(1)

You can use pd.IndexSlice[df_.index, :] instead of pd.IndexSlice[df_.index, df_.columns]. – Ungainly 20/1, 2023 at 9:25

Recommended topics

Hot tags