Changing X axis labels in seaborn boxplot
D

3

19

I have a pandas DataFrame with multiple columns. I am trying to plot the column "Score" (on the x axis) against another column called "interest rate". I am using the following commands:

box_plot=sns.boxplot(x=list(Dataframe['Score']),y=list(Dataframe['Interest.Rate']),data=Dataframe)
box_plot.set(xlabel='FICO Score',ylabel='Interest Rate')

This works fine and creates a boxplot with appropriate axes. It seems I have to pass the variables as lists to the boxplot function. Maybe there is a better way to do it.

The problem is that the x axis labels are too crowded to be readable, so I don't want them all to print, only some of them, for better readability.

I have tried multiple options with the xticks and xticklabels functions, but none of them seem to work.

Dashboard answered 9/5, 2016 at 6:25 Comment(0)
C
24

You could simply do this:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('your_data.csv', index_col=0)

sns.boxplot(
    x='Score', 
    y='Interest.Rate', 
    data=data
).set(
    xlabel='FICO Score', 
    ylabel='Interest Rate'
)
plt.show()
Cornelie answered 16/3, 2018 at 9:10 Comment(2)
Doesn't work: AttributeError: Unknown property ylabel (Deidradeidre)
It does work, the xlabel command should be in the set API. (Ssr)
K
1

Try it this way:

box_plot = sns.boxplot(x='Score', y='Interest.Rate', data=Dataframe)

instead of converting the pandas Series to lists.

If you need help with the x axis, please post a sample data set that helps to reproduce your problem.

Kathe answered 9/5, 2016 at 11:52 Comment(0)
R
0

This is an old topic, but since the previous answers didn't fully address the original question, I'll answer this part specifically:

The problem is that the x axis labels are too crowded to be readable, so I don't want them all to print, only some of them, for better readability.

I have tried multiple options with the xticks and xticklabels functions, but none of them seem to work.

Since sns.boxplot returns a matplotlib Axes object, there are 2 ways to set the labels:

  • either using Axes.set(xticks=..., xticklabels=...), in this example it would be box_plot.set(xticks=..., xticklabels=...),
  • or using Axes.set_xticks(...) and Axes.set_xticklabels(...), in this example it would be box_plot.set_xticks(...) and box_plot.set_xticklabels(...)

Both solutions should work, provided they are given the correct parameters. Usually that is a list of integers/floats for the positions of the ticks (and therefore of the labels) and a list of labels to display.

  • Using xticks you can choose the positions at which labels are plotted, and therefore also which labels are shown.
  • xticklabels only changes the label text and not the positions (!), hence it should only be used after xticks

See matplotlib.axes.Axes.set_xticklabels and matplotlib.axes.Axes.set_xticks for more details and examples on how to use them.
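
As a rough illustration of the two ways described above, here is a minimal sketch on made-up data (the DataFrame df, its values, and the step of 5 are assumptions for the example, not the asker's data). It relies on seaborn drawing one box per distinct x value at positions 0, 1, 2, ... in sorted order for a numeric column:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data: 20 distinct scores (600, 610, ..., 790), 5 interest rates each.
rng = np.random.default_rng(0)
scores = np.repeat(np.arange(600, 800, 10), 5)
df = pd.DataFrame({"Score": scores, "Interest.Rate": rng.normal(10, 2, scores.size)})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Keep every 5th box position and the score that belongs to it.
categories = np.sort(df["Score"].unique())
ticks = list(range(0, len(categories), 5))
labels = [str(categories[i]) for i in ticks]

# Way 1: a single call to Axes.set (xticks is passed before xticklabels)
sns.boxplot(x="Score", y="Interest.Rate", data=df, ax=ax1)
ax1.set(xticks=ticks, xticklabels=labels)

# Way 2: the dedicated setters, positions first, then labels
sns.boxplot(x="Score", y="Interest.Rate", data=df, ax=ax2)
ax2.set_xticks(ticks)
ax2.set_xticklabels(labels)

shown_1 = [t.get_text() for t in ax1.get_xticklabels()]
shown_2 = [t.get_text() for t in ax2.get_xticklabels()]
```

Both axes should end up with the same four labels. In either form, the positions are set before the labels so that the number of labels matches the number of ticks.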

If the labels are floats (which may be the case for a score, I imagine), using round may help shorten the labels and give a clearer plot.

Here is an example of how I would write the code (building on serge's answer) to print one label every 5 values:

import numpy as np
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('your_data.csv', index_col=0)

# keep the Axes returned by sns.boxplot; Axes.set() does not return the Axes
box_plot = sns.boxplot(
    x='Score',
    y='Interest.Rate',
    data=data
)
box_plot.set(
    xlabel='FICO Score',
    ylabel='Interest Rate'
)
# select one label every 5 labels
step = 5
# one box is drawn per distinct score, in sorted order
scores = np.sort(data['Score'].unique())
# select which labels to plot
labels = [round(scores[i], 4) for i in range(len(scores)) if i % step == 0]
# select the positions of the labels
ticks = np.arange(0, step * len(labels), step)  # as many ticks as there are labels
# apply this setting (positions first, then labels)...
# (on matplotlib 3.5+, box_plot.set_xticks(ticks, labels) does both in one call)
box_plot.set_xticks(ticks)
box_plot.set_xticklabels(labels)
# ... and plot the result
plt.show()

round(scores[i], 4) prevents scores from being plotted with more than 4 digits after the decimal point
if i % step == 0 selects a value only if i is a multiple of step
np.arange(0, step * len(labels), step) returns an array of integers starting at 0, increasing by step at every index, with the last value smaller than step * len(labels). This seems like the easiest way to get the positions of the ticks, but any other list-like object of integers or floats would also work, as long as it has exactly len(labels) items.
I chose box_plot.set_xticks and box_plot.set_xticklabels since there are some small operations to do on the ticks and labels variables. It's more of a personal choice here.

NB: if the ticks end up in the wrong positions, the following link may help: How to properly use matplotlib's set_xticks? (or another answer to a question related to Axes.set_xticks)

Randeerandel answered 6/10, 2022 at 12:17 Comment(0)
