This is an old topic, but since the previous answers didn't fully address the original question, I'll answer this part specifically:
The problem is x axis labels are too crowded and are not readable so I don't want them all to print, only some of them for better
readability.
I have tried multiple options with xticks and xticklabel functions but
none of them seem to work.
Since sns.boxplot returns a matplotlib Axes object, there are two ways to set the labels:
- either using Axes.set(xticks=..., xticklabels=...), which in this example would be box_plot.set(xticks=..., xticklabels=...),
- or using Axes.set_xticks(...) and Axes.set_xticklabels(...), which in this example would be box_plot.set_xticks(...) and box_plot.set_xticklabels(...).
Both solutions work, provided they are given the correct parameters: usually a list of integers or floats for the tick positions (which are also the label positions) and a list of the labels to display.
- Using xticks, you can choose the positions at which labels are drawn, and therefore which labels are shown.
- xticklabels only changes the label text, not the positions (!), hence it should only be used after xticks.
See matplotlib.axes.Axes.set_xticklabels and matplotlib.axes.Axes.set_xticks for more details and examples on how to use them.
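As a minimal sketch of the two approaches, here is a plain matplotlib example. The category names and the step of 3 are made up for illustration and are not taken from the question's data; the Agg backend is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Hypothetical category names, one per position 0..9 (illustration only).
names = [f"s{i}" for i in range(10)]

fig, (ax1, ax2) = plt.subplots(1, 2)
for ax in (ax1, ax2):
    ax.plot(range(len(names)), range(len(names)))

ticks = list(range(0, len(names), 3))  # positions 0, 3, 6, 9
labels = [names[i] for i in ticks]     # the matching labels

# Way 1: a single call to Axes.set with keyword arguments.
ax1.set(xticks=ticks, xticklabels=labels)

# Way 2: the dedicated setters, positions first, then labels.
ax2.set_xticks(ticks)
ax2.set_xticklabels(labels)

fig.canvas.draw()
shown1 = [t.get_text() for t in ax1.get_xticklabels()]
shown2 = [t.get_text() for t in ax2.get_xticklabels()]
print(shown1)
print(shown2)
```

Both axes should end up with the same four labels; which form to use is mostly a matter of taste.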
If the labels are floats (which may be the case for a score), using round can shorten the labels and make the plot clearer.
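For instance, with some made-up float scores (hypothetical values, not from the question's data), rounding or formatting could look like this:

```python
# Hypothetical float scores, for illustration only.
scores = [701.23456789, 705.5, 710.0]

# round() keeps at most the requested number of decimals and returns floats.
rounded = [round(s, 2) for s in scores]

# An f-string gives strings with a fixed number of decimals instead.
formatted = [f"{s:.1f}" for s in scores]

print(rounded)    # [701.23, 705.5, 710.0]
print(formatted)  # ['701.2', '705.5', '710.0']
```

Either list can then be passed as the tick labels.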
Here is an example of how I would write the code (based on serge's answer) to print one label every 5 values:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.read_csv('your_data.csv', index_col=0)

# keep the Axes returned by sns.boxplot; Axes.set() does not return the Axes,
# so it must not be chained onto the assignment
box_plot = sns.boxplot(
    x='Score',
    y='Interest.Rate',
    data=data
)
box_plot.set(
    xlabel='FICO Score',
    ylabel='Interest Rate'
)

# select one label every 5 labels
step = 5
# one box is drawn per distinct score, at positions 0, 1, 2, ...
# (this assumes seaborn orders numeric categories in ascending order)
scores = np.sort(data['Score'].unique())
# select which label to plot
labels = [round(score, 4) for i, score in enumerate(scores) if i % step == 0]
# select the position of the labels
ticks = np.arange(0, len(scores), step)  # as many ticks as there are labels
# apply this setting (the labels argument of set_xticks needs matplotlib >= 3.5)...
box_plot.set_xticks(ticks, labels)
# ... and plot the result
plt.show()

- round(score, 4) prevents scores from being shown with more than 4 digits after the decimal point.
- if i % step == 0 selects a value only if i is a multiple of step.
- np.arange(0, len(scores), step) returns an array of integers starting at 0, increasing by step at every index, with the last value smaller than len(scores). This seems like the easiest way to get the positions of the ticks, but any other list-like object of integers or floats would also be fine, as long as it has exactly len(labels) items.
I chose box_plot.set_xticks because the ticks and labels variables need a few small operations first. It's more of a personal choice here.
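The selection of positions and labels can also be isolated in a small helper, which makes it easy to check without plotting anything. This is only a sketch: thin_ticks is a hypothetical name, not a matplotlib or seaborn function, and it assumes the categories are given in the order they appear on the axis, so that category k sits at position k.

```python
def thin_ticks(categories, step, ndigits=4):
    """Return (positions, labels), keeping one category out of every `step`.

    Hypothetical helper: `categories` is assumed to be the ordered list of
    values shown on the categorical axis, so category k sits at position k.
    """
    positions = list(range(0, len(categories), step))
    labels = [round(categories[p], ndigits) for p in positions]
    return positions, labels


# Made-up scores, for illustration only.
scores = [640, 645, 650, 655, 660, 665, 670, 675, 680, 685, 690, 695]
positions, labels = thin_ticks(scores, step=5)
print(positions)  # [0, 5, 10]
print(labels)     # [640, 665, 690]
```

The two returned lists always have the same length, which is what set_xticks and set_xticklabels expect.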
NB: if there are problems with the positions of the ticks, the following question may help: How to properly use matplotlib's set_xticks? (or another answer to a question about Axes.set_xticks)
AttributeError: Unknown property ylabel
– Deidradeidre