How to widen boxes in Seaborn boxplot?

I'm trying to make a grouped boxplot using Seaborn (Reference), and the boxes are all incredibly narrow -- too narrow to see the grouping colors.

g = seaborn.factorplot("project_code",y="num_mutations",hue="organ",
        data=grouped_donor, kind="box", aspect=3)


If I zoom in, or stretch the graphic several times the width of my screen, I can see the boxes, but obviously this isn't useful as a standard graphic.

This appears to depend on how much data I plot: with only the first 500 points (of 6000), I get visible-but-small boxes. It might specifically be a function of the high variance of my data; according to the matplotlib boxplot documentation:

The default [width] is 0.5, or 0.15x(distance between extreme positions) if that is smaller.

Regardless of the reason, there's plenty of room on the graph itself for wider boxes, if I could just widen them.

Unfortunately, the boxplot keyword widths, which controls the box width, isn't a valid factorplot keyword, and I can't find a matplotlib function that will change the width of a bar or box outside of the plotting function itself. I can't even find anyone discussing this; the closest I found was a question about boxplot line width. Any suggestions?

Sacci answered 26/6, 2015 at 0:30 Comment(12)
Can you link to the plot you're seeing? Seaborn boxplots take up about as much horizontal space as they could so I'm not sure what the problem could be. - Following
Also if you can't share your actual data please try to share some code that will generate random data that reproduces the problem; doing so might also give you insight into what the issue is. - Following
I can't post pictures, but I have a screenshot of it here. And a pickled dataframe that creates that plot when run with the code in my question can be downloaded from my dropbox. - Sacci
It looks like the hue levels are perfectly nested within the x variable, I think that is your problem. Just remove hue="organ". - Following
Also, the above screenshot was taken after running plt.yscale('log') to rescale the axis. - Sacci
You're right, removing hue="organ" made all the boxes expand to fill the available width! Does this mean there's no way to use factorplot to color-code my projects by organ? - Sacci
If you pass a color palette name to the palette keyword argument it will color the x variable. - Following
Unfortunately, in this case color-coding by X won't help me, because each organ is associated with several projects. I was hoping to use grouped boxplots to make it clear which project is from which organ, but it looks like no matter which way I group things (either hue=organ or hue=project_id), the boxes end up too thin. Thank you for your help though! - Sacci
...wait, I think I see what you mean. I can hard-code a "palette" which colors the projects by organ, and pass it into factorplot. Tedious, but it'll work! Thank you! - Sacci
palette = df["organ"].map(pal_dict) where pal_dict has organs as keys and colors as values should do the trick. - Following
That did, in fact, do the trick! I added a legend using the code from the last answer here, and everything's exactly how I imagined it :) - Sacci
Would you mind elaborating on how you added a legend? I am having the same problem with seaborn boxplot. I solved it with the solution in this post (removing 'hue'), but I cannot seem to add a legend... - Reamy

For future reference, here are the relevant bits of code that make the correct figure with legend: (obviously this is missing important things and won't actually run as-is, but hopefully it shows the tricky parts)

import matplotlib.patches as mpatches  # needed for Rectangle below
import matplotlib.pyplot as pyp
import seaborn as sns

def custom_legend(colors, labels, legend_location='upper left', legend_boundary=(1, 1)):
    # Create custom legend: one colored rectangle per label
    recs = [mpatches.Rectangle((0, 0), 1, 1, fc=color) for color in colors]
    pyp.legend(recs, labels, loc=legend_location, bbox_to_anchor=legend_boundary)

# Color boxplots by organ
# (df_unique, grouped_samples, id_list and plot_setup_pre are defined elsewhere in my code)
organ_list = sorted(df_unique(grouped_samples, 'type'))
colors = sns.color_palette("Paired", len(organ_list))
color_dict = dict(zip(organ_list, colors))
organ_palette = grouped_samples.drop_duplicates('id')['type'].map(color_dict)

# Plot grouped boxplot
# (seaborn 0.9+ renames factorplot to catplot and size to height)
g = sns.factorplot("id", "num_mutations", data=grouped_samples, order=id_list, kind="box", size=7, aspect=3, palette=organ_palette)
sns.despine(left=True)
plot_setup_pre()
pyp.yscale('log')
custom_legend(colors, organ_list)
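
A self-contained variant of the same idea, as a rough sketch on synthetic data. The ids, organ names, and the `organ_of` mapping are made up for illustration, and it uses `sns.boxplot` with `hue="id", dodge=False` (available from seaborn 0.9) instead of `factorplot`, so it is not the exact code above. It builds one color per organ, maps each id to its organ's color, and then replaces the legend with one patch per organ.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: four project ids, each belonging to one organ
organ_of = {"PROJ-A": "liver", "PROJ-B": "liver", "PROJ-C": "lung", "PROJ-D": "skin"}
ids = list(organ_of)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "id": np.repeat(ids, 40),
    "num_mutations": rng.lognormal(mean=3, sigma=1, size=40 * len(ids)),
})
df["type"] = df["id"].map(organ_of)

# One color per organ, then one palette entry per id
organ_list = sorted(df["type"].unique())
colors = sns.color_palette("Paired", len(organ_list))
color_dict = dict(zip(organ_list, colors))
id_palette = {i: color_dict[organ_of[i]] for i in ids}

fig, ax = plt.subplots(figsize=(8, 4))
# hue="id" with dodge=False colors each box by id without narrowing it
sns.boxplot(data=df, x="id", y="num_mutations", hue="id", order=ids,
            palette=id_palette, dodge=False, ax=ax)
ax.set_yscale("log")

# Replace any automatic legend with one entry per organ
handles = [mpatches.Patch(facecolor=color_dict[o], label=o) for o in organ_list]
ax.legend(handles=handles, title="organ", loc="upper left", bbox_to_anchor=(1, 1))
fig.savefig("boxplot_by_organ.png", bbox_inches="tight")
```

Using a dict keyed by id keeps the color assignment independent of row order, which a positional list of colors would not.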
Sacci answered 26/3, 2016 at 6:15 Comment(0)

As of version 0.9, passing dodge=False to sns.boxplot solves this problem: the hue levels are no longer shifted side by side, so each box keeps its full width.

sns.factorplot() has been deprecated since version 0.9 and replaced with catplot(), which also accepts the dodge parameter.
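
A minimal sketch of the `dodge=False` approach on made-up data. The column names follow the question, but the project codes, organ names, and values are invented for illustration. Each project belongs to exactly one organ, which is the nested case discussed in the comments.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: hue ("organ") is perfectly nested within x ("project_code")
projects = ["P1", "P2", "P3", "P4"]
organ_of = {"P1": "liver", "P2": "liver", "P3": "lung", "P4": "skin"}
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "project_code": np.repeat(projects, 50),
    "num_mutations": rng.lognormal(mean=3, sigma=1, size=50 * len(projects)),
})
df["organ"] = df["project_code"].map(organ_of)

fig, ax = plt.subplots(figsize=(8, 4))
# dodge=False keeps one full-width box per project, colored by organ
sns.boxplot(data=df, x="project_code", y="num_mutations", hue="organ",
            order=projects, dodge=False, ax=ax)
ax.set_yscale("log")
fig.savefig("boxplot_dodge_false.png", bbox_inches="tight")
```

With `catplot`, the same keyword should be passed alongside `kind="box"`.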

Motion answered 22/10, 2018 at 20:34 Comment(0)