Set y-axis scale for pandas Dataframe Boxplot(), 3 Deviations?

3

5

I'm trying to make a single boxplot chart area per month, with separate boxplots grouped and labeled by industry, and have the y-axis use a scale that I dictate.

In a perfect world this would be dynamic, and I could set the axis to span a certain number of standard deviations from the overall mean. I could live with another way of setting the y-axis dynamically, but I would want it to be the same on all the 'monthly' grouped boxplots created. I don't know the best way to handle this yet and am open to suggestions. All I know is that the numbers being used now are way too large for the charts to be meaningful.

I've tried all kinds of code and had no luck scaling the axis; the code below is as close as I could get to the graph I want.

Here's a link to some dummy data: https://drive.google.com/open?id=0B4xdnV0LFZI1MmlFcTBweW82V0k

And for the code I'm using Python 3.5:

import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
matplotlib.use('TkAgg')
import pylab    
df =  pd.read_csv('Query_Final_2.csv')
df['Ship_Date'] = pd.to_datetime(df['Ship_Date'], errors = 'coerce')
df1 = (df.groupby('Industry'))
print(
df1.boxplot(column='Gross_Margin',layout=(1,9), figsize=(20,10), whis=[5,95])
,pylab.show()
)
Twentyfourmo answered 30/11, 2016 at 15:36 Comment(1)
I like how you called the plotting function on a pandas.core.groupby.DataFrameGroupBy object, I did not know that was possible. - Dirichlet
14

Here is a cleaned up version of your code with the solution:

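import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('Query_Final_2.csv')
df['Ship_Date'] = pd.to_datetime(df['Ship_Date'], errors='coerce')
df1 = df.groupby('Industry')

# return_type='axes' hands back the Axes object of each subplot, keyed by group
axes = df1.boxplot(column='Gross_Margin', layout=(1, 9), figsize=(20, 10),
                   whis=[5, 95], return_type='axes')
# Apply the same y-limits to every subplot.
# If your pandas version returns a Series here rather than a dict,
# iterate with `for ax in axes:` instead.
for ax in axes.values():
    ax.set_ylim(-2.5, 2.5)

plt.show()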

The key is to return the subplots as axes objects and set the limits individually.
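The hard-coded limits above could also be derived from the data, as the question asks. The following is only a sketch under stated assumptions: `sigma_limits` and `grouped_boxplots` are hypothetical helper names, the limits are the overall mean plus or minus `n_sigma` sample standard deviations, and the subplots are drawn with matplotlib directly instead of the groupby `boxplot` call used above.

```python
import pandas as pd
import matplotlib.pyplot as plt


def sigma_limits(values, n_sigma=3.0):
    """Return (low, high): the overall mean -/+ n_sigma standard deviations."""
    values = pd.Series(values).dropna()
    mean, std = values.mean(), values.std()
    return mean - n_sigma * std, mean + n_sigma * std


def grouped_boxplots(df, column, by, n_sigma=3.0):
    """Draw one boxplot per group, all using the same data-driven y-limits."""
    low, high = sigma_limits(df[column], n_sigma)
    n_groups = df[by].nunique()
    fig, axes = plt.subplots(1, n_groups, figsize=(2.5 * n_groups, 5),
                             squeeze=False)
    for ax, (name, group) in zip(axes.ravel(), df.groupby(by)):
        # Whiskers at the 5th and 95th percentiles, as in the question
        ax.boxplot(group[column].dropna().to_numpy(), whis=(5, 95))
        ax.set_title(str(name))
        ax.set_ylim(low, high)  # identical limits on every subplot
    return fig, (low, high)
```

With the question's data this would be called as `grouped_boxplots(df, 'Gross_Margin', 'Industry')` followed by `plt.show()`. To keep the scale the same across monthly charts, the limits would need to be computed once from the full data set and reused for each month. Note that extreme outliers also inflate the mean and standard deviation, so a percentile-based range may be worth trying if the axis still comes out too wide.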

Dirichlet answered 30/11, 2016 at 18:17 Comment(0)
5

Once you have established variables for the mean and the standard deviation, use:

plt.ylim(ymin, ymax)

to set the y-axis.
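As a rough illustration of that step, here is one way `ymin` and `ymax` might be computed. The numbers are made up and merely stand in for the Gross_Margin column, and the choice of three standard deviations is an assumption taken from the question title.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up values standing in for the Gross_Margin column
margins = pd.Series([0.10, 0.25, 0.15, 0.30, 0.20, 0.18,
                     0.22, 0.12, 0.28, 0.16, 0.24, 0.90])

n_sigma = 3  # assumed number of standard deviations
mean = margins.mean()
std = margins.std()
ymin = mean - n_sigma * std
ymax = mean + n_sigma * std

plt.boxplot(margins.to_numpy())
plt.ylim(ymin, ymax)  # acts on the current (single) axes
```

Calling `plt.show()` afterwards displays the chart. `plt.ylim` only affects the current axes, so this suits a single plot rather than a grid of subplots.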

Octosyllable answered 17/12, 2019 at 21:7 Comment(0)
-1

Thanks @Padraig. Note that if you are plotting on a single figure without subplots, you can use:

plt.ylim(ymin, ymax)

But if you want to adjust the y-axis of a single subplot, this works (@AlexG):

ax.set_ylim(ymin, ymax)

For instance, if your subplot is ax2 and you want the y-axis to run from 0.5 to 1.0, your code would be:

ax2.set_ylim(0.5, 1.0)
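
A small self-contained sketch of the difference, using made-up data and two subplots (the names `ax1` and `ax2` are just for illustration): only the subplot whose `set_ylim` is called changes, while the other keeps its automatic limits.

```python
import matplotlib.pyplot as plt

# Made-up data, drawn identically on both subplots
data = [0.55, 0.60, 0.70, 0.80, 0.95, 1.40]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))
ax1.boxplot(data)        # keeps matplotlib's automatic y-limits
ax2.boxplot(data)
ax2.set_ylim(0.5, 1.0)   # only this subplot is clipped to 0.5..1.0
```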
Parsimonious answered 9/6, 2020 at 12:55 Comment(0)
