matplotlib: box plot for each category

My pandas DataFrame has two columns: category and duration. I use the following code to make a box plot of all data points:

import matplotlib.pyplot as plt
plt.boxplot(df.duration)
plt.show()

However, if I want one box for each category, how do I modify the above code? Thanks!

Vesuvian answered 9/2, 2018 at 18:47 Comment(0)

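We can do it with pandas:

# Sample data: df = pd.DataFrame({'category': list('aacde'), 'duration': [1, 3, 2, 3, 4]})
# Number the rows within each category, pivot to one column per category, then draw one box per column.
# pivot() takes keyword arguments here because pandas 2.0 no longer accepts them positionally.
df.assign(index=df.groupby('category').cumcount()).pivot(index='index', columns='category', values='duration').plot(kind='box')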
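To make the reshaping step easier to follow, here is a minimal sketch that splits the one-liner into stages and prints the intermediate table. It uses the sample data from the comment in the answer; the helper name `to_wide` is hypothetical and only there for illustration.

```python
import pandas as pd

# Sample data taken from the comment in the answer.
df = pd.DataFrame({"category": list("aacde"), "duration": [1, 3, 2, 3, 4]})


def to_wide(frame):
    """Return a table with one column per category, padded with NaN."""
    # cumcount() numbers the rows inside each category: 0, 1, 2, ...
    position = frame.groupby("category").cumcount()
    # Keyword arguments are used because pandas 2.0 requires them for pivot().
    return frame.assign(index=position).pivot(
        index="index", columns="category", values="duration"
    )


wide = to_wide(df)
print(wide)
# wide.plot(kind="box") then draws one box per column.
```

Categories with fewer rows are padded with NaN in the wide table, which is why the row counter from `cumcount()` is needed as the pivot index.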


Chitarrone answered 9/2, 2018 at 18:55 Comment(1)
How do I add outliers in this pandas version? If not, how do we do it in matplotlib? - Vesuvian
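For the plain matplotlib route that the question and the comment ask about, one possible sketch is to split the durations by category yourself and pass the list of arrays to `boxplot`. The function name `boxplot_by_category` is hypothetical, and the data is the sample frame from the answer above. Outliers are drawn as "fliers", which matplotlib shows by default; `showfliers=True` is spelled out only to make that visible.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data taken from the comment in the pandas answer.
df = pd.DataFrame({"category": list("aacde"), "duration": [1, 3, 2, 3, 4]})


def boxplot_by_category(frame, ax=None):
    """Draw one box per category with plain matplotlib; return (ax, artists)."""
    if ax is None:
        _, ax = plt.subplots()
    # groupby yields the categories in sorted order, so names and data stay aligned.
    groups = {name: g["duration"].to_numpy() for name, g in frame.groupby("category")}
    # One array per category gives one box per category, at x = 1..N.
    artists = ax.boxplot(list(groups.values()), showfliers=True)
    ax.set_xticks(range(1, len(groups) + 1))
    ax.set_xticklabels(list(groups.keys()))
    ax.set_xlabel("category")
    ax.set_ylabel("duration")
    return ax, artists


if __name__ == "__main__":
    boxplot_by_category(df)
    plt.show()
```

The tick labels are set through the axes instead of a `boxplot` keyword, because the name of that keyword differs between matplotlib versions.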

In addition to Chitarrone's answer, which is spot on, you might want to check out the seaborn library. It was made for this kind of plot.

Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.

The documentation for boxplots describes the function as follows:

Draw a box plot to show distributions with respect to categories.

import seaborn as sns
sns.boxplot(data=df, x='category', y='duration')


Pegram answered 9/2, 2018 at 20:3 Comment(0)
