Overlaying the numeric value of median/variance in boxplots

2

17

When using box plots in Python, is there any way to automatically/easily overlay the value of the median & variance on top of each box (or at least the numerical value of the median)?

E.g. in the boxplot below, I would like to overlay the text (median, +- std) on each box plot.

[image: example boxplot]

Whig answered 17/9, 2013 at 22:42 Comment(0)
J
30

Assuming you are using the boxplot function to draw the boxplots, it returns a dictionary that holds the components of the graph. Note that the box represents the interquartile range (25th to 75th percentile), not the standard deviation.

>>> bp_dict = boxplot(data, vert=False) # draw horizontal boxplot
>>> bp_dict.keys()
['medians', 'fliers', 'whiskers', 'boxes', 'caps']

Each key maps to a list of the Line2D objects that form that plot element. You can use the Line2D.get_xydata method to get the median and box positions (in data coordinates) and work out where to position your text.

from pylab import *

# from http://matplotlib.org/examples/pylab_examples/boxplot_demo.html

# fake up some data
spread = rand(50) * 100
center = ones(25) * 50
flier_high = rand(10) * 100 + 100
flier_low = rand(10) * -100
data = concatenate((spread, center, flier_high, flier_low), 0)

# fake up some more data
spread = rand(50) * 100
center = ones(25) * 40
flier_high = rand(10) * 100 + 100
flier_low = rand(10) * -100
d2 = concatenate( (spread, center, flier_high, flier_low), 0 )
data.shape = (-1, 1)
d2.shape = (-1, 1)
#data = concatenate( (data, d2), 1 )
# Making a 2-D array only works if all the columns are the
# same length.  If they are not, then use a list instead.
# This is actually more efficient because boxplot converts
# a 2-D array into a list of vectors internally anyway.
data = [data, d2, d2[::2,0]]

# multiple box plots on one figure
figure()

# get dictionary returned from boxplot
bp_dict = boxplot(data, vert=False)

for line in bp_dict['medians']:
    # get position data for median line
    x, y = line.get_xydata()[1] # top of median line
    # overlay median value
    text(x, y, '%.1f' % x,
         horizontalalignment='center') # draw above, centered

for line in bp_dict['boxes']:
    x, y = line.get_xydata()[0] # bottom of left line
    text(x, y, '%.1f' % x,
         horizontalalignment='center', # centered
         verticalalignment='top')      # below
    x, y = line.get_xydata()[3] # bottom of right line
    text(x, y, '%.1f' % x,
         horizontalalignment='center', # centered
         verticalalignment='top')      # below

show()

boxplot output
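The question also asked for the standard deviation, which the dictionary returned by boxplot does not contain. A minimal sketch, assuming the raw samples are still available, is to compute the statistics with NumPy and use the median line only for positioning. The sample data, the seed, and the choice of the sample standard deviation (ddof=1) are assumptions made for illustration, not part of the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window
import matplotlib.pyplot as plt
import numpy as np

# hypothetical sample data, three groups of different sizes
rng = np.random.default_rng(0)
data = [rng.normal(50, 10, 100), rng.normal(40, 20, 80), rng.normal(60, 5, 60)]

fig, ax = plt.subplots()
bp_dict = ax.boxplot(data, vert=False)  # horizontal boxes, as in the answer

labels = []
for sample, line in zip(data, bp_dict['medians']):
    # second point of the median line = its top end for a horizontal box
    x, y = line.get_xydata()[1]
    # the std is computed from the data; the boxplot itself only knows quartiles
    label = '%.1f ± %.1f' % (np.median(sample), np.std(sample, ddof=1))
    ax.text(x, y, label,
            horizontalalignment='center',  # centered on the median
            verticalalignment='bottom')    # drawn just above the box
    labels.append(label)
```

Call plt.show() or fig.savefig(...) afterwards to see the result. The box edges still mark the quartiles, so the std in the label will generally not line up with them.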

Jujitsu answered 17/9, 2013 at 23:51 Comment(6)
As a side comment, annotate is a bit more flexible than text. – Mannered
Thanks! BTW, I think the code is missing a couple of lines at the bottom of it. Is that possible? – Whig
@tcaswell I am interested in your comment. How would one use annotate instead of text here? I haven't used it before. – Whig
@josh matplotlib.org/api/axes_api.html#matplotlib.axes.Axes.annotate matplotlib.org/examples/pylab_examples/annotation_demo.html – Mannered
Sorry about the chopped off lines @Josh. I've added them. tcaswell makes a good recommendation with annotate. I had looked at adding a "bbox" argument to text in order to pad around the text, but wasn't getting the desired result. You could use annotate to create an offset, for instance. (You could also just add a delta to the y argument to text.) – Jujitsu
In case you are using the patch_artist=True option when creating your boxplot, the code needs to be changed a little: replace calls to line.get_xydata() with line.get_path().vertices. – Diallage
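The two suggestions in the comments above (annotate with an offset, and get_path().vertices when patch_artist=True) can be combined roughly as follows. This is a sketch under assumptions: the data and seed are made up, the boxes are plain (no notch), and the 3-point offset is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window
import matplotlib.pyplot as plt
import numpy as np

# hypothetical sample data
rng = np.random.default_rng(1)
data = [rng.uniform(0, 100, 50), rng.uniform(20, 80, 40)]

fig, ax = plt.subplots()
bp_dict = ax.boxplot(data, vert=False, patch_artist=True)

# medians are still Line2D objects, so get_xydata() works as before
for line in bp_dict['medians']:
    x, y = line.get_xydata()[1]  # top end of the median line
    ax.annotate('%.1f' % x, xy=(x, y),
                xytext=(0, 3), textcoords='offset points',  # 3 pt above
                ha='center', va='bottom')

# with patch_artist=True the boxes are patches: read the path vertices
quartiles = []
for box in bp_dict['boxes']:
    # the first four vertices are the corners of a plain (un-notched) box;
    # the remaining entries only close the path
    corners = box.get_path().vertices[:4]
    q1, q3 = corners[:, 0].min(), corners[:, 0].max()
    y_bottom = corners[:, 1].min()
    for x in (q1, q3):
        ax.annotate('%.1f' % x, xy=(x, y_bottom),
                    xytext=(0, -3), textcoords='offset points',  # 3 pt below
                    ha='center', va='top')
    quartiles.append((q1, q3))
```

Because the offset is given in points rather than data units, the labels keep the same distance from the box when the axes are rescaled, which is the main practical advantage over adding a delta to the y argument of text.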
P
-1

A little correction:

for line in bp_dict['medians']:
  # get position data for median line
  x, y = line.get_xydata()[1]  # top of median line
  # overlay median value
  text(x, y, '%.1f' % x, horizontalalignment='center')  # draw above, centered

for box in bp_dict['boxes']:
  verts = box.get_path().vertices
  x, y = verts[0]  # bottom of left line
  text(x, y, '%.1f' % x,
       horizontalalignment='center',  # centered
       verticalalignment='top')       # below
  # index 3 is the bottom of the right line for a plain box;
  # a notched box (notch=True) has extra vertices, so use index 6 there
  x, y = verts[3]
  text(x, y, '%.1f' % x,
       horizontalalignment='center',  # centered
       verticalalignment='top')       # below
Precentor answered 14/3, 2021 at 16:17 Comment(0)
