Consistent width of boxplots if missing data by group?

My question is similar to one previously discussed for bar plots, but that discussion has no solution for boxplots: Consistent width for geom_bar in the event of missing data

I would like to produce boxplots by group. However, data for some groups can be missing, which makes the remaining boxplot at that position wider than the others.

I tried specifying geom_boxplot(width = value) or geom_boxplot(varwidth = F), but neither works.

Also, as suggested for the bar plot example, I tried adding NA values for the missing group. The boxplot simply skips the missing data and extends the width of the remaining box. I got the following warning:

Warning messages:
1: Removed 1 rows containing non-finite values (stat_boxplot). 

Dummy example:

# library
library(ggplot2)

# create a data frame
variety <- rep(LETTERS[1:7], each = 40)
treatment <- rep(c("high", "low"), each = 20)  # recycled to 280 rows by data.frame()
note <- seq_len(280) + sample(1:150, 280, replace = TRUE)

# put data together
data <- data.frame(variety, treatment, note)

ggplot(data, aes(x=variety, y=note, fill=treatment)) + 
  geom_boxplot()

Boxplots have the same width if there are values for each group:

Remove the values for 1 group:

# subset the data so that one group (variety "E", treatment "high") is missing:
data.sub<-subset(data, treatment != "high" | variety != "E" )

windows(4, 3)  # Windows-only graphics device; omit on other platforms
ggplot(data.sub, aes(x=variety, y=note, fill=treatment)) + 
  geom_boxplot()

The boxplot at the position with the missing group is wider than the others:

Is there a way to keep the width of the boxplots constant?

Fiddlefaddle answered 31/8, 2018 at 21:53 Comment(0)

We can make use of the preserve argument in position_dodge / position_dodge2.

From ?position_dodge

preserve: Should dodging preserve the total width of all elements at a position, or the width of a single element?

p <- ggplot(data.sub, aes(x=variety, y=note, fill=treatment))

# position_dodge
p + 
  geom_boxplot(position = position_dodge(preserve = "single"))

# position_dodge2
p + 
  geom_boxplot(position = position_dodge2(preserve = "single"))
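
To check the effect numerically rather than by eye, the drawn box widths can be read from the built plot. This is only a minimal sketch and not part of the original answer: box_widths is a hypothetical helper, set.seed(1) is added purely for reproducibility, and it assumes that layer_data() exposes xmin and xmax columns for geom_boxplot, as it does in current ggplot2 releases.

```r
library(ggplot2)

set.seed(1)  # assumption: the seed only makes this sketch reproducible

# Same dummy data as in the question
data <- data.frame(
  variety   = rep(LETTERS[1:7], each = 40),
  treatment = rep(c("high", "low"), each = 20),
  note      = seq_len(280) + sample(1:150, 280, replace = TRUE)
)

# Drop the "high" treatment for variety "E"
data.sub <- subset(data, treatment != "high" | variety != "E")

# Hypothetical helper: build the plot with the given position adjustment
# and return the width of every box after dodging
box_widths <- function(position) {
  p <- ggplot(data.sub, aes(x = variety, y = note, fill = treatment)) +
    geom_boxplot(position = position)
  built <- layer_data(p)
  built$xmax - built$xmin
}

w_default <- box_widths("dodge2")  # the default position of geom_boxplot
w_single  <- box_widths(position_dodge2(preserve = "single"))

# Smallest and largest box width under each setting
print(range(w_default))
print(range(w_single))
```

With the default position the two printed values should differ, because the lone box at "E" takes the whole slot; with preserve = "single" they should coincide.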

Allopathy answered 7/9, 2018 at 6:39 Comment(2)
Malacostracan: Improvement, but not entirely satisfying, since the empty space should be on the left in this case.
Allopathy: Thanks @sindri_baldur for the heads up. Added the position_dodge2 option.
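
Regarding the comment about where the empty space ends up: one possible workaround, which does not come from the answer above and is only a sketch, is to give the missing combination a placeholder row whose value lies far outside the plotted range, and then zoom with coord_cartesian(). Both dodge slots then exist at "E", so the "low" box keeps its usual position on the right, while the placeholder box falls outside the visible area. The names placeholder and data.pad and the offset of 1000 are assumptions chosen for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: the seed only makes this sketch reproducible

# Same dummy data and subset as in the question
data <- data.frame(
  variety   = rep(LETTERS[1:7], each = 40),
  treatment = rep(c("high", "low"), each = 20),
  note      = seq_len(280) + sample(1:150, 280, replace = TRUE)
)
data.sub <- subset(data, treatment != "high" | variety != "E")

# Hypothetical placeholder row for the missing combination, placed well
# below the observed values (an NA here would be dropped by stat_boxplot)
placeholder <- data.frame(
  variety   = "E",
  treatment = "high",
  note      = min(data.sub$note) - 1000
)
data.pad <- rbind(data.sub, placeholder)

# coord_cartesian() zooms without discarding data, so the placeholder still
# takes part in dodging but is drawn outside the visible y range
p_pad <- ggplot(data.pad, aes(x = variety, y = note, fill = treatment)) +
  geom_boxplot() +
  coord_cartesian(ylim = range(data.sub$note))

print(p_pad)
```

Setting the limits with ylim() or scale_y_continuous(limits = ...) instead would remove the placeholder row before the statistics are computed, which brings back the wide box.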