Implementing paired lines into boxplot.ggplot2

I have a set of paired data, and I'm using ggplot2.boxplot (of the easyGgplot2 package) with added (jittered) individual data points:

ggplot2.boxplot(data=INdata,xName='condition',yName='pain',groupName='condition',showLegend=FALSE,
  position="dodge",
  addDot=TRUE,dotSize=3,dotPosition=c("jitter", "jitter"),jitter=0.2,
  ylim=c(0,100),
  backgroundColor="white",xtitle="",ytitle="Pain intensity",mainTitle="Pain intensity",
  brewerPalette="Paired")

INdata:

ID,condition,pain
1,Treatment,4.5
3,Treatment,12.5
4,Treatment,16
5,Treatment,61.75
6,Treatment,23.25
7,Treatment,5.75
8,Treatment,5.75
9,Treatment,5.75
10,Treatment,44.5
11,Treatment,7.25
12,Treatment,40.75
13,Treatment,17.25
14,Treatment,2.75
15,Treatment,15.5
16,Treatment,15
17,Treatment,25.75
18,Treatment,17
19,Treatment,26.5
20,Treatment,27
21,Treatment,37.75
22,Treatment,26.5
23,Treatment,15.5
25,Treatment,1.25
26,Treatment,5.75
27,Treatment,25
29,Treatment,7.5
1,No Treatment,34.5
3,No Treatment,46.5
4,No Treatment,34.5
5,No Treatment,34
6,No Treatment,65
7,No Treatment,35.5
8,No Treatment,48.5
9,No Treatment,35.5
10,No Treatment,54.5
11,No Treatment,7
12,No Treatment,39.5
13,No Treatment,23
14,No Treatment,11
15,No Treatment,34
16,No Treatment,15
17,No Treatment,43.5
18,No Treatment,39.5
19,No Treatment,73.5
20,No Treatment,28
21,No Treatment,12
22,No Treatment,30.5
23,No Treatment,33.5
25,No Treatment,20.5
26,No Treatment,14
27,No Treatment,49.5
29,No Treatment,7

The resulting plot looks like this:

[Image: boxplot with jittered individual data points]

However, since this is paired data, I want to represent this in the plot - specifically to add lines between paired datapoints. I've tried adding

... + geom_line(aes(group = ID))

but I have not been able to work this into the ggplot2.boxplot code. Instead, I get this error:

Error in if (addMean) p <- p + stat_summary(fun.y = mean, geom = "point",  :
  argument is not interpretable as logical
In addition: Warning message:
In if (addMean) p <- p + stat_summary(fun.y = mean, geom = "point",  :
  the condition has length > 1 and only the first element will be used

Grateful for any input on this!

Actinouranium answered 19/3, 2018 at 19:15 Comment(4)
Perhaps it's useful to mention that you're using the easyGgplot2 package to make this boxplot? - Bigamy
Yes, sorry, I forgot to mention that ggplot2.boxplot is part of the easyGgplot2 package. I have edited the question to include this now. Thanks. - Actinouranium
Related: "add geom_line to link all the geom_point in boxplot conditioned on a factor with ggplot2" - Liva
Thanks for linking, very useful. Also related: "gg_jitterbox" makes half-and-half plots (box and dots), which might help make things a bit less busy when combining box, dots, and lines in the same plot: gist.github.com/naupaka/d9b003308e4aa66e34f93d492428e0a2 - Actinouranium

I do not know the package that ggplot2.boxplot comes from, but I will show you how to perform the requested operation in ggplot.

The requested output is a bit problematic for ggplot, since you want both the points and the lines connecting them to be jittered by the same amount. One way to do that is to jitter the points before making the plot. But the x axis is discrete, so here is a workaround:

b <- runif(nrow(df), -0.1, 0.1)

ggplot(df) +
  geom_boxplot(aes(x = as.numeric(condition), y = pain, group = condition))+
  geom_point(aes(x = as.numeric(condition) + b, y = pain)) +
  geom_line(aes(x  = as.numeric(condition) + b, y = pain, group = ID)) +
  scale_x_continuous(breaks = c(1,2), labels = c("No Treatment", "Treatment"))+
  xlab("condition")

[Image: boxplot with jittered points joined by lines between paired observations]

First I made a vector of jitter offsets called b, and converted the x axis to numeric so I could add b to the x coordinates. Because the same b is added in both geom_point and geom_line, each line ends exactly on its points. Later I relabeled the x axis.
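
The same idea can also be written with the offsets stored as a column of the data frame, so that each row's jitter stays with it if the data are later sorted or subset. This is only a sketch: the small `paired` data frame (three IDs taken from the question's data), the `x_jit` column name, and the seed are made up for illustration, and `condition` is assumed to be a factor.

```r
library(ggplot2)

# Hypothetical subset of the question's data (IDs 1, 3 and 4)
paired <- data.frame(
  ID = rep(c(1, 3, 4), times = 2),
  condition = factor(rep(c("Treatment", "No Treatment"), each = 3)),
  pain = c(4.5, 12.5, 16, 34.5, 46.5, 34.5)
)

set.seed(1)  # arbitrary seed, only so the jitter is reproducible
# One horizontal offset per row, shared by the point and line layers
paired$x_jit <- as.numeric(paired$condition) +
  runif(nrow(paired), -0.1, 0.1)

p <- ggplot(paired, aes(y = pain)) +
  geom_boxplot(aes(x = as.numeric(condition), group = condition)) +
  geom_point(aes(x = x_jit)) +
  geom_line(aes(x = x_jit, group = ID)) +
  # Label positions 1 and 2 with the factor levels, in level order
  scale_x_continuous(breaks = 1:2, labels = levels(paired$condition)) +
  xlab("condition")
p
```

Taking the axis labels from `levels(paired$condition)` avoids typing them by hand in the wrong order.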

I do agree with eipi10's comment that the plot works better without jitter:

ggplot(df, aes(condition, pain)) +
  geom_boxplot(width=0.3, size=1.5, fatten=1.5, colour="grey70") +
  geom_point(colour="red", size=2, alpha=0.5) +
  geom_line(aes(group=ID), colour="red", linetype="11") +
  theme_classic()

[Image: boxplot with non-jittered red points joined by dotted lines]

And here is the updated plot with jittered points, in eipi10's style:

ggplot(df) +
  geom_boxplot(aes(x = as.numeric(condition),
                   y = pain,
                   group = condition),
               width=0.3,
               size=1.5,
               fatten=1.5,
               colour="grey70")+
  geom_point(aes(x = as.numeric(condition) + b,
                 y = pain),
             colour="red",
             size=2,
             alpha=0.5) +
  geom_line(aes(x  = as.numeric(condition) + b,
                y = pain,
                group = ID),
            colour="red",
            linetype="11") +
  scale_x_continuous(breaks = c(1,2),
                     labels = c("No Treatment", "Treatment"),
                     expand = c(0.2,0.2))+
  xlab("condition") +
  theme_classic()

[Image: boxplot with jittered red points joined by dotted lines]

Compound answered 19/3, 2018 at 19:38 Comment(2)
Given how busy this plot is, removing the jitter probably works better: ggplot(INdata, aes(condition, pain)) + geom_boxplot(width=0.3, size=1.5, fatten=1.5, colour="grey70") + geom_point(colour="red", size=2, alpha=0.5) + geom_line(aes(group=ID), colour="red", linetype="11") + theme_classic() - Kalat
Thanks, brilliant. I agree that it probably gets too busy with boxplot, jittered dots, and lines. I'll likely go with the non-jittered version, but thanks for sharing both (although black boxes and thin, light grey lines helped a bit for the jittered version). - Actinouranium

Although I like the old-school way of plotting with ggplot shown in @missuse's answer, I wanted to check whether this is also possible with your ggplot2.boxplot-based code.

I loaded your data:

'data.frame':   52 obs. of  3 variables:
 $ ID       : int  1 3 4 5 6 7 8 9 10 11 ...
 $ condition: Factor w/ 2 levels "No Treatment",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ pain     : num  4.5 12.5 16 61.8 23.2 ...
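
For reference, here is a sketch of one way to get that structure, assuming the data from the question are available as CSV text (only four rows are reproduced here, and `csv_text` is a name made up for the example). Since R 4.0.0, `read.csv` leaves strings as character by default, so `stringsAsFactors = TRUE` is passed to get the factor that the `as.numeric(condition)` calls in the other answers rely on.

```r
# Hypothetical excerpt of the question's data as inline CSV text
csv_text <- "ID,condition,pain
1,Treatment,4.5
3,Treatment,12.5
1,No Treatment,34.5
3,No Treatment,46.5"

# stringsAsFactors = TRUE turns the condition column into a factor
INdata <- read.csv(text = csv_text, stringsAsFactors = TRUE)
str(INdata)

# Levels are sorted alphabetically, so "No Treatment" is coded 1
# and "Treatment" is coded 2
as.numeric(INdata$condition)
```

If `condition` were left as a character column, `as.numeric(condition)` would return `NA` with a warning instead of the codes 1 and 2.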

Then I called your code, adding geom_line at the end as you suggested yourself:

ggplot2.boxplot(data = INdata,xName = 'condition', yName = 'pain', groupName = 'condition',showLegend = FALSE,
                position = "dodge",
                addDot = TRUE, dotSize = 3, dotPosition = c("jitter", "jitter"), jitter = 0,
                ylim = c(0,100),
                backgroundColor = "white",xtitle = "",ytitle = "Pain intensity", mainTitle = "Pain intensity",
                brewerPalette = "Paired") + geom_line(aes(group = ID))

Note that I set jitter to 0. The resulting graph looks like this:

[Image: R plot]

If you don't set jitter to 0, the lines still run from the middle of each boxplot, ignoring the horizontal location of the dots.

Not sure why your call gives an error. I thought it might be a factor issue, but I see that my ID variable is not factor class.

Bigamy answered 19/3, 2018 at 20:29 Comment(1)
Thanks, this was very useful. I've been trying to figure out what went wrong before, but haven't been able to reproduce it (it works now). Anyway, I was able to align the dots and lines by applying missuse's jitter solution to this (see posted answer). - Actinouranium

I combined missuse's jitter solution with the ggplot2.boxplot approach in order to align the dots and lines. Instead of using "addDot", I had to add the dots with geom_point (and the lines with geom_line) afterwards, so that I could apply the same jitter vector to both.

b <- runif(nrow(df), -0.2, 0.2)

ggplot2.boxplot(data=df,xName='condition',yName='pain',groupName='condition',showLegend=FALSE,
      ylim=c(0,100),
      backgroundColor="white",xtitle="",ytitle="Pain intensity",mainTitle="Pain intensity",
      brewerPalette="Paired") +
      geom_point(aes(x=as.numeric(condition) + b, y=pain),colour="black",size=3, alpha=0.7) +
      geom_line(aes(x=as.numeric(condition) + b, y=pain, group=ID), colour="grey30", linetype="11", alpha=0.7)

[Image: ggplot2.boxplot output with jittered points joined by lines]

Actinouranium answered 20/3, 2018 at 14:40 Comment(0)
