Barplot with significant differences and interactions?

I would like to visualize my data and ANOVA statistics. It is common to do this using a barplot with added lines indicating significant differences and interactions. How do you make a plot like this in R?

This is what I would like:

Significant differences:

significant differences

Significant interactions:

significant interactions

Background

I am currently using barplot2{gplots} to plot bars and confidence intervals, but I am willing to use any package/procedure to get the job done. To get the statistics I am currently using TukeyHSD{stats} or pairwise.t.test{stats} for differences and one of the ANOVA functions (aov, ezANOVA{ez}, gls{nlme}) for interactions.

Just to give you an idea, this is my current plot: barplot2 with CIs

Defamatory answered 20/3, 2013 at 22:7 Comment(5)
There is a plot.cld function in multcomp, which lets you put letters above your bars to indicate significance. Perhaps this is also something for you... - Ventris
There is also bar.group from the agricolae package, which puts the letters on for you. - Leister
If you use base R's barplot, you can store the centre points of the bars with barstore <- barplot(1:3). To verify that this works, try abline(v=barstore) and note that the vertical lines all cut through the centres of the bars. Using segments, you may be able to use these stored points to draw your comparison/interaction lines. - Tonita
Not a barplot, but a neat way to visualize the results of an ANOVA is here: stats.stackexchange.com/a/28155/7744 - Catherine
#2286585 - Consonantal
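
The base R suggestion in the comments (store the bar centres returned by barplot, then draw with segments) can be sketched roughly as follows. The group means, the vertical offsets and the "*" label are hypothetical placeholders, not results of any test:

```r
# Hypothetical group means, for illustration only
means <- c(control = 1.6, group1 = 2.0, group2 = 2.3)

# barplot() returns the x coordinates of the bar centres
centres <- barplot(means, ylim = c(0, 3), ylab = "mean")

# Horizontal comparison line between the first and third bar,
# placed a little above the tallest bar
y.line <- max(means) + 0.2
segments(x0 = centres[1], y0 = y.line, x1 = centres[3], y1 = y.line)

# Short vertical ticks at both ends of the line
segments(x0 = centres[c(1, 3)], y0 = y.line - 0.05,
         x1 = centres[c(1, 3)], y1 = y.line)

# Label centred above the line; "*" is a placeholder, not a computed result
text(x = mean(centres[c(1, 3)]), y = y.line + 0.1, labels = "*")
```

The offsets (0.2, 0.05, 0.1) are in data units, so they would need adjusting to the scale of your own y axis.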

As you are using the function barplot2() from the gplots library, I will give an example using this approach.

First, make the barplot as shown in the help file of barplot2(). ci.l and ci.u are fake confidence interval values. The barplot should be saved as an object.

library(gplots)
hh <- t(VADeaths)[1:2, 5:1]
mybarcol <- "gray20"
ci.l <- hh * 0.85
ci.u <- hh * 1.15
mp <- barplot2(hh, beside = TRUE,
               col = c("grey12", "grey82"),
               legend = colnames(VADeaths)[1:2], ylim = c(0, 100),
               cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u)

If you look at the object mp, it contains the x coordinates of all bars.

 mp
     [,1] [,2] [,3] [,4] [,5]
[1,]  1.5  4.5  7.5 10.5 13.5
[2,]  2.5  5.5  8.5 11.5 14.5

Now I use the upper confidence interval values to calculate the y coordinates of the segments. Each segment starts 1 unit above the end of its confidence interval. y.cord contains four rows: the first and second rows correspond to the first bar of each pair, and the other two rows to the second bar. The highest y value (the top of the bracket) is the maximum of the two upper confidence limits in each bar pair, plus 5. x.cord simply repeats each value in the mp object twice.

y.cord <- rbind(ci.u[1, ] + 1,
                apply(ci.u, 2, max) + 5,
                apply(ci.u, 2, max) + 5,
                ci.u[2, ] + 1)
x.cord <- apply(mp, 2, function(x) rep(x, each = 2))

After the barplot is made, use sapply() to draw five line segments (one per group, as there are 5 groups this time) from the calculated coordinates.

sapply(1:5,function(x) lines(x.cord[,x],y.cord[,x]))
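
The same coordinate logic can also be wrapped in a small helper, which may be easier to reuse for other plots. This is only a sketch: bracket_coords() and add_bracket() are hypothetical names, not part of gplots, and the defaults gap = 1 and rise = 5 simply mirror the constants used above.

```r
# Hypothetical helper (not part of gplots): coordinates of one bracket.
#   x1, x2 : x coordinates of the two bars (e.g. one column of mp)
#   y1, y2 : upper ends of the two confidence intervals
#   gap    : distance between a CI end and the start of a bracket leg
#   rise   : height of the bracket top above the taller CI
bracket_coords <- function(x1, x2, y1, y2, gap = 1, rise = 5) {
  top <- max(y1, y2) + rise
  list(x = c(x1, x1, x2, x2),
       y = c(y1 + gap, top, top, y2 + gap))
}

# Draws the bracket on the current plot and returns its coordinates invisibly
add_bracket <- function(...) {
  b <- bracket_coords(...)
  lines(b$x, b$y)
  invisible(b)
}
```

With the objects from this answer, a loop such as for (i in seq_len(ncol(mp))) add_bracket(mp[1, i], mp[2, i], ci.u[1, i], ci.u[2, i]) should draw the same five brackets as the sapply() call.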

To plot text above the segments, calculate the x and y coordinates: x is the midpoint of the two bars' x values, and y is the maximum of the upper confidence limits for each bar pair plus a constant. Then use text() to add the labels.

x.text<-colMeans(mp)
y.text<-apply(ci.u,2,max)+7
text(c("*","**","***","NS","***"),x=x.text,y=y.text)
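
The labels above are typed in by hand. If you have one p-value per bar pair (for example from pairwise.t.test or TukeyHSD), a sketch like the following could derive the labels instead. The p-values here are made up, p_to_stars() is a hypothetical helper, and the cut-offs are one common convention rather than a fixed rule:

```r
# Hypothetical p-values, one per bar pair; replace with your own test results
p.values <- c(0.03, 0.004, 0.0002, 0.4, 0.0008)

# Assumed cut-offs (adjust to the convention of your field):
#   p <= 0.001 "***", p <= 0.01 "**", p <= 0.05 "*", otherwise "NS"
p_to_stars <- function(p) {
  as.character(cut(p,
                   breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
                   labels = c("***", "**", "*", "NS")))
}

stars <- p_to_stars(p.values)
```

The result can then be passed to text(), e.g. text(x = x.text, y = y.text, labels = stars).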


Gunner answered 21/3, 2013 at 13:2 Comment(0)

I guess your question has been more or less addressed by now, so I will instead encourage you to use a different method that gives a much better visual representation of your data: dotplots. As an example, compare your barplot to a dotplot constructed from similar data points:

#example data similar to your barplot
d <- data.frame(group=rep(c("control","group1","group2"),each=4),
                esker=c(1.6,1.4,1.8,1.5,2,1.8,1.6,1.4,2.3,2,1.7,1.4),
                se=rep(0.1,12),
                cond=rep(c("t1","t2","t3","t4"),3))
#dotplot - you need Hmisc library for version with error bars
library(Hmisc)
Dotplot(cond ~ Cbind(esker, esker+se, esker-se) | group, data=d, col=1, 
        layout=c(1,3), aspect="xy",
        par.settings = list(dot.line=list(lwd=0), plot.line=list(col=1)))


Compare it to the barplot. In the dotplot, it is much easier to see the differences when they are plotted horizontally; you don't need an extra legend, bars or colours to show the conditions, and you don't need guidelines and other noisy elements. Everything is contained within those three panels. Of course, I understand that you might want to highlight your significant effects, and maybe that works fine for a small number of conditions. But if the number of factors increases, the plot will overflow with stars and other clutter.

Keep it simple. Keep it dotplot. Check the books by William Cleveland and Edward Tufte for more on this.

Groundsel answered 21/3, 2013 at 16:53 Comment(0)

I recommend using ggplot2 instead of barplot; you can build the lines manually like this.

This starts from a data.table called data with the columns used below (time, mean, type, hline and p_value):

library(ggplot2)

gg <- ggplot(data, aes(x = time, y = mean, fill = type)) +
    geom_bar(stat = "identity", position = "dodge") +
    scale_fill_manual(values = c("RGX" = "royalblue2", "EX" = "tomato2")) +
    xlab("Post-treatment Time Point (months)") +
    ylab(paste("data", "Change Score")) +
    # set expansion and limits in one scale; a separate ylim() would replace this scale
    scale_y_continuous(expand = c(0, 0), limits = c(0, max(data$mean) * 1.5))

# add horizontal bars
gg <- gg + geom_errorbar(aes(ymax = hline, ymin = hline), width = 0.45)

# add vertical bars
gg <- gg + geom_linerange(aes(ymax = max(data$mean)+3, ymin = max(data$mean)+1), position = position_dodge(0.9))

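# add asterisks ("**" for p <= 0.01, "*" for p <= 0.05; assumed conventional cut-offs, adjust as needed)
# data[1:2] selects the first two rows because data is a data.table
gg <- gg + geom_text(data = data[1:2], aes(y = max(data$mean)+4), label = ifelse(data$p_value[1:2] <= 0.01, "**", ifelse(data$p_value[1:2] <= 0.05, "*", "")), size = 8)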

gg
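
Since the table behind this answer is not reproduced in the text, here is a self-contained sketch of the same idea that can be run as is. Everything in it is an assumption for illustration: the data frame d, the brackets table, the heights and the "NS"/"*" labels are made up, and it draws the brackets with geom_segment() rather than the geom_errorbar()/geom_linerange() combination used above.

```r
library(ggplot2)

# Hypothetical data standing in for the table used in this answer
d <- data.frame(
  time = factor(rep(c(3, 6), each = 2)),
  type = rep(c("RGX", "EX"), times = 2),
  mean = c(10, 14, 12, 19)
)

# One bracket per time point; heights and labels are made up
brackets <- data.frame(
  x     = c(1, 2),      # position of each time point on the discrete x axis
  y     = c(16, 21),    # height of the horizontal line
  label = c("NS", "*")
)
half <- 0.225           # offset of each dodged bar centre (dodge width 0.9, two bars)

gg <- ggplot(d, aes(x = time, y = mean)) +
  geom_col(aes(fill = type), position = position_dodge(0.9)) +
  # horizontal part of each bracket
  geom_segment(data = brackets,
               aes(x = x - half, xend = x + half, y = y, yend = y)) +
  # vertical legs at both ends
  geom_segment(data = brackets,
               aes(x = x - half, xend = x - half, y = y - 1, yend = y)) +
  geom_segment(data = brackets,
               aes(x = x + half, xend = x + half, y = y - 1, yend = y)) +
  # label above each bracket
  geom_text(data = brackets, aes(x = x, y = y + 1, label = label), size = 6) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 25))

gg
```

Keeping the bracket positions in their own small data frame means the bars and the annotations can be changed independently; in practice the y and label columns would be filled from your confidence limits and test results.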


Glyptograph answered 11/1, 2017 at 23:11 Comment(0)
