Intro
I want to draw lines between the panels of a faceted ggplot. The main goal is to connect the measurements that we want to test against each other with a line. So basically I want to add some kind of significance bars inside and between the facets of a ggplot boxplot (or any other kind of plot, for that matter).
Research
ggsignif
I know that the ggsignif package does this for non-faceted plots.
There are answers that try to work around this limitation, e.g. "Using ggsignif with grouped bar graphs and facet_wrap not working".
Using ggplot_build
There is an approach that could be adapted for my purpose, described in the question "ggplot, drawing line between points across facets". Its major drawback is that one has to specify the l parameter of gtable_add_grob manually. I was not able to figure out how to derive the l parameter automatically, using only the names of the facet panels where the line should start and end. $layout$name only holds names like "panel-1-1". That is the actual panel name, but how would one get from it to the l parameter that is needed to specify the start and end of the line?
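One way the l lookup might be automated is to rank the panel cells of the gtable and join them to the facet layout by ROW and COL, rather than parsing names like "panel-1-1". The following is a minimal sketch, not a tested solution for every facet type. It assumes plain ggplot2 facet_grid() as a stand-in for ggh4x::facet_nested(), and it assumes that the panel grobs are exactly the layout entries whose name starts with "panel". The names facet_lay and panels are made up for illustration.

```r
library(ggplot2)

# Mock-up data with the same structure as in the Examples section
df <- expand.grid(St = 1:10, MAT = c("A", "B", "C"), TREAT = factor(1:2))
df$St <- rnorm(nrow(df))
df$OPERATOR <- rep(c("TM", "CX"), each = 5, length.out = nrow(df))

p <- ggplot(df, aes(x = TREAT, y = St, color = MAT)) +
  geom_boxplot() +
  facet_grid(. ~ MAT + OPERATOR)  # assumption: stand-in for facet_nested()

# Facet layout: one row per panel with PANEL, ROW, COL, MAT, OPERATOR
facet_lay <- ggplot_build(p)$layout$layout

# gtable layout: one row per grob with its cell position (t, l, b, r)
g <- ggplotGrob(p)
panels <- g$layout[grepl("^panel", g$layout$name), ]

# The r-th distinct t and the c-th distinct l belong to facet ROW r, COL c
facet_lay$t <- sort(unique(panels$t))[facet_lay$ROW]
facet_lay$l <- sort(unique(panels$l))[facet_lay$COL]

facet_lay[, c("MAT", "OPERATOR", "ROW", "COL", "t", "l")]
```

With such a table, the t and l values for a panel can be looked up from the facet values (for example MAT == "A" and OPERATOR == "TM") instead of being typed in by hand.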
Background
I'd like to automate the process of drawing lines between faceted plots; for more insight, see my post about Valid Comparisons of Multiple Grouping Variables. In the end, I want to use this to automatically annotate plots to show which comparisons are valid, and potentially also to add significance bars to the plot, as ggsignif does, but for faceted plots.
Examples
Create mockup data
So this is the data we are working on:
# Create a dummy dataframe
df <- expand.grid(
  St = 1:10,
  MAT = c("A", "B", "C"),
  TREAT = factor(1:2)
)
df$St <- rnorm(nrow(df))
df$OPERATOR <- rep(c("TM", "CX"), each = 5, length.out = nrow(df))
# numbers are randomly generated, so this is different each time
head(df)
|   | St | MAT | TREAT | OPERATOR |
|---|---|---|---|---|
| 1 | -0.488805635 | A | 1 | TM |
| 2 | 2.658658027 | A | 1 | TM |
| 3 | 1.680278205 | A | 1 | TM |
| 4 | 0.779584009 | A | 1 | TM |
| 5 | 0.713240520 | A | 1 | TM |
| 6 | -0.542881937 | A | 1 | CX |
Example Plot
The following code produces the example plot:
# ggplot with multiple facets (nested); needs ggplot2 and ggh4x
library(ggplot2)
p <- ggplot(data = df,
            aes(x = TREAT,
                y = St,
                color = MAT)) +
  geom_boxplot() +
  ggh4x::facet_nested(~ MAT + OPERATOR) +
  theme_classic()
p
What I like to have
I want to draw lines between the facets like this, with the ability to draw lines inside a facet (blue) or even a subfacet (green) and between different facets (black). The y-position is chosen arbitrarily here, but the ordering should be similar to what is shown. The plot was made with Inkscape.
As you can see, we have a simple facet_nested boxplot with lines between some data points to show between which data I want to draw horizontal lines.
What I tried
Using Google and some LLMs (ChatGPT and Bing), I was able to create some code to automatically select the facets and subfacets.
# Build the plot
gb <- ggplot_build(p)
# Get panel parameters
ranges <- gb$layout$panel_params
# Get npc position of a specific facet and subfacet
mat_name1 <- "A" # first facet
operator_name1 <- "TM" # first subfacet
# vs
mat_name2 <- "B" # first facet
operator_name2 <- "TM" #first subfacet
# x axis name
x_name <- "1"
# Find the index of the panel that corresponds to the specified facet and subfacet
# TO DO change MAT and OPERATOR with strings so it is adaptable to the grouping column names
panel_index1 <- which(gb$layout$layout$MAT == mat_name1 & gb$layout$layout$OPERATOR == operator_name1)
panel_index2 <- which(gb$layout$layout$MAT == mat_name2 & gb$layout$layout$OPERATOR == operator_name2)
This returns the correct panel numbers for the facet and subfacet combinations (in this case 1 and 3). But I was not able to extract the exact coordinates of A-TM-1 vs. B-TM-1 (especially the x position of the 1).
In addition, I don't know how to use that information to draw the lines on top of the plot.
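For reference, here is a rough sketch of how both steps might fit together, following the moveToGrob/lineToGrob idea from the linked question. Data coordinates are rescaled to npc units with the x.range and y.range entries of panel_params, and the two grobs are placed in the gtable cells of the start and end panels. Everything here is an assumption for illustration: locate() is a hypothetical helper, facet_grid() is used as a stand-in for ggh4x::facet_nested(), the y-position is simply the data maximum, and clip = "off" is set so that the line is not cut at the panel border.

```r
library(ggplot2)
library(grid)
library(gtable)

df <- expand.grid(St = 1:10, MAT = c("A", "B", "C"), TREAT = factor(1:2))
df$St <- rnorm(nrow(df))
df$OPERATOR <- rep(c("TM", "CX"), each = 5, length.out = nrow(df))

p <- ggplot(df, aes(x = TREAT, y = St, color = MAT)) +
  geom_boxplot() +
  facet_grid(. ~ MAT + OPERATOR) +  # assumption: stand-in for facet_nested()
  theme_classic()

gb <- ggplot_build(p)
g <- ggplot_gtable(gb)

# Hypothetical helper: gtable cell (t, l) and npc position of one point
locate <- function(mat, operator, x_level, y) {
  lay <- gb$layout$layout
  i <- which(lay$MAT == mat & lay$OPERATOR == operator)
  pp <- gb$layout$panel_params[[as.integer(lay$PANEL[i])]]
  panels <- g$layout[grepl("^panel", g$layout$name), ]
  x_num <- match(x_level, levels(df$TREAT))  # discrete level -> 1, 2, ...
  list(
    t = sort(unique(panels$t))[lay$ROW[i]],
    l = sort(unique(panels$l))[lay$COL[i]],
    x = (x_num - pp$x.range[1]) / diff(pp$x.range),
    y = (y - pp$y.range[1]) / diff(pp$y.range)
  )
}

a <- locate("A", "TM", "1", max(df$St))
b <- locate("B", "TM", "1", max(df$St))

# moveTo sets the pen position in panel a, lineTo draws to the point in panel b
g2 <- gtable_add_grob(g, moveToGrob(a$x, a$y), t = a$t, l = a$l,
                      z = Inf, clip = "off", name = "line-start")
g2 <- gtable_add_grob(g2, lineToGrob(b$x, b$y), t = b$t, l = b$l,
                      z = Inf, clip = "off", name = "line-end")

grid.newpage()
grid.draw(g2)
```

The same helper could be called for any pair of facet, subfacet and x-axis level, which is what a wrapper function would loop over.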
Outro
I hope someone can grasp what I want to achieve and help me understand the logic of the ggplot_build output, so that I can extract the positions and use them to draw lines on top of the plot.
Ideally, I could write a function that takes a set of facet and subfacet combinations, including the x-axis location (in this case 1 or 2, or any other x-axis label), and draws those lines between the boxes, e.g. list(c('A', 'TM', '1'), c('B', 'TM', '1')).
But a more general understanding of how to interpret the output of ggplot_build and how these plots are constructed would be wonderful.
The most promising approach seems to be the one from "ggplot, drawing line between points across facets", but there I struggle with automatically selecting the right l parameter.
It would be a great help if someone could point me in the right direction, since this kind of plot is day-to-day business for me, and it would save a lot of time and errors if I could automate the process the way ggsignif does.
If you need any additional information, please don't hesitate to ask.
Best
TMC