Faster way to ggplot/ggarrange?
Knitting my R Markdown document to PDF takes so long that it drives me crazy. Is there any way to speed up plotting and arranging plots? From what I can see, ggarrange already seems faster than grid.arrange.

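library(ggplot2)
library(gridExtra)       # grid.arrange()
library(ggpubr)          # ggarrange() (assumed to be the ggpubr version)
library(microbenchmark)

data <- 1:5
df <- data.frame(data)
myplot <- ggplot(df) + geom_line(aes(x = data, y = data))

microbenchmark(grid.arrange(myplot, myplot), ggarrange(myplot, myplot), times = 3)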
Unit: milliseconds
                         expr      min       lq     mean   median       uq      max neval
 grid.arrange(myplot, myplot) 107.0948 117.6475 153.5636 128.2002 176.7980 225.3959     3
    ggarrange(myplot, myplot)  49.5275  49.5631 120.6860  49.5987 156.2653 262.9318     3
Gymnosperm answered 9/8, 2022 at 23:58 Comment(5)
I highly recommend the {targets} package (github.com/ropensci/targets) if you need to report the results of long-running tasks. An R Markdown document will most likely be re-knitted several times while you work on it, so including heavy jobs in it is not optimal. If you don't have time to learn how targets works, you could create scripts outside of your Rmd for the heavy tasks, run them, and just import their results into the document. For plots, you could ggsave() them and then include them in the Rmd with knitr::include_graphics(). - Timberhead
You could also look into the {patchwork} package (github.com/thomasp85/patchwork) and see if it's faster. - Timberhead
Saving the plot with ggsave() and then importing the file into the text body can help. Additionally, consider svglite, depending on whether you're using raster or vector images. - Stalingrad
"The time needed to knit my markdown document with pdf output makes me crazy" I would expect that most of the time is spent actually building and printing the plots, i.e., in ggplot2, and that the overhead from combining them is negligible. A common issue is large data, which means either plotting large numbers of points and lines (redesign your plots; there is a lot of overplotting anyway) or calculating the statistics for plotting (this is inefficient within ggplot2; do it before plotting, e.g., with the data.table package). - Ringside
Depending on your use case, the chunk option cache=TRUE might help as well. - Scapolite
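
The suggestion made in several of these comments (draw the figure once, save it, and let the document include only the saved file) could look roughly like the sketch below. The path figures/myplot.pdf, the figure dimensions, and the file.exists() guard are illustrative assumptions and do not come from the question.

```r
library(ggplot2)

df <- data.frame(data = 1:5)
myplot <- ggplot(df) + geom_line(aes(x = data, y = data))

# Hypothetical output path; adjust it to your project layout.
fig_path <- file.path("figures", "myplot.pdf")
dir.create(dirname(fig_path), showWarnings = FALSE, recursive = TRUE)

# Render the plot only if the file is missing, so re-knitting skips the work.
if (!file.exists(fig_path)) {
  ggsave(fig_path, plot = myplot, width = 6, height = 4)
}

# In the Rmd chunk, include the saved file instead of re-drawing the plot:
# knitr::include_graphics(fig_path)
```

One caveat of this guard: the saved file is not refreshed when the data or the plot code changes, so it has to be deleted (or the check removed) to force a re-render.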
I suspect the reason patchwork is so fast in your answer @gaut is that the plots aren't being printed, they are just being saved to a variable (called "patchwork"); if you print the plots the benchmark looks quite different:

library(tidyverse)
library(gridExtra)
#> 
#> Attaching package: 'gridExtra'
#> The following object is masked from 'package:dplyr':
#> 
#>     combine
library(ggpubr)
library(microbenchmark)
library(patchwork)
library(cowplot)
#> 
#> Attaching package: 'cowplot'
#> The following object is masked from 'package:patchwork':
#> 
#>     align_plots
#> The following object is masked from 'package:ggpubr':
#> 
#>     get_legend

data=1:5
df <- data.frame(data)


myplot <- ggplot(df) +
  geom_line(aes(x=data, y=data))

res <- microbenchmark(grid.arrange(myplot, myplot),
                      ggarrange(myplot, myplot, ncol = 1),
                      print(myplot / myplot),
                      plot_grid(myplot, myplot, ncol = 1),
                      times = 10)
autoplot(res)
#> Coordinate system already present. Adding new coordinate system, which will replace the existing one.

Created on 2022-08-10 by the reprex package (v2.0.1)
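
To look at the difference between assigning and printing in isolation, a minimal sketch such as the following could be used. It assumes ggplot2 and patchwork are installed; the loop count of 5 and the use of a null PDF device are arbitrary choices made for illustration, and the timings will vary by machine.

```r
library(ggplot2)
library(patchwork)

df <- data.frame(data = 1:5)
myplot <- ggplot(df) + geom_line(aes(x = data, y = data))

# Open a null PDF device so that no file is written while timing.
pdf(NULL)

# Only builds the patchwork object; nothing is rendered yet.
t_assign <- system.time(for (i in 1:5) combined <- myplot / myplot)[["elapsed"]]

# print() forces the layout and drawing work, which is what knitting pays for.
t_print <- system.time(for (i in 1:5) print(myplot / myplot))[["elapsed"]]

invisible(dev.off())

# Elapsed seconds for each variant.
c(assign = t_assign, print = t_print)
```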

Ulland answered 10/8, 2022 at 0:35 Comment(5)
ggarrange is still the fastest, darn. - Gymnosperm
I think you'll need to benchmark the entire Rmd render to get a true sense of whether ggarrange() or cowplot::plot_grid() is actually faster. Also, base R may be faster depending on how complex your plots are. - Ulland
Which of the loaded packages defines a / method for ggplot objects? - Ringside
Sorry Roland; it's very unclear the way I've written it - that's patchwork syntax (see the patchwork docs). - Ulland
Interesting, thanks. - Ringside
