Dismantling a ggplot with grid and gtable

I'm struggling to build a dual-axis plot based on ggplot objects. At baptiste's suggestion, I have broken down the problem into smaller parts. The present issue is:

  1. How can I remove all of the data from the grobs while keeping the axes, the axis labels, the axis tick marks, and the grid lines? By 'data' I mean the grobs drawn by geom_line() and geom_point().

The reason for wanting to do this is that in the process of building a dual-axis plot, I ran into the following problem: the grid lines from one grob would overwrite the data from the other grob. By removing the data lines and points, there would be no overwriting.

Let me be clear: I do have workarounds, which consist of adding a linetype to the aes() and setting scale_linetype_manual(values = c("solid", "blank")) or, alternatively, sending the data 'off the grid'. However, I would like to post-process a plot object that has not been 'groomed' too much for the purpose at hand.

Some code and figures follow.

# Data
df <- structure(list(Year = c(1950, 2013, 1950, 2013), Country = structure(c(1L, 
1L, 2L, 2L), .Label = c("France", "United States"), class = "factor"), 
Category = c("Hourly minimum wage", "Hourly minimum wage", 
"Hourly minimum wage", "Hourly minimum wage"), value = c(2.14, 
9.43, 3.84, 7.25), variable = c("France (2013 euros)", 
"France (2013 euros)", "United States (2013 dollars)", "United States (2013 dollars)"
), Unit = c("2013 euros", "2013 euros", "2013 dollars", "2013 dollars"
)), .Names = c("Year", "Country", "Category", "value", "variable", 
"Unit"), row.names = c(NA, 4L), class = "data.frame")

# Plot data with ggplot
library(ggplot2)
p <- ggplot(data = df, aes(x = Year, y = value, group = variable, colour = variable, shape = variable)) + 
geom_line(size = 2) + 
geom_point(size = 4) +
theme(panel.grid.major = element_line(size = 1, colour = "darkgreen"), 
      panel.grid.minor = element_line(size = 1, colour = "darkgreen", linetype = "dotted"))

# Manipulate grobs with gtable
library(grid)    # provides grid.newpage() and grid.draw()
library(gtable)
g <- ggplot_gtable(ggplot_build(p))
## Here remove the geom_line() and geom_point()
## g <- stripdata(g)  # pseudo-code!
grid.newpage()
grid.draw(g)

In the plot below, I would like the lines gone!

[Image: the plot produced by the code above]

Edit: Following baptiste's suggestion, I attempt to remove the data layers. However, as BondedDust points out in the comments section, this breaks the ggplot object:

# Remove the two layers of data
p$layers[[1]] <- NULL
p$layers[[1]] <- NULL
g <- ggplot_gtable(ggplot_build(p))
## Error: No layers in plot

Removing all of the layers from the ggplot object destroys it. One workaround I have used in applications is to 'send the data off the grid', e.g. multiplying each cell by -999999 and cutting off the display with + scale_y_continuous(limits = c(1, 10)), but I'd like to avoid this ugly hack, if feasible. I was hoping that the associated gtable would not be destroyed if I replaced every data point with NA or NULL, which is why I was looking for a way to remove the data from inside the g object rather than the p object.

After manipulating the grobs (rather than hacking the ggplot object directly), the result of grid.draw(g) would be:

[Image: the desired result, with axes, labels, and grid lines but no data]

FYI, the second plot was obtained with the following workaround.

p <- ggplot(data = within(df, value <- -999999), aes(x = Year, y = value, group = variable, colour = variable, shape = variable)) + 
geom_line() + 
geom_point() +
theme(panel.grid.major = element_line(size = 1, colour = "darkgreen"), 
      panel.grid.minor = element_line(size = 1, colour = "darkgreen", linetype = "dotted")) +
scale_y_continuous(limits = c(1, 10))
Spectacled answered 3/1, 2015 at 1:12 Comment(10)
I really don't understand why you'd need gridExtra here, though admittedly I didn't read the whole thing. - Sharpsighted
Suggestion: instead of presenting a very complex, multifaceted problem, why don't you try to isolate specific questions, each with a minimal example and as a separate question? - Sharpsighted
p2$layers[[1]] <- NULL will remove the thin green line; I don't know if that helps. - Sharpsighted
Alternatively, names(ggplotGrob(p2)[["grobs"]][[4]][["children"]]) could help identify the children you want to remove, since you can't have a ggplot with no layers. - Sharpsighted
@baptiste, thanks for the suggestions, I'll follow up ASAP! - Spectacled
@PatrickT: But, but, but... the whole idea behind grid graphics is to include the data necessary to draw the graphic along with the graphical annotations. If you remove the data, the graphic will fall apart. Unless, of course, all you want is a scaffold. - Picaroon
@baptiste, thanks for trying to help out with what was a messy and poorly structured question! I have selected one of my problems and focused the question on that. I will probably follow up with the other questions I had. As I write in my edit, removing the data layers in the ggplot 'breaks' it (as you correctly say, the line can be removed, but if I also remove the points I run into trouble). For this reason I was trying to work directly on the gtable, assuming that it would not break after alterations. - Spectacled
@BondedDust, you're right! See my edit. And yes, all I want is a 'scaffold', if feasible. - Spectacled
@baptiste, about my use of gridExtra: my (now edited out) function returns arrangeGrob(g) because that way I was able to assign the returned object to a name, p <- ggplot_axis_dual(p1,p2,p3), and print with ggsave, whereas if my function were made to return just grid.draw(g) I wasn't able to assign to a name. I think. For some reason. ;-) - Spectacled
@Spectacled it seems a bit overkill to use arrangeGrob just for that purpose (admittedly, my younger foolish self may have suggested it in the past). What you probably want is to define your own class, and define a print method for it (see gridExtra::print.arrange) so that ggsave accepts it. - Sharpsighted
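The "own class plus print method" idea from the last comment might look roughly like the sketch below. The class name dual_axis_plot and the constructor new_dual_axis_plot() are invented for illustration, and mtcars stands in for the real data; this is not how gridExtra implements it.

```r
library(ggplot2)
library(grid)

# "dual_axis_plot" is a class name made up for this sketch.
new_dual_axis_plot <- function(g) {
  stopifnot(inherits(g, "gtable"))
  # Prepend the new class so the gtable behaviour is still inherited.
  class(g) <- c("dual_axis_plot", class(g))
  g
}

# Printing the object draws it, instead of showing the gtable summary.
print.dual_axis_plot <- function(x, newpage = TRUE, ...) {
  if (newpage) grid.newpage()
  grid.draw(x)
  invisible(x)
}

obj <- new_dual_axis_plot(ggplotGrob(ggplot(mtcars, aes(wt, mpg)) + geom_point()))
print(obj)
```

The function can then return an object that is assigned to a name and drawn later. Whether ggsave() accepts such an object depends on the ggplot2 version and is not checked here.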

A more natural strategy would be to use invisible geom_blank layers, so that ggplot2 still trains the scales etc. to build the plot but shows no data. Since you want to process already-formatted plots, however, you probably have to manually remove those grobs from the plot gTree. Here's an attempt:

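library(grid)    # provides grid.newpage() and grid.draw()
library(gtable)
g <- ggplotGrob(p)

# Drop every child of the panel gTree except the border and the grid lines ("grill").
# Note: index 4 is where the panel sits in this particular plot; it may differ elsewhere.
stripdata <- function(g){
  keep <- grepl("border|grill",
                names(g[["grobs"]][[4]][["children"]]))
  g[["grobs"]][[4]][["children"]][!keep] <- NULL
  g
}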

grid.newpage()
grid.draw(stripdata(g))
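
For comparison, the geom_blank() strategy mentioned at the top might look like the following minimal sketch. It assumes you are free to rebuild the plot; df_demo is a reduced copy of the question's data whose name is made up here, and the theme settings are simplified.

```r
library(ggplot2)
library(grid)

# Reduced copy of the question's data; df_demo is a name made up for this sketch.
df_demo <- data.frame(
  Year     = c(1950, 2013, 1950, 2013),
  value    = c(2.14, 9.43, 3.84, 7.25),
  variable = rep(c("France (2013 euros)", "United States (2013 dollars)"), each = 2)
)

# geom_blank() draws nothing, but its data still trains the x and y scales,
# so the axes, labels and grid lines are built as for the real plot.
p_blank <- ggplot(df_demo, aes(x = Year, y = value)) +
  geom_blank() +
  theme(panel.grid.major = element_line(colour = "darkgreen"),
        panel.grid.minor = element_line(colour = "darkgreen", linetype = "dotted"))

g_blank <- ggplotGrob(p_blank)
grid.newpage()
grid.draw(g_blank)
```

Because no colour or shape aesthetic is mapped, this scaffold has no legend; the plot with the real data would have to supply it.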
Sharpsighted answered 3/1, 2015 at 12:53 Comment(8)
Thanks! Follow-up question: is the gTree[GRID.gTree.72] always in the 4th position? Is there a way to call it by name? This is in case it appeared in a different position in another plot object. Can I do something along the lines of g[["grobs"]]$gTree or g[["grobs"]]$GRID (to call it by an abbreviated name)? - Spectacled
You should probably find its position in g$grobs from the layout: which(g[["layout"]][,"name"] == "panel"). - Sharpsighted
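A sketch of that layout lookup, folded into a variant of stripdata(), is given below. The function name strip_panel_data() and its keep_pattern argument are hypothetical, the child names "border" and "grill" are taken from the answer and may differ between ggplot2 versions, and mtcars stands in for the real data. Unwanted children are replaced with nullGrob() rather than deleted, on the assumption that this keeps the gTree's record of its children consistent.

```r
library(ggplot2)
library(grid)

# Hypothetical variant of stripdata() that locates the panel(s) by name.
strip_panel_data <- function(g, keep_pattern = "border|grill") {
  # Row i of g$layout describes g$grobs[[i]]; panel entries have names starting with "panel".
  panel_idx <- grep("^panel", g$layout$name)
  for (i in panel_idx) {
    child_names <- names(g$grobs[[i]]$children)
    for (j in which(!grepl(keep_pattern, child_names))) {
      # Swap in an empty grob instead of removing the list entry.
      g$grobs[[i]]$children[[j]] <- nullGrob()
    }
  }
  g
}

p_demo <- ggplot(mtcars, aes(wt, mpg)) + geom_line() + geom_point()
g_demo <- strip_panel_data(ggplotGrob(p_demo))
grid.newpage()
grid.draw(g_demo)
```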
Great: helps me understand how the grobs are designed! - Spectacled
FWIW I briefly described (what I know of) gtables in this wiki page: github.com/baptiste/gtable/wiki/Description - Sharpsighted
Great! That's a fantastic tutorial. Will read it carefully. Question: are ggplotGrob(p) and ggplot_gtable(ggplot_build(p)) the same thing? They look the same. But the former function depends on ggplot2 only while the latter requires also gtable? - Spectacled
Hadley included ggplotGrob in ggplot2 at my request, but it's doing the same thing internally (within its namespace so it doesn't need to attach gtable). - Sharpsighted
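A quick way to check this for a given plot is sketched below. It only needs ggplot2 attached, and mtcars is a stand-in for the real data.

```r
library(ggplot2)

p_demo <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

g1 <- ggplotGrob(p_demo)                    # one-step shortcut
g2 <- ggplot_gtable(ggplot_build(p_demo))   # explicit build, then convert

# Both results are gtables with the same layout entries.
# Individual grob names carry auto-generated numeric suffixes, so those can differ.
class(g1)
identical(g1$layout$name, g2$layout$name)
```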
I see now what you mean by geom_blank(). I wasn't aware of the existence of that empty layer function. At any rate, it was very useful for me to learn how to do it via the grobs. Thanks. - Spectacled
@Sharpsighted your wiki seems to be a 404 these days - does it live anywhere else? - Dashpot
