How to use a variable to specify column name in ggplot

I have a ggplot command

ggplot( rates.by.groups, aes(x=name, y=rate, colour=majr, group=majr) )

inside a function. But I would like to be able to use a parameter of the function to pick out the column to use as colour and group. I.e. I would like something like this

f <- function( column ) {
    ...
    ggplot( rates.by.groups, aes(x=name, y=rate, colour= ??? , group=??? ) )
}

So that the column used in the ggplot is determined by the parameter. E.g. for f("majr") we get the effect of

ggplot( rates.by.groups, aes(x=name, y=rate, colour=majr, group=majr) )

but for f("gender") we get the effect of

  ggplot( rates.by.groups, aes(x=name, y=rate, colour=gender, group=gender) )

Some things I tried:

ggplot( rates.by.groups, aes(x=name, y=rate, colour= columnName , group=columnName ) )

did not work. Nor did

e <- environment() 
ggplot( rates.by.groups, aes(x=name, y=rate, colour= columnName , group=columnName ), environment=e )
Walloping answered 10/3, 2014 at 19:17 Comment(0)
243

Note: the solution in this answer is "soft-deprecated". See the answer below using .data[[ for the currently preferred method.

You can use aes_string:

f <- function( column ) {
    ...
    ggplot( rates.by.groups, aes_string(x="name", y="rate", colour= column,
                                        group=column ) )
}

as long as you pass the column to the function as a string (f("majr") rather than f(majr) ). Also note that we changed the other columns, "name" and "rate", to be strings.

If for whatever reason you'd rather not use aes_string, you could change it to (the somewhat more cumbersome):

    ggplot( rates.by.groups, aes(x=name, y=rate, colour= get(column),
                                        group=get(column) ) )
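
For a self-contained check of the get() variant, here is a minimal sketch. The data frame is made up for illustration (the question does not show rates.by.groups), and geom_point() and labs() are additions; without labs() the legend title would read get(column).

```r
library(ggplot2)

# Made-up stand-in for the asker's data (an assumption, not the real data)
rates.by.groups <- data.frame(
  name   = c("A", "B", "C"),
  rate   = c(1, 2, 3),
  majr   = c("D", "D", "E"),
  gender = c("M", "F", "F")
)

f <- function(column) {
  # get(column) looks up the column named by the string when the plot is built
  ggplot(rates.by.groups,
         aes(x = name, y = rate, colour = get(column), group = get(column))) +
    geom_point() +
    labs(colour = column)  # replace the default "get(column)" legend title
}

p <- f("majr")
```

Printing p (or calling f("gender")) draws the plot coloured by the chosen column.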
Vandalism answered 10/3, 2014 at 19:20 Comment(5)
It's worth saying that you shouldn't/can't do aes_string(x = rates.by.groups$name..., and anyway you don't need to since you already passed the ggplot(data = rates.by.groups... argument. (The issue in this question) - Mccall
Just adding a note to point people down to Moody_Mudskipper's answer with updates for ggplot2 version 3.0.0 - Ye
@buncis That's not true, quoting "column_name" or "column" wouldn't work - Vandalism
@DavidRobinson sorry, my mistake: I didn't see that the code is wrapped in a function with a parameter. I'll delete my comment. - Complete
"cumbersome"? Non-standard evaluation in R is ironically the most cumbersome "feature" I have ever encountered in a programming language. Truly maddening. - Drag
109

From the release notes of ggplot2 v3.0.0:

aes() now supports quasiquotation so that you can use !!, !!!, and :=. This replaces aes_() and aes_string() which are now soft-deprecated (but will remain around for a long time).

The idiomatic way now is to convert the string held in the variable to a symbol with sym() (almost the same as the base aliases as.name() / as.symbol()), and then unquote it with !!.

Simulating the OP's data, we can do:

library(tidyverse)
rates.by.groups <- data.frame(
  name = LETTERS[1:3],
  rate = 1:3,
  mjr = LETTERS[c(4,4,5)],
  gender = c("M","F","F")
)

f <- function(column) {
  column <- sym(column)
  ggplot(rates.by.groups, 
         aes(x = name, 
             y = rate, 
             fill  = !!column, 
             group = !!column)) +
    geom_col()
}

f("gender")
f("mjr")
x <- "gender"
f(x)

If we'd rather feed raw names to the function we can do:

f2 <- function(column) {
  column <- ensym(column)
  ggplot(rates.by.groups, 
         aes(x = name, 
             y = rate, 
             fill  = !!column, 
             group = !!column)) +
    geom_col()
}

This works with bare names (a.k.a. symbols) AND with string literals:

f2(gender)
f2(mjr)
f2("gender")
f2("mjr")

As Lionel says about ensym():

it's meant to mimic the syntax of arguments where you can supply both in the LHS, e.g. list(bare = 1, "quoted" = 2)
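
One consequence worth checking: because ensym() captures the argument as typed, passing a variable that holds a column name behaves differently in f and f2. A minimal sketch with two hypothetical helpers (with_sym and with_ensym are illustration-only names, not part of the answer) that return just the captured symbol:

```r
library(rlang)

# Hypothetical helpers that only return the symbol each approach produces
with_sym   <- function(column) sym(column)    # uses the value of `column`
with_ensym <- function(column) ensym(column)  # uses what the caller typed

x <- "gender"

s1 <- with_sym(x)           # symbol built from the string stored in x
s2 <- with_ensym(x)         # symbol built from the argument as written
s3 <- with_ensym("gender")  # a string literal is accepted and converted
```

Here s1 and s3 are the symbol gender, while s2 is the symbol x, so f2(x) would look for a column literally named x rather than the gender column.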


A note on enquo()

enquo() quotes the expression (not necessarily a symbol) passed to the argument. Unlike ensym(), it doesn't convert a string literal to a symbol, so it is less well suited here, but we can do:

f3 <- function(column) {
  column <- enquo(column)
  ggplot(rates.by.groups, 
         aes(x = name, 
             y = rate, 
             fill  = !!column, 
             group = !!column)) +
    geom_col()
}

f3(gender)
f3(mjr)
Dowitcher answered 6/11, 2018 at 8:57 Comment(5)
This tidyeval stuff is so annoying. The documentation for aes() itself talks about enquo() but it doesn't work. And whoever heard of ensym() before? BIG SIGH - Portland
@Moody_Mudskipper For f2, all four examples work, and so does capturing the column name in a variable (i.e. aname <- "mjr"; f2(aname)). If I add code to manipulate the data frame using dplyr it attempts to find a column using the variable name and not the string in the variable name. In other words, how do I get rates.by.groups %>% group_by(!!column)... to work and still support the three ways of calling f2? - Cop
"so does capturing the column name in a variable" : it doesn't fail but it doesn't return the same result, ensym is designed to deal with arguments provided as names, and tolerate quotes around them. I believe you would like to treat the argument as a name, and to fall back on the value if the name is not found. This is actually what happens with select, but not with group_by ... It's possible to hack around it but not obvious. If it's important to you I think it would deserve its own question. - Dowitcher
@Moody_Mudskipper Thanks. I was using both select and group_by so that was likely the issue. I can create a new question, but I need to come up with a simple example and check to see if it has been answered. I can post it if not. - Cop
How to use !! in case of facet_grid? It works with facet_grid(cols = vars(!!column)) but throws an error with facet_grid(~ !!column) - Pepi
70

Another option (ggplot2 > 3.0.0) is to use the tidy evaluation pronoun .data to slice the chosen variable/column from the rates.by.groups data frame.

See also this answer

library(ggplot2)
theme_set(theme_classic(base_size = 14))

# created by @Moody_Mudskipper
rates.by.groups <- data.frame(
  name = LETTERS[1:3],
  rate = 1:3,
  mjr = LETTERS[c(4, 4, 5)],
  gender = c("M", "F", "F")
)

f1 <- function(df, column) {
  gg <- ggplot(df, 
         aes(x = name, 
             y = rate, 
             fill  = .data[[column]], 
             group = .data[[column]])) +
    geom_col() +
    labs(fill = column)
  return(gg)
}

plot_list <- lapply(list("gender", "mjr"), function(x){ f1(rates.by.groups, x) })
plot_list
#> [[1]]

#> 
#> [[2]]

# combine all plots
library(egg)
ggarrange(plots = plot_list,
          nrow = 2,
          labels = c('A)', 'B)'))

Created on 2019-04-04 by the reprex package (v0.2.1.9000)

Fairway answered 4/4, 2019 at 20:7 Comment(3)
The nicest thing about the .data[[ ]] approach is its generality. Thanks. - Teliospore
I believe this is the canonical solution since rlang 0.4.* was introduced. This is also how it is proposed in the official vignette to ggplot2: ggplot2.tidyverse.org/articles/ggplot2-in-packages.html - Antilogism
this actually works the best! - Engagement
18

Try using aes_string instead of aes.

Mastiff answered 10/3, 2014 at 19:20 Comment(2)
This is great advice but can you tell them why? aes_string makes you use "" for literal column names, while variables holding a column name are left unquoted: aes_string(x = "foo", y = "fee", group = variable) - Tableland
@Tableland maybe because the variable has a string as its value - Complete
18

Do two things

  1. Turn the column name into a symbol with sym()
  2. Prepend !! to the symbol when you want to use it

Example

my_col <- sym("Petal.Length")

iris %>% 
  ggplot(aes(x = Sepal.Length, y = !!my_col)) +
  geom_point()
Glomeration answered 4/8, 2020 at 4:34 Comment(0)
2

Using aes_string fixes this problem, but it runs into an issue when adding error bars with geom_errorbar. Below is a simple solution.

#Identify your variables using the names of the columns inside your dataset
 xaxis   <- "Independent"   
 yaxis   <- "Dependent"
 sd      <- "error"

#Specify error bar range (in 'a-b' not 'a'-'b')
 range   <- c(yaxis, sd)                                #using c(X, y) allows use of quotation marks inside formula
 yerrbar <- aes_string(ymin=paste(range, collapse='-'), 
                       ymax=paste(range, collapse='+'))


#Build the plot
  ggplot(data=Dataset, aes_string(x=xaxis, y=yaxis)) +
    geom_errorbar(mapping=yerrbar, width=15, colour="#73777a", size = 0.5) +
    geom_point   (shape=21)

As a bonus, you can also add facets to your plot by adding this line to the ggplot call:

facet_grid(formula(paste(Variable1, "~", Variable2)))

This script was modified from this original post: ggplot2 - Error bars using a custom function
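
Since aes_string is soft-deprecated (see the note on the accepted answer), the same idea can probably be expressed with the .data pronoun instead of pasted strings. The following is only a sketch: the Dataset contents, the site column, and the names sd_col and facet_col are made up for illustration and do not come from the original post.

```r
library(ggplot2)

# Made-up data standing in for `Dataset` (an assumption for illustration)
Dataset <- data.frame(
  Independent = c(1, 2, 3, 1, 2, 3),
  Dependent   = c(10, 12, 15, 9, 14, 13),
  error       = c(1, 1.5, 2, 1, 2, 1.5),
  site        = rep(c("a", "b"), each = 3)
)

# Column names held as strings, as in the answer above
xaxis     <- "Independent"
yaxis     <- "Dependent"
sd_col    <- "error"   # hypothetical name; avoids masking stats::sd
facet_col <- "site"    # hypothetical faceting column

p <- ggplot(Dataset, aes(x = .data[[xaxis]], y = .data[[yaxis]])) +
  # Arithmetic on .data[[...]] replaces paste(range, collapse = "-") / "+"
  geom_errorbar(aes(ymin = .data[[yaxis]] - .data[[sd_col]],
                    ymax = .data[[yaxis]] + .data[[sd_col]]),
                width = 0.2, colour = "#73777a") +
  geom_point(shape = 21) +
  # vars() accepts the .data pronoun, so no formula has to be pasted together
  facet_wrap(vars(.data[[facet_col]])) +
  labs(x = xaxis, y = yaxis)
```

Printing p draws one panel per value of the faceting column, with error bars spanning the y value plus or minus the error column.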

Hyperbolize answered 9/3, 2020 at 20:14 Comment(0)
