Using parse_expr(), quo_name(), and enquo() to define a character object for plotting country-wise graphs in ggplot
C

3

5

I have a function from an external source that takes a few inputs, including a country name, and returns a graph for that country. The first line of the function defines a Country_name object in a way I cannot understand. When I pull that line out of the function and run it separately, it returns an error, although it works fine inside the function. Does anyone know why this happens, and what the purpose of that line is?

function(df, dfline, Country_name){
  Country_name <- rlang::parse_expr(quo_name(enquo(Country_name)))
  df %>%
    filter(Country == Country_name ...
}

Pulling out the first line and running it separately returns an error:

parse_expr(quo_name(enquo('United States')))

### Error in `enquo()`:
### ! `arg` must be a symbol
Camm answered 29/5, 2023 at 2:51 Comment(2)
More information about that function would help here. From what we have now, I can only tell that your usage of parse_expr(quo_name(enquo('string'))) is wrong. The correct way of using enquo() should be like foo(df, Country_name = `United States`), and there should be a United States column in that df object. Medora
The purpose of the line is to let you pass the country name as a symbol rather than a character string: e.g., you could pass Canada instead of "Canada". However, this is in my opinion not great practice, as Country is a character column, so it would make more sense for the function to accept that arg as a character than as a symbol. Also, `parse_expr()` isn't necessary. Hitandmiss
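
As a rough illustration of why the line works inside the function but not on its own: enquo() expects the name of a function argument, so it has nothing to capture when handed a string literal at the top level. The sketch below assumes rlang is installed; name_of() is a hypothetical helper for illustration, not part of the original function.

```r
library(rlang)

# Hypothetical helper: inside a function, enquo() defuses the argument
# and quo_name() turns the captured expression into a string.
name_of <- function(Country_name) {
  quo_name(enquo(Country_name))
}

inside <- name_of(Canada)   # Canada is captured, never evaluated
print(inside)

# At the top level there is no function argument to capture, so the same
# call fails; tryCatch() keeps the error as an object instead of stopping.
outside <- tryCatch(
  enquo("United States"),
  error = function(e) e
)
print(inherits(outside, "error"))
```
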
V
2

First let's build a minimal reprex

library(dplyr, warn.conflicts = FALSE)
fun <- function(df, Country_name){
  Country_name <- rlang::parse_expr(quo_name(enquo(Country_name)))
  df %>%
    filter(Country == Country_name)
}
df <- data.frame(x = 1:2, Country = c("Belgium", "Ukraine"))
df
#>   x Country
#> 1 1 Belgium
#> 2 2 Ukraine

fun(df, Ukraine)
#>   x Country
#> 1 2 Ukraine

Then let's use boomer to print intermediate outputs :

fun1 <- boomer::rig(fun)
fun1(df, Ukraine)
#> 👇 fun
#> 💣 rlang::parse_expr(quo_name(enquo(Country_name))) 
#> · 💣 quo_name(enquo(Country_name)) 
#> · · 💣 💥 enquo(Country_name) 
#> · · <quosure>
#> · · expr: ^Ukraine
#> · · env:  global
#> · · 
#> · 💥 quo_name(enquo(Country_name)) 
#> · [1] "Ukraine"
#> · 
#> 💥 rlang::parse_expr(quo_name(enquo(Country_name))) 
#> Ukraine
#> 
#> 💣 df %>% filter(Country == Country_name) 
#> · 💣 filter(., Country == Country_name) 
#> · · df :
#> · ·   x Country
#> · · 1 1 Belgium
#> · · 2 2 Ukraine
#> · · Country_name :
#> · · Ukraine
#> · · 💣 💥 Country == Country_name 
#> · · [1] FALSE  TRUE
#> · · 
#> · 💥 filter(., Country == Country_name) 
#> ·   x Country
#> · 1 2 Ukraine
#> · 
#> 💥 df %>% filter(Country == Country_name) 
#>   x Country
#> 1 2 Ukraine
#> 
#> 👆 fun
#>   x Country
#> 1 2 Ukraine

We see that:

  • enquo() captures the input into a quosure
  • quo_name() extracts the expression as a string
  • parse_expr() builds a symbol from the string
  • This symbol is used in the equality (there it's coerced to character; try quote(a) == "a" to check how this works).
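
The coercion mentioned in the last bullet can be checked in base R without any packages. This is only a small sketch: sym stands in for the symbol that parse_expr() produces, and the character vector mirrors the Country column from the reprex.

```r
# A symbol like the one parse_expr("Ukraine") returns in the function above
sym <- quote(Ukraine)
print(is.symbol(sym))

# Comparing a character vector with a symbol: the symbol is converted to
# its name before the comparison, so this behaves like == "Ukraine".
cmp <- c("Belgium", "Ukraine") == sym
print(cmp)
```
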

If we want to understand the objects better we might use {constructive} in the print argument. Instead of printing the objects it will print the code to reconstruct them.

# remotes::install_github("cynkra/constructive")
fun2 <- boomer::rig(fun, print = constructive::construct)
fun2(df, Ukraine)
#> 👇 fun
#> 💣 rlang::parse_expr(quo_name(enquo(Country_name))) 
#> · 💣 quo_name(enquo(Country_name)) 
#> · · 💣 💥 enquo(Country_name) 
#> · · rlang::new_quosure(quote(Ukraine), .GlobalEnv)
#> · · 
#> · 💥 quo_name(enquo(Country_name)) 
#> · "Ukraine"
#> · 
#> 💥 rlang::parse_expr(quo_name(enquo(Country_name))) 
#> quote(Ukraine)
#> 
#> 💣 df %>% filter(Country == Country_name) 
#> · 💣 filter(., Country == Country_name) 
#> · · df :
#> · · data.frame(x = 1:2, Country = c("Belgium", "Ukraine"))
#> · · Country_name :
#> · · quote(Ukraine)
#> · · 💣 💥 Country == Country_name 
#> · · c(FALSE, TRUE)
#> · · 
#> · 💥 filter(., Country == Country_name) 
#> · data.frame(x = 2L, Country = "Ukraine")
#> · 
#> 💥 df %>% filter(Country == Country_name) 
#> data.frame(x = 2L, Country = "Ukraine")
#> 
#> 👆 fun
#>   x Country
#> 1 2 Ukraine

boomer::boom(fun(df, Ukraine), print = function(x) print(constructive::construct(x)))
#> 💣 💥 fun(df, Ukraine) 
#> data.frame(x = 2L, Country = "Ukraine")
#>   x Country
#> 1 2 Ukraine

Created on 2023-06-02 with reprex v2.0.2

The bottom line is that the code is bloated, and also weird and unsafe. You shouldn't pass strings as bare variable names just to spare the double quotes: how would you provide "United Kingdom"?
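
To make the "United Kingdom" problem concrete, here is a small base R sketch. It only parses the calls as text (nothing is executed), so fun and df do not need to exist; the variable names are illustrative.

```r
# A bare name containing a space is not valid R syntax, so the call
# cannot even be parsed.
bad_call <- tryCatch(
  parse(text = "fun(df, United Kingdom)"),
  error = function(e) "parse error"
)
print(bad_call)

# Backticks make it parse, but that is no shorter than quoting the string.
ok_call <- parse(text = "fun(df, `United Kingdom`)")[[1]]
country <- as.character(ok_call[[3]])   # third element is the second argument
print(country)
```
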

The right way to do it is simply to provide Country_name as a string and write:

fun <- function(df, Country_name){
  df %>%
    filter(Country == Country_name)
}

Or to be extra safe, in case df could contain a Country_name column that would collide with the argument:

fun <- function(df, Country_name){
  df %>%
    filter(Country == .env$Country_name)
}

or

fun <- function(df, Country_name){
  df %>%
    filter(Country == !!Country_name)
}
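
The collision that .env and !! guard against can be sketched as follows. The data frame df_clash and the function names are made up for illustration; the only assumption is that dplyr is installed.

```r
library(dplyr, warn.conflicts = FALSE)

# Assumed data: a Country_name column that collides with the argument name
df_clash <- data.frame(
  Country      = c("Belgium", "Ukraine"),
  Country_name = c("Ukraine", "Ukraine")
)

fun_plain <- function(df, Country_name) {
  df %>% filter(Country == Country_name)        # the column masks the argument
}
fun_env <- function(df, Country_name) {
  df %>% filter(Country == .env$Country_name)   # always uses the argument
}

plain_res <- fun_plain(df_clash, "Belgium")
env_res   <- fun_env(df_clash, "Belgium")
print(plain_res$Country)   # row where the two columns happen to match
print(env_res$Country)     # row for the country that was asked for
```
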
Vitebsk answered 2/6, 2023 at 22:2 Comment(0)
T
4

Assume this was your dataset:

df1 <- tribble(~ Country, ~ Value,
               'Brazil', 1,
               'Brazil', 2,
               'Canada', 3,
               'Canada', 4)

> df1 
# A tibble: 4 × 2
  Country Value
  <chr>   <dbl>
1 Brazil      1
2 Brazil      2
3 Canada      3
4 Canada      4

You could write your custom filter function simply as:

fun1 <- function(df, Country_name){
  df %>%
    filter(Country == Country_name)
}

> fun1(df1, 'Brazil')
# A tibble: 2 × 2
  Country Value
  <chr>   <dbl>
1 Brazil      1
2 Brazil      2

But imagine you want to be able to omit the quotes around 'Brazil' and still get the same output. If you made no modification you would get an error:

> fun1(df1, Brazil)
# ...
#! object 'Brazil' not found
# ...

R treats Brazil as a variable and looks it up, here in your global environment. It fails to find it and returns an error. If Brazil did exist as a variable, you could get weird results:

Brazil <- 'Canada'

> fun1(df1, Brazil)
# A tibble: 2 × 2
  Country Value
  <chr>   <dbl>
1 Canada      3
2 Canada      4

R sees that Brazil has the value 'Canada', binds that value to Country_name, and uses it in the filter.

That's not what you wanted. You wanted to get the actual word Brazil, and not the value it represents. That is what the line you were referring to does. I'll explain how it works below.

The first step is telling R "I don't want you to evaluate the argument you received, I just want you to save its text". That is, we want to delay the evaluation of the expression that was passed to Country_name. That can be done in several ways:

  • substitute(Country_name) in base R, as Nir Graham noted;

substitute returns the parse tree for the (unevaluated) expression ... -substitute's help page.

  • enquo(Country_name) with rlang, as your function did.

enquo() and enquos() defuse function arguments. A defused expression can be examined, modified, and injected into other expressions. -enquo's help page.

  • enexpr(Country_name) with rlang, also as Nir Graham noted;

enexpr() and enexprs() are like enquo() and enquos() but return naked expressions instead of quosures. -enexpr's help page

So they all have very similar effects. The biggest difference is that enquo() returns a quosure instead of a naked expression. In simple terms, a quosure is an expression that also points to the environment where the values of its variables should be found*. We don't need that (but it's also not a problem), as the expression in question won't be evaluated; we just want its text.
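
A minimal sketch of the difference, assuming rlang is installed; the capture_* helpers are hypothetical names used only for this comparison.

```r
library(rlang)

capture_base <- function(x) substitute(x)  # base R
capture_expr <- function(x) enexpr(x)      # rlang, naked expression
capture_quo  <- function(x) enquo(x)       # rlang, quosure

e_base <- capture_base(Brazil)
e_expr <- capture_expr(Brazil)
q      <- capture_quo(Brazil)

# substitute() and enexpr() both return the bare symbol Brazil
print(identical(e_base, e_expr))

# enquo() wraps the same symbol together with the calling environment
print(is_quosure(q))
print(quo_get_expr(q))
print(quo_get_env(q))
```
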

After that, we just want to get the text of that defused expression, which can be done with:

  • as.character();
  • deparse1();
  • rlang::quo_name();
  • rlang::expr_name().

And others. Thus, the options are similar to what Nir Graham did:

fun2_base <- function(df, Country_name){
  Country_name <- deparse1(substitute(Country_name))
  df %>%
    filter(Country == Country_name)
}

fun2_rlang <- function(df, Country_name){
  Country_name <- as.character(enexpr(Country_name))
  df %>%
    filter(Country == Country_name)
}

fun2_base(df1, Brazil)
fun2_rlang(df1, Brazil)

All yield:

# A tibble: 2 × 2
  Country Value
  <chr>   <dbl>
1 Brazil      1
2 Brazil      2

Note that we didn't need to remove that Brazil variable, because it's not being evaluated.
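
One caveat worth sketching: the text-conversion options above agree for a bare name like Brazil, but they can differ once the caller passes a larger expression. The example below is base R only (deparse1() needs R 4.0 or later), and the call is an arbitrary illustration.

```r
sym  <- quote(Brazil)
expr <- quote(paste0("Bra", "zil"))   # a call rather than a bare name

# For a symbol, both give the same single string
sym_chr <- as.character(sym)
sym_dep <- deparse1(sym)
print(identical(sym_chr, sym_dep))

# For a call, as.character() splits it into its components,
# while deparse1() returns the whole expression as one string
call_chr <- as.character(expr)
call_dep <- deparse1(expr)
print(call_chr)
print(call_dep)
```
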

*: To know more, read about tidy evaluation and the metaprogramming chapters of "Advanced R"

Tanga answered 31/5, 2023 at 14:3 Comment(0)
B
0

It's using 3 function calls to do what is achievable in 2, whether in base R or using rlang.

library(dplyr)
library(rlang)

myfilt_base <- function(x){
  mysym <- deparse1(substitute(x))
  filter(iris, Species == mysym)
}

myfilt_base(versicolor)

myfilt_rlang <- function(x){
  mysym <- as.character(enexpr(x))
  filter(iris, Species == mysym)
}

myfilt_rlang(virginica)
Bleeding answered 31/5, 2023 at 9:34 Comment(0)
