Dynamic name with glue in mutate call

3

10

I want to create a function that takes a data set as its first argument and part of a column name from that data frame as its second. Inside the function, I then want to use glue to construct the full column name dynamically and use that column in a mutate call, like so:

library(tidyverse)

tmp <- 
  function(data, type){
  var <- glue::glue("Sepal.{type}")
  data |> 
    select({{var}}) |> 
    mutate("{var}" := mean({{var}}))
}

I've tried many things, but I can't find a solution in which the constructed name is used both as the name of the new column (here, "{var}") and in the computation of the new column (here, mean({{var}})). What should one do in such cases?

Here, calling tmp(iris, "Length") should return a 150x1 data.frame with the mean value in all rows.

Tidyverse solutions are preferred, or any pipe-based answer.

Corporeal answered 29/5 at 8:8 Comment(3)
How about using the .data pronoun, i.e. mutate("{var}" := mean(.data[[var]]))? Dobb
Both work! I'm curious why {{}} does not work directly here, even though it is sometimes used this way in examples in the documentation: dplyr.tidyverse.org/articles/programming.html Stucco
@Corporeal Because var refers to a character string, not a symbol. In some cases this seems to work with {{…}}, but not in all. Which honestly is pretty confusing, and I wish {{…}} were stricter and more consistent in what it accepts. Steinbok
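As a sketch of the suggestion in the comments above: when the column name is held as a string, the .data pronoun and all_of() both accept it directly. The function name mean_sepal is hypothetical, and paste0 stands in for glue here only to keep the example free of an extra dependency.

```r
library(dplyr)

# Hypothetical helper: the column name stays a plain string throughout
mean_sepal <- function(data, type) {
  var <- paste0("Sepal.", type)              # e.g. "Sepal.Length"
  data |>
    select(all_of(var)) |>                   # select by character vector
    mutate("{var}" := mean(.data[[var]]))    # same string names and computes the column
}

# Every row of the single column holds mean(iris$Sepal.Length)
head(mean_sepal(iris, "Length"), 3)
```

The string works on the left-hand side because names passed with := support glue syntax, and on the right-hand side because .data[[var]] looks the column up by name.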
10

You can use mean({{var}}) if you modify your code slightly, for example by defining var with as.symbol (or as.name) instead of leaving it as a glue character string:

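tmp <- function(data, type) {
    # Convert the glue string into a symbol so it can refer to the column
    var <- as.symbol(glue::glue("Sepal.{type}"))
    data |>
        # Inject the symbol; a bare `var` is first looked up as a column called "var"
        select(!!var) |>
        mutate("{var}" := mean({{ var }}))
}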

Alternatively, you can try get(var) or !!rlang::sym(var), for example:

tmp <- function(data, type) {
    var <- glue::glue("Sepal.{type}")
    data |>
        select({{ var }}) |>
        mutate("{var}" := mean(get(var)))
}

or

tmp <- function(data, type) {
    var <- rlang::sym(glue::glue("Sepal.{type}"))
    data |>
        select(!!var) |>
        mutate("{var}" := mean(!!var))
}
Treasonable answered 29/5 at 8:30 Comment(1)
This, coupled with Konrad's explanation, sorted things out! Stucco
5

Here is a pipe-based attempt using base R. It is a pity that I can't find a way to use a dynamic variable name within transform, so the column is created as x and renamed afterwards.

tmpBase <- function(data, type){
  var <- paste0("Sepal.", type)
  data |>
    transform(x = mean(get(var))) |>
    subset(select = x) |>
    setNames(var)
}

head(tmpBase(iris, "Length"))
#   Sepal.Length
# 1     5.843333
# 2     5.843333
# 3     5.843333
# 4     5.843333
# 5     5.843333
# 6     5.843333
Kassi answered 29/5 at 9:48 Comment(1)
"a way of using dynamic variable name within transform" -- I suspect this cannot be done in base. Same for within(), with(), ... Faltboat
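One possible workaround, sketched under the assumption that giving up transform (and the pipe) is acceptable: assign through [[<-, which takes the column name as a string. The function name tmpBase2 is made up for illustration.

```r
# Hypothetical base R variant that avoids transform() entirely
tmpBase2 <- function(data, type) {
  var <- paste0("Sepal.", type)
  out <- data[var]                  # one-column data frame, name preserved
  out[[var]] <- mean(data[[var]])   # length-1 value is recycled over all rows
  out
}

head(tmpBase2(iris, "Length"), 3)
```

This does not make transform itself accept a dynamic name; it only sidesteps the renaming step.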
2

Is this what you are looking for?

tmp <- 
  function(data, type){
  var <- glue::glue("Sepal.{type}")
  data |> 
    select(all_of(var)) |> 
    mutate(across(everything(), mean))
}

To select() variables using a character vector, just use all_of(). And if you are selecting only the variable to be transformed, then you can use mutate(across(everything(), f)) to avoid specifying the variable name.
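If the variable should still be named explicitly in the mutate call, a possible variant (a sketch, with the hypothetical name tmp_across) passes the same string to across() through all_of():

```r
library(dplyr)

# Hypothetical variant: the target column is named explicitly inside across()
tmp_across <- function(data, type) {
  var <- paste0("Sepal.", type)
  data |>
    select(all_of(var)) |>
    mutate(across(all_of(var), mean))   # overwrite the column with its mean
}

head(tmp_across(iris, "Width"), 3)
```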

Afrikander answered 29/5 at 8:38 Comment(1)
While this works in this example, I want to state the variable name explicitly in the mutate call. Stucco
