Is there an alternative to "revalue" function from plyr when using dplyr?
W

5

25

I'm a fan of the revalue function in plyr for substituting strings. It's simple and easy to remember.

However, I've migrated new code to dplyr, which doesn't appear to have a revalue function. What is the accepted idiom in dplyr for doing things previously done with revalue?

Wigfall answered 14/4, 2016 at 6:53 Comment(3)
Can you show a reproducible example? - Caribbean
library(plyr); library(dplyr)? - Insignia
Someone having the same thoughts here: twitter.com/jennybryan/status/524607056696057856 - Ossicle
O
30

There is a recode function available starting with dplyr version 0.5.0, which looks very similar to revalue from plyr.

Example built from the recode documentation Examples section:

set.seed(16)
x = sample(c("a", "b", "c"), 10, replace = TRUE)
x
 [1] "c" "a" "b" "a" "c" "a" "a" "c" "c" "a"

recode(x, a = "Apple", b = "Bear", c = "Car")

 [1] "Car"   "Apple" "Bear"  "Apple" "Car"   "Apple" "Apple" "Car"   "Car"   "Apple"

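If you only define some of the values that you want to recode, the rest were filled with NA by default in the dplyr version used for this output. (Later releases leave unmatched values unchanged when the replacements have the same type as the input.)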

recode(x, a = "Apple", c = "Car")
 [1] "Car"   "Apple" NA      "Apple" "Car"   "Apple" "Apple" "Car"   "Car"   "Apple"

This behavior can be changed using the .default argument.

recode(x, a = "Apple", c = "Car", .default = x)
 [1] "Car"   "Apple" "b"     "Apple" "Car"   "Apple" "Apple" "Car"   "Car"   "Apple"

There is also a .missing argument if you want to replace missing values with something else.
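To make the parallel with revalue more concrete, here is a minimal sketch that passes the replacements as a named vector and also uses .missing. The vector name `lookup` and the sample data are made up for illustration; the `!!!` splice is the pattern shown in the recode documentation, and the result assumes a dplyr release where unmatched character values are kept as-is.

```r
library(dplyr)

# Sample input with one missing value (illustrative data, not from the question)
x <- c("a", "b", NA, "c")

# Named lookup vector: names are old values, elements are new values,
# similar in spirit to the `replace` argument of plyr::revalue()
lookup <- c(a = "Apple", c = "Car")

# `!!!` splices the named vector into recode()'s `...` arguments;
# `.missing` supplies the replacement for NA inputs
out <- recode(x, !!!lookup, .missing = "Unknown")
out
# [1] "Apple"   "b"       "Unknown" "Car"
```

Keeping the mapping in a single named vector makes it easy to reuse the same lookup across several columns.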

Ocean answered 14/4, 2016 at 15:17 Comment(1)
Apparently this is the Hadley-endorsed answer; see twitter.com/hadleywickham/status/524614991719067648 and github.com/hadley/dplyr/issues/631 - Wigfall
C
5

We can do this with chartr from base R, which translates individual characters:

chartr("ac", "AC", x)

data

x <- c("a", "b", "c")
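Note that chartr only maps single characters to single characters. For replacing whole strings, one possible base R sketch is a named lookup vector; the names `lookup`, `hit`, and `out` below are illustrative and not part of the original answer.

```r
x <- c("a", "b", "c")

# chartr() translates character by character, so it only covers
# one-character-to-one-character substitutions
chartr("ac", "AC", x)
# [1] "A" "b" "C"

# For whole-string replacements, index a named lookup vector instead
lookup <- c(a = "Apple", c = "Car")

# Only touch the elements that have an entry in the lookup
hit <- x %in% names(lookup)
out <- x
out[hit] <- unname(lookup[x[hit]])
out
# [1] "Apple" "b"     "Car"
```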
Caribbean answered 14/4, 2016 at 6:56 Comment(2)
I love how you always post a base R solution; wherever I go, you are there @Caribbean +1. Additionally, how would you manage to do it with two long vectors? Let's say as in the example here - Oversight
@ÁlvaroA.Gutiérrez-Vargas thank you. I looked at the link you showed. It is showing a single vector. Any chance the link is different? - Caribbean
I
3

I wanted to comment on the answer by @aosmith, but lack reputation. It seems that nowadays the default of dplyr's recode function is to leave unspecified levels unaffected.

x = sample(c("a", "b", "c"), 10, replace = TRUE)
x
[1] "c" "c" "b" "b" "a" "b" "c" "c" "c" "b"

recode(x , a = "apple", b = "banana" )

[1] "c"      "c"      "banana" "banana" "apple"  "banana" "c"      "c"      "c"      "banana"

To change all nonspecified levels to NA, the argument .default = NA_character_ should be included.

recode(x, a = "apple", b = "banana", .default = NA_character_)

[1] NA       NA       "banana" "banana" "apple"  "banana" NA       NA       NA       "banana"
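As a side note, recode is marked as superseded in dplyr 1.1.0 in favor of case_match, which makes the same two behaviors explicit through .default. The following is only a sketch and assumes dplyr >= 1.1.0 is installed; the sample vector is illustrative.

```r
library(dplyr)  # case_match() assumes dplyr >= 1.1.0

x <- c("c", "c", "b", "b", "a")

# Keep unmatched values by passing the input itself as the default
keep <- case_match(x, "a" ~ "apple", "b" ~ "banana", .default = x)
keep
# [1] "c"      "c"      "banana" "banana" "apple"

# Without .default, unmatched values become NA
drop <- case_match(x, "a" ~ "apple", "b" ~ "banana")
drop
# [1] NA       NA       "banana" "banana" "apple"
```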
Iceboat answered 19/7, 2018 at 10:41 Comment(0)
S
0

One alternative that I find handy is the mapvalues function (from plyr) used inside a data.table, e.g.

df[, variable := plyr::mapvalues(variable, from = old_names_string_vector, to = new_names_string_vector)]
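A small self-contained sketch of this pattern, assuming that mapvalues here refers to plyr::mapvalues (whose arguments are named from and to) and that both data.table and plyr are installed. The table contents are made up for illustration.

```r
library(data.table)

# Illustrative table; `variable` is the column to recode
df <- data.table(variable = c("a", "b", "c", "a"))

# Old values and their replacements, matched by position
old_names <- c("a", "b")
new_names <- c("Apple", "Bear")

# Update the column by reference; values not listed in `from` are left unchanged
df[, variable := plyr::mapvalues(variable, from = old_names, to = new_names)]
df$variable
# [1] "Apple" "Bear"  "c"     "Apple"
```

Calling plyr::mapvalues with the namespace prefix avoids attaching plyr alongside dplyr, which can mask functions of the same name.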
Squatness answered 6/2, 2018 at 11:43 Comment(0)
V
0

R base solution

You can use ifelse() from base R for this. The function's arguments are ifelse(test, yes, no). Here is an example:

(x <- sample(c("a", "b", "c"), 5, replace = TRUE))
[1] "c" "a" "b" "a" "a"

ifelse(x == "a", "Apple", x)
[1] "c"     "Apple" "b"     "Apple" "Apple"

If you want to recode multiple values you can use the function in a nested way like this:

ifelse(x == "a", "Apple", ifelse(x == "b", "Banana", x))
[1] "c"      "Apple"  "Banana" "Apple"  "Apple"

Own function

Having many values that must be recoded can make coding with ifelse() messy. Therefore, here is a custom function:

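my_revalue <- function(x, ...) {
  # Named arguments: names are the old values, elements are the new values
  reval <- list(...)

  from <- names(reval)
  to <- unlist(reval)

  # Replace each old value in turn (plain indexing instead of eval(parse()),
  # so values containing quotes cannot break the generated code)
  for (i in seq_along(from)) {
    x[x == from[i]] <- to[i]
  }

  x
}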

Now we can change multiple values quite fast:

my_revalue(x, "a" = "Apple", "b" = "Banana", "c" = "Cranberry")
[1] "Cranberry" "Apple"     "Banana"    "Apple"     "Apple"
Vouchsafe answered 27/10, 2021 at 13:38 Comment(0)
