How to recode a variable to numeric?
> library(car)

> df = data.frame(value=c('A', 'B', 'C', 'A'))
> foo = recode(df$value, "'A'=1; 'B'=2; 'C'=3;", as.numeric.result=TRUE)
> mean(foo)
[1] NA
Warning message:
In mean.default(foo) : argument is not numeric or logical: returning NA
> foo
[1] 1 2 3 1
Levels: 1 2 3

Ugh. I thought the definition of as.numeric.result (default TRUE) was that if the results are all numerals, they would be coerced to numeric.

How do I get the results of this recoding to be numeric?

Tilburg answered 14/7, 2011 at 22:32 Comment(0)

If you look carefully at the documentation on recode you'll see this:

as.factor.result     return a factor; default is TRUE if var is a factor, FALSE otherwise.
as.numeric.result    if TRUE (the default), and as.factor.result is FALSE, 
                      then the result will be coerced to numeric if all values in the 
                      result are numerals—i.e., represent numbers.

Since df$value is a factor, as.factor.result defaults to TRUE and as.numeric.result is ignored. So you need to specify as.factor.result=FALSE:

foo = recode(df$value, "'A'=1; 'B'=2; 'C'=3;", as.factor.result=FALSE)

Edit: since the default of as.numeric.result is TRUE, you only need to specify as.factor.result=FALSE rather than both arguments.

Aruwimi answered 14/7, 2011 at 22:47 Comment(1)
I have a similar question and have spent more than an hour searching without finding an answer. What if the values are ' >50K' and ' <=50K'? In that case the call would look like foo = recode(df$value, "' >A'=1; ' >B'=2; ' >C'=3;"); please note the > or < characters. I am struggling with these two characters, and I cannot post a separate question because it would be considered a duplicate. - Frere
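For labels like these, one option that sidesteps the recode string entirely is a named lookup vector in base R. This is only a sketch, not the car approach from the answer above: the sample data and the mapping (' >50K' to 1, ' <=50K' to 0) are assumptions, since the comment does not say which codes are wanted.

```r
# Hypothetical data: the two labels from the comment, leading space included.
df <- data.frame(value = c(" >50K", " <=50K", " <=50K", " >50K"),
                 stringsAsFactors = TRUE)

# Assumed mapping (not given in the comment): " >50K" -> 1, " <=50K" -> 0.
codes <- c(" >50K" = 1, " <=50K" = 0)

# Index the lookup vector by label. as.character() makes sure the factor's
# labels, not its internal integer codes, are used for the lookup.
foo <- unname(codes[as.character(df$value)])

is.numeric(foo)  # the result is a plain numeric vector
mean(foo)
```

Because the labels are used only as names of a vector, characters such as >, < and = need no special treatment.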

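Try using as.numeric again. Note that as.numeric on a factor returns the internal integer codes, not the labels; it gives the right answer here only because the levels 1, 2, 3 happen to match those codes. In general, use as.numeric(as.character(foo)).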

> bar <- as.numeric(foo)
> bar
[1] 1 2 3 1
> str(bar)
 num [1:4] 1 2 3 1
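
To see why the conversion route matters, here is a small base R sketch with a hypothetical factor whose labels are 10, 20, 30 rather than 1, 2, 3 (the variable names are made up for illustration):

```r
# Hypothetical factor whose labels are numerals that differ from 1, 2, 3.
f <- factor(c("10", "20", "30", "10"))

# as.numeric() on a factor returns the internal integer codes.
codes <- as.numeric(f)

# Going through as.character() first returns the numbers the labels spell out.
values <- as.numeric(as.character(f))

codes
values
```

With the recoded foo from the question both routes agree, which is why the shortcut appears to work there.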
Bullhead answered 14/7, 2011 at 22:43 Comment(0)

From ?recode you should note what is said about the as.numeric.result argument:

as.factor.result: return a factor; default is ‘TRUE’ if ‘var’ is a
          factor, ‘FALSE’ otherwise.

as.numeric.result: if ‘TRUE’ (the default), and ‘as.factor.result’ is
          ‘FALSE’, then the result will be coerced to numeric if all
          values in the result are numerals-i.e., represent numbers.

as.factor.result defaults to TRUE when var is a factor, which df$value is here, so the result will be a factor regardless of what you set as.numeric.result to. To get the desired behaviour, set as.factor.result = FALSE; as.numeric.result = TRUE is already the default, but it can be given explicitly:

> recode(df$value, "'A'=1; 'B'=2; 'C'=3;", as.numeric.result=TRUE, 
         as.factor.result = FALSE)
[1] 1 2 3 1
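
As for why df$value is a factor in the first place: before R 4.0.0, data.frame() converted character columns to factors by default (stringsAsFactors = TRUE). A quick base R check, with the argument spelled out so it behaves the same on any R version (df_chr is a name made up for this sketch):

```r
# Old default behaviour made explicit: strings become a factor.
df <- data.frame(value = c("A", "B", "C", "A"), stringsAsFactors = TRUE)
is.factor(df$value)

# With stringsAsFactors = FALSE (the default from R 4.0.0) the column
# stays character.
df_chr <- data.frame(value = c("A", "B", "C", "A"), stringsAsFactors = FALSE)
is.character(df_chr$value)
```

Going by the documentation quoted above, a character column should make as.factor.result default to FALSE, so it is worth checking the column type with is.factor() before deciding which arguments to pass.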
Speos answered 14/7, 2011 at 22:57 Comment(0)
