As explained here, when the test condition in ifelse(test, yes, no) is NA, the result is also NA. Hence the following returns...
df <- data.frame(a = c(1, 1, NA, NA, NA, NA),
                 b = c(NA, NA, 1, 1, NA, NA),
                 c = c(rep(NA, 4), 1, 1))
ifelse(df$a == 1, "a==1",
       ifelse(df$b == 1, "b==1",
              ifelse(df$c == 1, "c==1", NA)))
#[1] "a==1" "a==1" NA NA NA NA
... instead of the desired
#[1] "a==1" "a==1" "b==1" "b==1" "c==1" "c==1"
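To make the cause concrete, here is a minimal sketch of the NA propagation (the names `cmp` and `res` are only for illustration): comparing NA with a value gives NA rather than FALSE, and ifelse returns NA wherever its test is NA, so the nested calls never get a chance to fill those positions.

```r
# Comparing NA with a value yields NA, not FALSE
cmp <- c(1, NA) == 1
cmp

# ifelse() returns NA wherever the test itself is NA
res <- ifelse(c(TRUE, FALSE, NA), "yes", "no")
res
```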
As suggested by Cath, I can work around this problem by explicitly requiring that the tested value is not NA:
ifelse(df$a == 1 & !is.na(df$a), "a==1",
       ifelse(df$b == 1 & !is.na(df$b), "b==1",
              ifelse(df$c == 1 & !is.na(df$c), "c==1", NA)))
However, as akrun also noted, this solution becomes rather lengthy as the number of columns grows.
A workaround would be to first replace all NAs with a value not present in the data.frame (e.g., 2 in this case):
df_noNA <- data.frame(a = c(1, 1, 2, 2, 2, 2),
                      b = c(2, 2, 1, 1, 2, 2),
                      c = c(rep(2, 4), 1, 1))
ifelse(df_noNA$a == 1, "a==1",
       ifelse(df_noNA$b == 1, "b==1",
              ifelse(df_noNA$c == 1, "c==1", NA)))
#[1] "a==1" "a==1" "b==1" "b==1" "c==1" "c==1"
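The same replacement does not have to be typed out by hand. As a sketch, assuming (as above) that 2 never occurs in the real data, the NAs can be overwritten in a copy of the original data.frame; `out` is just an illustrative name for the result.

```r
df <- data.frame(a = c(1, 1, NA, NA, NA, NA),
                 b = c(NA, NA, 1, 1, NA, NA),
                 c = c(rep(NA, 4), 1, 1))

# Copy the data and overwrite every NA with a sentinel value
# (assumption: 2 does not occur anywhere in the real data)
df_noNA <- df
df_noNA[is.na(df_noNA)] <- 2

out <- ifelse(df_noNA$a == 1, "a==1",
              ifelse(df_noNA$b == 1, "b==1",
                     ifelse(df_noNA$c == 1, "c==1", NA)))
out
```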
However, I was wondering whether there is a more direct way to tell ifelse to ignore NAs. Or is writing a function for & !is.na the most direct way?
ignorena <- function(column) {
  column == 1 & !is.na(column)
}
ifelse(ignorena(df$a), "a==1",
       ifelse(ignorena(df$b), "b==1",
              ifelse(ignorena(df$c), "c==1", NA)))
#[1] "a==1" "a==1" "b==1" "b==1" "c==1" "c==1"
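For comparison, one possible alternative to the helper, offered as a sketch rather than a definitive answer: `%in%` is built on match() and returns FALSE rather than NA when the left-hand value is NA, so it can stand in for the `== 1 & !is.na()` pair. The name `out_in` is only illustrative.

```r
df <- data.frame(a = c(1, 1, NA, NA, NA, NA),
                 b = c(NA, NA, 1, 1, NA, NA),
                 c = c(rep(NA, 4), 1, 1))

# NA %in% 1 is FALSE, so the test vector never contains NA
out_in <- ifelse(df$a %in% 1, "a==1",
                 ifelse(df$b %in% 1, "b==1",
                        ifelse(df$c %in% 1, "c==1", NA)))
out_in
```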