If I try to create a new column in an R data frame by adding three logical expressions in one step, the result is a logical vector rather than an integer. If I use an intermediate step to first create a column for each of the three logical expressions, I can add those columns and get an integer. I don't understand why the two sets of code produce different results.
#The input is a dataframe with 3 variables that are sometimes missing
#and sometimes not.
subjid <- c(101,102,103,104,105,106,107,108)
var1 <- c(1,2,3,4,NaN,NaN,NaN,NaN)
var2 <- c(1,2,NaN,NaN,5,6,NaN,NaN)
var3 <- c(1,NaN,3,NaN,5,NaN,7,NaN)
df <- data.frame(subjid, var1, var2, var3)
df
  subjid var1 var2 var3
1    101    1    1    1
2    102    2    2  NaN
3    103    3  NaN    3
4    104    4  NaN  NaN
5    105  NaN    5    5
6    106  NaN    6  NaN
7    107  NaN  NaN    7
8    108  NaN  NaN  NaN
#This code was intended to count how many of the 3 variables were nonmissing
#But it produces a logical (TRUE/FALSE) result instead of a count
df$nonmissing_count_a <- !is.na(df$var1) + !is.na(df$var2) + !is.na(df$var3)
table(df$nonmissing_count_a)
FALSE  TRUE
    5     3
#This code is intended to obtain the same count of nonmissing variables
#And it works as expected
df$var1_nonmissing <- !is.na(df$var1)
df$var2_nonmissing <- !is.na(df$var2)
df$var3_nonmissing <- !is.na(df$var3)
df$nonmissing_count_b <- df$var1_nonmissing + df$var2_nonmissing + df$var3_nonmissing
table(df$nonmissing_count_b)
0 1 2 3
1 3 3 1
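
One likely explanation, offered as a sketch rather than a definitive answer: in R the unary `!` operator has lower precedence than `+` (see `?Syntax`), so in the one-step version each `!` negates everything to its right instead of only the `is.na()` call next to it. The snippet below rebuilds the example vectors without the data frame, spells out the grouping R appears to use, and shows two ways to get the count in a single step. The names `one_step`, `as_parsed`, `count_paren`, and `count_rowsums` are made up for illustration.

```r
# Same example values as in the question, kept as plain vectors.
var1 <- c(1, 2, 3, 4, NaN, NaN, NaN, NaN)
var2 <- c(1, 2, NaN, NaN, 5, 6, NaN, NaN)
var3 <- c(1, NaN, 3, NaN, 5, NaN, 7, NaN)

# The one-step expression from the question.
one_step <- !is.na(var1) + !is.na(var2) + !is.na(var3)

# The same expression with the grouping made explicit: because `!` binds
# more loosely than `+`, each `!` applies to the whole sum to its right.
as_parsed <- !(is.na(var1) + !(is.na(var2) + !is.na(var3)))

# Both are logical vectors with the same values.
identical(one_step, as_parsed)

# Parenthesizing each negation restricts `!` to its own is.na() call,
# so the three logical vectors are added as 0/1 values.
count_paren <- (!is.na(var1)) + (!is.na(var2)) + (!is.na(var3))
table(count_paren)

# Alternative: count the nonmissing values per row of a matrix.
count_rowsums <- rowSums(!is.na(cbind(var1, var2, var3)))
all(count_paren == count_rowsums)
```

The two-step version works because each `!is.na(...)` is evaluated on its own line before any addition happens, so precedence never comes into play.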