Unexpected return for NA in factor lookup

I have a named character vector that I'm using as a lookup table.

condLookup = c(hotdog = "ketchup", ham = "mustard", popcorn = "salt", coffee = "cream")

This works as expected - I put in a 3-vector and get a 3-vector back:

condLookup[c("hotdog", "spinach", NA)]
  hotdog      <NA>      <NA> 
"ketchup"       NA        NA 

This too is expected, even though the returned values are all NA:

condLookup[c(NA, "spinach")]
<NA> <NA> 
  NA   NA 

And this:

condLookup["spinach"]
<NA> 
  NA 

But then this surprised me: I passed a single NA, or two NAs, and got back a named vector of four NAs.

condLookup[NA]
<NA> <NA> <NA> <NA> 
  NA   NA   NA   NA 
condLookup[c(NA, NA)]
<NA> <NA> <NA> <NA> 
  NA   NA   NA   NA 

Apparently, for vector2 <- condLookup[vector1], vector2 has the same length as vector1 unless every element of vector1 is NA, in which case vector2 has the same length as condLookup. Can you explain this behavior?

Ytterbium answered 21/6, 2020 at 19:36 Comment(3)
you reminded me of a recent answer of mine :) - Gnomon
It took me a minute, but I finally got the pun. :-) - Ytterbium
Related post, where NA is used to index an integer vector, with similar recycling: Indexing integer vector with NA - Euxenite

NA values are typed, and the type matters: c(NA, "spinach") coerces NA to character, and a character index isn't recycled:

condLookup[NA]
## <NA> <NA> <NA> <NA> 
##   NA   NA   NA   NA 

condLookup[NA_character_]
## <NA> 
##   NA

The default type of NA is logical. A logical index vector is recycled to the length of the vector being indexed, while a character index vector is matched against that vector's names. From ?`[`:

Character vectors will be matched to the ‘names’ of the object

... ‘i’, ‘j’, ‘...’ can be logical vectors, indicating elements/slices to select. Such vectors are recycled if necessary to match the corresponding extent.
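
You can check the types and lengths yourself. This is a minimal sketch that reuses condLookup from the question; idx_lgl and idx_chr are just illustrative names, and the as.character() call at the end is one possible workaround rather than the only one:

```r
condLookup <- c(hotdog = "ketchup", ham = "mustard",
                popcorn = "salt", coffee = "cream")

# A bare NA is logical; combining it with a string coerces it to character
typeof(NA)                   # "logical"
typeof(c(NA, NA))            # "logical"
typeof(c(NA, "spinach"))     # "character"

# Logical index: recycled to length(condLookup), so four results
idx_lgl <- c(NA, NA)
length(condLookup[idx_lgl])  # 4

# Character index: matched against names, one result per index element
idx_chr <- c(NA_character_, NA_character_)
length(condLookup[idx_chr])  # 2

# Possible workaround (an assumption about your use case): coerce keys to
# character before the lookup so an all-NA key vector is never logical
length(condLookup[as.character(NA)])  # 1
```

If the keys come from a data frame column or similar, coercing them once up front should keep the result the same length as the input.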

Bellbird answered 21/6, 2020 at 19:45 Comment(1)
this also nicely explains why a[ c( NA, NA, NA, NA, NA, NA )] will yield 6 NAs, not 4 :) - Gnomon

In addition to the recycling explained in @Ben's answer, ?Extract contains the following statement:

  • Neither empty ("") nor NA indices match any names, not even empty nor missing names. If any object has no names or appropriate dimnames, they are taken as all "" and so match nothing.
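
A small sketch of what that statement means in practice. The vector x below is made up for illustration and is not from the question; it has one element whose name is missing:

```r
x <- c(a = 1, b = 2)
names(x)[2] <- NA   # the second element now has a missing name

# An NA character index does not match the missing name
x[NA_character_]    # NA, not 2

# An empty-string index matches nothing either
x[""]               # NA

# An ordinary name still matches as usual
x["a"]              # 1
```

So a character NA in the index always produces an NA in the result, whatever the names of the object look like.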

Priebe answered 21/6, 2020 at 19:47 Comment(0)
