assign to is.na(clinical.trial$age)

I am looking at the code from here which has this at the beginning:

## generate data for medical example 
clinical.trial <-
    data.frame(patient = 1:100,
               age = rnorm(100, mean = 60, sd = 6),
               treatment = gl(2, 50,
                 labels = c("Treatment", "Control")),
               center = sample(paste("Center", LETTERS[1:5]), 100,
                 replace = TRUE))

## set some ages to NA (missing) 
is.na(clinical.trial$age) <- sample(1:100, 20)

I cannot understand this last line. The LHS is a vector of all FALSE values. The RHS is a vector of 20 numbers selected from the vector 1:100. I don't understand this kind of assignment. How does this result in clinical.trial$age getting some NA values? Does this kind of assignment have a name? At best I would say that the boolean vector on the LHS gets numbers assigned to it with recycling.

Outsole answered 14/6, 2017 at 12:13 Comment(5)
The LHS is a vector of all FALSE values (since no NA is present). - Penetralia
Interesting! So if x <- 1:3, then is.na(x) <- 2 seems to do the same thing as x[2] <- NA. - Catholicize
The is.na<- behavior is described in the respective help. But I agree that this usage is far from "intuitive"... - Cyanide
Is anyone aware whether more functions offer that functionality (colnames(), class(), ...)? I would be interested in understanding why this is done (apparently only for some functions, but not all) rather than in a restatement of the existing documentation. - Ebner
What do you mean by "that functionality"? An assignment variant of a function? It's just convenient. The most important example is `[<-` (subset assignment). - Nailbrush

is.na(x) <- value is translated as x <- 'is.na<-'(x, value): the replacement function returns a modified copy of x, and that result is assigned back to x.

You can think of 'is.na<-'(x, value) as 'assign NA to x, at position value'.

A perhaps more intuitive phrasing would be a (hypothetical) assign_NA(to = x, pos = value).
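
To make the translation concrete, here is a minimal sketch on a plain vector. The names x and y are just illustrative. It shows that the assignment form and the explicit call to the replacement function give the same result:

```r
x <- c(10, 20, 30)
y <- x

# Assignment form, as used in the question
is.na(x) <- 2

# What R evaluates instead: call the replacement function
# and assign its return value back to the variable
y <- `is.na<-`(y, 2)

print(x)         # 10 NA 30
identical(x, y)  # TRUE
```

Note that is.na(x) on the left is never evaluated as a logical vector; it only tells R which replacement function to call and on which object.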


Regarding other similar functions, we can list those in the base package:

x <- as.character(lsf.str("package:base"))
x[grep('<-', x)]
#>  [1] "$<-"                     "$<-.data.frame"         
#>  [3] "@<-"                     "[[<-"                   
#>  [5] "[[<-.data.frame"         "[[<-.factor"            
#>  [7] "[[<-.numeric_version"    "[<-"                    
#>  [9] "[<-.data.frame"          "[<-.Date"               
#> [11] "[<-.factor"              "[<-.numeric_version"    
#> [13] "[<-.POSIXct"             "[<-.POSIXlt"            
#> [15] "<-"                      "<<-"                    
#> [17] "attr<-"                  "attributes<-"           
#> [19] "body<-"                  "class<-"                
#> [21] "colnames<-"              "comment<-"              
#> [23] "diag<-"                  "dim<-"                  
#> [25] "dimnames<-"              "dimnames<-.data.frame"  
#> [27] "Encoding<-"              "environment<-"          
#> [29] "formals<-"               "is.na<-"                
#> [31] "is.na<-.default"         "is.na<-.factor"         
#> [33] "is.na<-.numeric_version" "length<-"               
#> [35] "length<-.factor"         "levels<-"               
#> [37] "levels<-.factor"         "mode<-"                 
#> [39] "mostattributes<-"        "names<-"                
#> [41] "names<-.POSIXlt"         "oldClass<-"             
#> [43] "parent.env<-"            "regmatches<-"           
#> [45] "row.names<-"             "row.names<-.data.frame" 
#> [47] "row.names<-.default"     "rownames<-"             
#> [49] "split<-"                 "split<-.data.frame"     
#> [51] "split<-.default"         "storage.mode<-"         
#> [53] "substr<-"                "substring<-"            
#> [55] "units<-"                 "units<-.difftime"

Apart from "<-" and "<<-" themselves, which are the assignment operators, these all work the same way in the sense that fun(x) <- val is evaluated as x <- 'fun<-'(x, val). Beyond that, they behave like any normal function.
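
Since these are ordinary functions, you can write your own. Below is a small sketch with a made-up replacement function, `second<-` (hypothetical, not part of base R). The only requirements are that the function name ends in <-, that the last argument is called value, and that the modified object is returned:

```r
# Hypothetical replacement function: overwrite the second element
`second<-` <- function(x, value) {
  x[2] <- value
  x  # a replacement function must return the modified object
}

v <- 1:5
second(v) <- 99L  # evaluated as v <- `second<-`(v, value = 99L)
v                 # 1 99 3 4 5
```

There is no function called second() here, and none is needed: the assignment syntax only ever looks up `second<-`.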


R manuals: 3.4.4 Subset assignment

Glomma answered 14/6, 2017 at 12:24 Comment(5)
See also cran.r-project.org/doc/manuals/r-release/…. The language definition is worth reading. - Nailbrush
@GGamba, I don't see any function like assign_NA() so I guess you were just trying to explain the functionality. But I don't get the original code. Is there an alternative way to do this that is not cryptic and just uses basic simple R? It seems strange to put this code in a tutorial on a basic function like table(). - Outsole
@Outsole Please follow the link I've provided in my comment above. This is "basic simple R". You are using this every time you do something like x[1] <- 0. - Nailbrush
@Roland, in x[1] <- 0, a value is being replaced. In is.na(clinical.trial$age) <- sample(1:100, 20), the LHS is not the age variable. The LHS is a boolean vector. So that is being replaced with numbers. I don't see how this will affect the age variable. - Outsole
I'm talking about syntax here, not about what these functions do. They have only assignment in common. The language definition clearly explains how this syntax is interpreted by the parser. Other than the syntax, which is a core part of the language as I tried to explain with the example [<-, there is nothing special here. You can easily define your own fun<-. - Nailbrush

The help (?is.na) gives this example:

(xx <- c(0:4)) 
is.na(xx) <- c(2, 4)
xx                     #> 0 NA  2 NA  4

So,

is.na(xx) <- 1

behaves more like

set NA at position 1 on variable xx
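
The right-hand side is not limited to positions. As far as I can tell from the help, it can be any index vector that is valid for xx, so a logical condition works as well. A small sketch, reusing xx from the example above:

```r
xx <- 0:4

# Logical index: TRUE marks the elements that should become NA
is.na(xx) <- xx > 2

xx  # 0 1 2 NA NA
```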
Crural answered 14/6, 2017 at 12:27 Comment(1)
I can see (xx <- c(0:4)) is.na(xx) <- c(2, 4) in the help but I have no idea why this works or what this kind of assignment is called. - Outsole

@matt, to respond to your question asked above in the comments, here's an alternative way to do the same assignment that I think is easier to follow :-)

clinical.trial$age[sample(1:100, 20)] <- NA
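
If you want to convince yourself that the two forms do the same thing, here is a quick check on a smaller made-up data frame (df1, df2, df3 and idx are just illustrative names). The indices are drawn once and reused so the results are comparable:

```r
set.seed(42)
df1 <- data.frame(patient = 1:10, age = rnorm(10, mean = 60, sd = 6))
df2 <- df1
df3 <- df1

idx <- sample(1:10, 3)  # draw the positions once, reuse them below

is.na(df1$age) <- idx                  # form used in the tutorial
df2$age[idx] <- NA                     # plain subset assignment
df3$age <- `is.na<-`(df3$age, idx)     # explicit call to the replacement function

identical(df1, df2)   # TRUE
identical(df1, df3)   # TRUE
sum(is.na(df1$age))   # 3
```

The third line is roughly what R does with the first one, which is why the age column itself ends up changed.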

Downtrodden answered 28/10, 2017 at 21:36 Comment(0)