Is it possible to set na.rm to TRUE globally?

For commands like max the option na.rm is set by default to FALSE. I understand why this is a good idea in general, but I'd like to turn it off reversibly for a while -- i.e. during a session.

How can I require R to set na.rm = TRUE whenever it is an option? I found

options(na.action = na.omit)

but this doesn't work. I know that I can set a na.rm=TRUE option for each and every function I write.

my.max <- function(x) {max(x, na.rm=TRUE)}

But that's not what I am looking for. I'm wondering if there's something I could do more globally/universally instead of doing it for each function.

Shannon answered 2/7, 2013 at 6:17 Comment(4)
Unfortunately, the answer you don't want is the only one that works generally. There's no global option for this like there is for na.action, which only affects modeling functions like lm, glm, etc (and even there, it isn't guaranteed to work in all cases).Makebelieve
@HongOoi - I think in light of the large number of upvotes on your comment it should be rehashed as an answer (or "the" answer potentially).Clasping
An alternative that gives fine control over where and when to omit NAs could be to define a variable such as do.omit.na = TRUE at the beginning of your script, and to use it thereafter where applicable with max(x, na.rm = do.omit.na).Pish
anyone care to elaborate on why it's a good idea to set na.rm=F in general? Just as a way to flag to yourself that the sum/mean/etc that you calculate may not be exactly what you want?Tyne
S
9

It is not possible to change na.rm to TRUE globally. (See Hong Ooi's comment under the question.)

EDIT:

Unfortunately, the answer you don't want is the only one that works generally. There's no global option for this like there is for na.action, which only affects modeling functions like lm, glm, etc (and even there, it isn't guaranteed to work in all cases). – Hong Ooi Jul 2 '13 at 6:23

Shannon answered 31/7, 2013 at 23:39 Comment(0)
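The point in the quoted comment can be checked directly. The following is a minimal sketch, assuming a small made-up data frame `d` (not from the original question): `lm()` consults the `na.action` option, while `max()` does not look at it at all.

```r
# Sketch: options(na.action = ...) is consulted by modeling functions,
# not by summary functions such as max().
old <- options(na.action = "na.omit")

# Hypothetical example data with one NA in each column
d <- data.frame(x = c(1, 2, 3, NA, 5), y = c(2, 4, NA, 8, 10))

fit <- lm(y ~ x, data = d)   # rows containing an NA are dropped
n_used <- nobs(fit)          # number of rows actually used in the fit

m <- max(d$x)                # max() ignores na.action entirely

options(old)                 # restore the previous setting
```

Here `n_used` counts only the complete rows, whereas `m` is still `NA` unless `na.rm = TRUE` is passed explicitly.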
A
12

One (dangerous) workaround is to do the following:

  1. List all functions that have na.rm as an argument. Here I limited my search to the base package.
  2. Fetch each function and add this line at the beginning of its body: na.rm = TRUE
  3. Assign the function back to the base package.

So first I collect the names of all functions that have na.rm as an argument (in na.rm.f):

uses_arg <- function(x,arg) 
  is.function(fx <- get(x)) && 
  arg %in% names(formals(fx))
basevals <- ls(pos="package:base")      
na.rm.f <- basevals[sapply(basevals,uses_arg,'na.rm')]

EDIT: a better method to get all functions with an na.rm argument (thanks to mnel's comment):

Funs <- Filter(is.function,sapply(ls(baseenv()),get,baseenv()))
na.rm.f <- names(Filter(function(x) any(names(formals(args(x)))%in% 'na.rm'),Funs))

So the na.rm.f vector looks like:

 [1] "all"                     "any"                     "colMeans"                "colSums"                
 [5] "is.unsorted"             "max"                     "mean.default"            "min"                    
 [9] "pmax"                    "pmax.int"                "pmin"                    "pmin.int"               
[13] "prod"                    "range"                   "range.default"           "rowMeans"               
[17] "rowsum.data.frame"       "rowsum.default"          "rowSums"                 "sum"                    
[21] "Summary.data.frame"      "Summary.Date"            "Summary.difftime"        "Summary.factor"         
[25] "Summary.numeric_version" "Summary.ordered"         "Summary.POSIXct"         "Summary.POSIXlt" 

Then I change the body of each function. The code is inspired by the data.table package (FAQ 2.23), which adds one line to the start of rbind.data.frame and cbind.data.frame.

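ll <- lapply(na.rm.f, function(x) {
  tt <- get(x)
  ss <- body(tt)
  # Wrap a single-expression body in { } so a statement can be prepended
  if (class(ss) != "{") ss <- as.call(c(as.name("{"), ss))
  if (length(ss) < 2) {
    # Nothing to patch (e.g. primitives, whose body is NULL): just report the name
    print(x)
  } else {
    # Skip functions that have already been patched
    if (!length(grep("na.rm = TRUE", ss[[2]], fixed = TRUE))) {
      # Make room at position 2 and put the new first statement there
      ss <- ss[c(1, NA, 2:length(ss))]
      ss[[2]] <- parse(text = "na.rm = TRUE")[[1]]
      body(tt) <- ss
      # Unlock the binding, overwrite the function in base, and lock it again
      (unlockBinding)(x, baseenv())
      assign(x, tt, envir = asNamespace("base"), inherits = FALSE)
      lockBinding(x, baseenv())
    }
  }
})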

Now, if you check the first line of each function in our list:

unique(lapply(na.rm.f,function(x) body(get(x))[[2]]))
[[1]]
na.rm = TRUE
Azotobacter answered 2/7, 2013 at 10:20 Comment(5)
Funs <- Filter(is.function,sapply(ls(baseenv()),get,baseenv())); na.rm.f <- names(Filter(function(x) any(names(formals(args(x)))%in% 'na.rm'),Funs)) will collect min and max ....Boltonia
I appreciate the answer and I think I understand what you're doing, but I can't upvote or accept because the latest answer gives Error in ss[[2]] : subscript out of bounds and the first answer gives max(5, NA) = NA.Shannon
@Shannon I edited my answer. Now you don't get an error. But unfortunately the code doesn't work for primitive functions that have an na.rm argument: "all" "any" "max" "min" "prod" "range" "sum" Azotobacter
or you could combine this with setDefaults from the Defaults packageFeisty
@Ben Package ‘Defaults’ was removed from the CRAN repository (...) at the request of the maintainer, who had not updated it for R 3.1.0. cran.r-project.org/web/packages/Defaults/index.htmlBallot
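The limitation with primitive functions mentioned in the comments above can be inspected directly. This is only an illustrative sketch; the selection of names in `candidates` is my own choice, not part of the answer. Primitives are implemented in C and have no R-level body, so there is nothing for the body-patching approach to prepend a line to.

```r
# Sketch: check which functions are primitives and therefore have no body to patch
candidates <- c("max", "min", "sum", "mean.default", "colMeans")  # illustrative selection
fns <- lapply(candidates, get, envir = baseenv())

# TRUE for functions implemented directly in C
prim <- vapply(fns, is.primitive, logical(1))
names(prim) <- candidates

# body() returns NULL for primitives
no_body <- vapply(fns, function(f) is.null(body(f)), logical(1))
names(no_body) <- candidates

print(prim)
```

For these names, the functions flagged in `prim` are exactly the ones flagged in `no_body`, which is why closures such as mean.default can be patched while max, min and sum cannot.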
B
8

For my R package, I overwrote the existing functions mean and sum. Thanks to the great Ben (comments below), I altered my functions to this:

mean <- function(x, ..., na.rm = TRUE) {
  base::mean(x, ..., na.rm = na.rm)
}

After this, mean(c(2, NA, 3)) = 2.5 instead of NA.

And for sum:

sum <- function(x, ..., na.rm = TRUE) {
  base::sum(x, ..., na.rm = na.rm)
}

This will yield sum(c(2, NA, 3)) = 5 instead of NA.

sum(c(2, NA, 3, NaN)) also works.


You could also make it a global option:

sum <- function(x, ..., na.rm = getOption("na.rm", default = TRUE)) {
  base::sum(x, ..., na.rm = na.rm)
}

Now you can set the default value with options(), e.g. options(na.rm = TRUE).
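As a sketch of how this could be used reversibly during a session: the wrapper is repeated here so the snippet is self-contained, and the names `old`, `strict` and `lenient` are illustrative. Because default arguments are evaluated at call time, the wrapper picks up whatever value the option has when it is called.

```r
# Wrapper whose default for na.rm is read from a user-defined option
sum <- function(x, ..., na.rm = getOption("na.rm", default = TRUE)) {
  base::sum(x, ..., na.rm = na.rm)
}

old <- options(na.rm = FALSE)   # remember the previous value (possibly unset)
strict <- sum(c(2, NA, 3))      # behaves like base::sum: the NA propagates

options(na.rm = TRUE)
lenient <- sum(c(2, NA, 3))     # NAs are dropped before summing

options(old)                    # restore the option
rm(sum)                         # remove the wrapper so base::sum is visible again
```

Note that "na.rm" is not an option R itself knows about; it only has an effect on functions that look it up with getOption(), like the wrapper above.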

Ballot answered 9/6, 2017 at 9:22 Comment(4)
maybe slightly better to use base::mean(...) rather than mean.default(...) (in case you decide to take the mean of something that has a method other than the default).Feisty
Thanks! That gave me another idea :) I'll edit my answer.Ballot
Why does this not appear to work when aggregating within a data.table? Eg, dt[,.(x=mean(x)), by=Group]?Mccarron
Then it seems data.table implemented their own mean function that is called internally to increase speed, since it's such a common function? Not sure.Ballot
T
4

Several answers already cover changing the na.rm argument globally. I just want to point out the partial() function from the purrr or pryr packages. With this function you can create a copy of an existing function with predefined arguments:

library(purrr)
.mean <- partial(mean, na.rm = TRUE)

# Create sample vector
df <- c(1, 2, 3, 4, NA, 6, 7)

mean(df)
>[1] NA

.mean(df)
>[1] 3.833333

We can combine this tip with @agstudy's answer and create copies of all functions with the na.rm = TRUE argument:

library(purrr)

# Create a vector of function names https://mcmap.net/q/484424/-is-it-possible-to-set-na-rm-to-true-globally
Funs <- Filter(is.function,sapply(ls(baseenv()),get,baseenv()))
na.rm.f <- names(Filter(function(x) any(names(formals(args(x)))%in% 'na.rm'),Funs))

# Create strings. Dot "." is optional
fs <- lapply(na.rm.f,
             function(x) paste0(".", x, "=partial(", x ,", na.rm = TRUE)"))

eval(parse(text = fs)) 

So now, there are .all, .min, .max, etc. in our .GlobalEnv. You can run them:

.min(df)
> [1] 1
.max(df)
> [1] 7
.all(df)
> [1] TRUE

To overwrite the original functions, just remove the dot "." from the lapply call. Inspired by this blog post.
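If you would rather avoid both the purrr dependency and eval(parse()), a similar effect can be had with a small closure factory in base R. This is a sketch under my own assumptions: `with_na_rm`, `targets` and `wrapped` are hypothetical names, and the wrappers are kept in a list instead of being assigned into .GlobalEnv so that nothing gets masked.

```r
# Hypothetical helper: wrap a function so that na.rm defaults to TRUE
with_na_rm <- function(f) {
  force(f)  # evaluate f now so each wrapper keeps its own function
  function(..., na.rm = TRUE) f(..., na.rm = na.rm)
}

# Illustrative subset of functions to wrap
targets <- c("min", "max", "sum", "mean")
wrapped <- lapply(mget(targets, envir = baseenv()), with_na_rm)

x <- c(1, 2, 3, 4, NA, 6, 7)
print(wrapped$max(x))                 # NA is dropped by default
print(wrapped$max(x, na.rm = FALSE))  # the original behaviour is still available
```

Keeping the wrappers in a list makes the change easy to undo: dropping `wrapped` restores the session to its original state.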

Tumefacient answered 28/3, 2019 at 21:56 Comment(0)
