Function to change blanks to NA

Asked 2/11, 2016 at 11:38 Answered 11/9, 2019 at 0:12

I'm trying to write a function that turns empty strings into NA. A summary of one of my columns looks like this:

      a   b 
 12 210 468

I'd like to change the 12 empty values to NA. I also have a few other factor columns for which I'd like to change empty values to NA, so I borrowed some stuff from here and there to come up with this:

# change nulls to NAs
nullToNA <- function(df){

  # split df into numeric & non-numeric columns
  a<-df[,sapply(df, is.numeric), drop = FALSE]
  b<-df[,sapply(df, Negate(is.numeric)), drop = FALSE]

  # Change empty strings to NA
  b<-b[lapply(b,function(x) levels(x) <- c(levels(x), NA) ),] # add NA level
  b<-b[lapply(b,function(x) x[x=="",]<- NA),]                 # change Null to NA

  # Put the columns back together
  d<-cbind(a,b)
  d[, names(df)]
}

However, I'm getting this error:

> foo<-nullToNA(bar)  
Error in x[x == "", ] <- NA : incorrect number of subscripts on matrix  
Called from: FUN(X[[i]], ...)

I have tried the answer found here: Replace all 0 values to NA but it changes all my columns to numeric values.

Libidinous answered 2/11, 2016 at 11:38 Comment(2)

Why not use the is.null() function instead of x==""? Maybe there is nothing to be found. Have you checked whether levels() returns anything? You can also ignore the function wrapper and step through its body line by line with your data. – Eran 2/11, 2016 at 11:56

Possible duplicate of Replace all 0 values to NA – Overview 2/11, 2016 at 12:4

You can directly index fields that match a logical criterion. So you can just write:

df[is_empty(df)] = NA

Where is_empty is your comparison, e.g. df == "":

df[df == ""] = NA
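As a minimal sketch of how this behaves on factor columns: the data frame bar below is made up for illustration, and the droplevels() call is an optional extra that is not part of the answer above.

```r
# Made-up data: one numeric column and two factor columns containing blanks
bar <- data.frame(
  n  = c(1, 2, 3, 4),
  f1 = factor(c("", "a", "b", "a")),
  f2 = factor(c("x", "", "", "y"))
)

foo <- bar
foo[foo == ""] <- NA     # foo == "" is a logical matrix; only matching cells change
foo <- droplevels(foo)   # optional: drop the now-unused "" level from the factors

summary(foo$f1)          # blanks are now counted as NA's
sapply(foo, class)       # column types are unchanged
```

The numeric column keeps its type because no cell in it equals "", so nothing in that column is assigned.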

But note that is.null(df) won’t work, and would be weird anyway¹. I would advise against merging the logic for columns of different types, though! Instead, handle them separately.

¹ You’ll almost never encounter NULL inside a table, since that only works if the underlying vector is a list. You can create matrices and data.frames with this constraint, but then is.null(df) will never be TRUE because the NULL values are wrapped inside the list.
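One way to handle columns by type, sketched under assumptions: the helper name blank_to_na and the example data are hypothetical, and treating factor and character columns the same way is a choice made here, not something the answer prescribes.

```r
# Hypothetical helper: replace "" with NA in factor and character columns only
blank_to_na <- function(df) {
  # Flag the text-like columns; numeric columns are never touched
  is_text <- vapply(df, function(col) is.factor(col) || is.character(col),
                    logical(1))
  df[is_text] <- lapply(df[is_text], function(col) {
    col[which(col == "")] <- NA                   # which() skips existing NAs
    if (is.factor(col)) droplevels(col) else col  # remove the unused "" level
  })
  df
}

# Made-up example data
bar <- data.frame(
  n = c(1, 2, 3),
  f = factor(c("", "a", "b")),
  s = c("x", "", NA),
  stringsAsFactors = FALSE
)
foo <- blank_to_na(bar)
str(foo)
```

Column order and types are preserved, so there is no need to split the data frame and cbind() it back together.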

Pius answered 2/11, 2016 at 11:57 Comment(3)

is_empty is not a function, but I used b[b==""] = NA and that worked. – Libidinous 2/11, 2016 at 12:30

@TravisHeeter I used is_empty as an arbitrary placeholder. – Pius 2/11, 2016 at 12:57

Adapted above solution to: df[df == "NULL"] <- NA for my problem and it still works! +1 Thank you @KonradRudolph – Flyleaf 11/8, 2021 at 22:30

This worked for me

    df[df == 'NULL'] <- NA

Garber answered 22/6, 2019 at 23:56 Comment(0)

How about just:

df[apply(df, 2, function(x) x=="")] = NA

Works fine for me, at least on simple examples.

Edisonedit answered 2/11, 2016 at 11:52 Comment(4)

(1) "" ≠ NULL! (2) apply isn’t needed. – Pius 2/11, 2016 at 11:55

Agree with (2), I overcomplicated it :) But can you even have NULL values in R vectors?.. Anyway, OP's example function is looking for empty strings, so I figured that's what he wanted to replace. – Edisonedit 2/11, 2016 at 12:0

Admittedly having NULL values in tables is rare. It only works if the underlying (column) vector is a list. – Pius 2/11, 2016 at 12:4

Not so weird, at least not anymore. The tidyverse function pivot_wider puts NULL in for missing values. – Eolic 11/1, 2021 at 17:51

This is the function I used to solve this issue.

null_na <- function(vector) {
  # Start from a copy so the original type (character or factor) is kept
  new_vector <- vector
  for (i in seq_along(vector)) {
    # Test is.na() first: NA == "" evaluates to NA, which if() cannot handle
    if (!is.na(vector[i]) && vector[i] == "") {
      new_vector[i] <- NA
    }
  }
  return(new_vector)
}

Just plug in the column or vector you are having an issue with.

Cromorne answered 11/9, 2019 at 0:12 Comment(0)
