Function to change blanks to NA

4

12

I'm trying to write a function that turns empty strings into NA. A summary of one of my columns looks like this:

      a   b 
 12 210 468 

I'd like to change the 12 empty values to NA. I also have a few other factor columns for which I'd like to change empty values to NA, so I borrowed some stuff from here and there to come up with this:

# change nulls to NAs
nullToNA <- function(df){

  # split df into numeric & non-numeric columns
  a<-df[,sapply(df, is.numeric), drop = FALSE]
  b<-df[,sapply(df, Negate(is.numeric)), drop = FALSE]

  # Change empty strings to NA
  b<-b[lapply(b,function(x) levels(x) <- c(levels(x), NA) ),] # add NA level
  b<-b[lapply(b,function(x) x[x=="",]<- NA),]                 # change Null to NA

  # Put the columns back together
  d<-cbind(a,b)
  d[, names(df)]
}

However, I'm getting this error:

> foo<-nullToNA(bar)  
Error in x[x == "", ] <- NA : incorrect number of subscripts on matrix  
Called from: FUN(X[[i]], ...)

I have tried the answer found here: Replace all 0 values to NA but it changes all my columns to numeric values.

Libidinous answered 2/11, 2016 at 11:38 Comment(2)
Why not use the is.null() function instead of x == ""? Maybe there is nothing to be found. Have you checked whether levels() returns anything? You can check this by ignoring the function wrapper and running its body line by line on your data. - Eran
Possible duplicate of "Replace all 0 values to NA" - Overview
12

You can directly index fields that match a logical criterion. So you can just write:

df[is_empty(df)] = NA

Where is_empty is your comparison, e.g. df == "":

df[df == ""] = NA

But note that is.null(df) won’t work, and would be weird anyway [1]. I would advise against merging the logic for columns of different types, though! Instead, handle them separately.
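A minimal sketch of what handling the column types separately could look like. The data frame and its column names are made up for illustration, and this is only one possible approach. It also avoids the error from the question: a column is a one-dimensional vector, so it is indexed as `x[...]`, without the trailing comma used in `x[x == "", ]`.

```r
# Illustrative data frame; column names and values are assumptions
df <- data.frame(
  id    = 1:4,
  label = c("a", "", "b", ""),
  group = factor(c("x", "", "y", "x")),
  stringsAsFactors = FALSE
)

# Character columns: replace "" with NA directly
is_chr <- vapply(df, is.character, logical(1))
df[is_chr] <- lapply(df[is_chr], function(x) {
  x[!is.na(x) & x == ""] <- NA
  x
})

# Factor columns: set the "" level to NA, which also drops it from levels()
is_fct <- vapply(df, is.factor, logical(1))
df[is_fct] <- lapply(df[is_fct], function(x) {
  levels(x)[levels(x) == ""] <- NA
  x
})

# Numeric columns such as id are left untouched
str(df)
```

Because each type is handled on its own, numeric columns keep their type and the factor columns lose the empty level rather than gaining an extra one.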


[1] You’ll almost never encounter NULL inside a table, since that only works if the underlying vector is a list. You can create matrices and data.frames with this constraint, but then is.null(df) will never be TRUE, because the NULL values are wrapped inside the list.
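To illustrate the footnote, here is a small sketch with a list column that holds a NULL. The data frame and the names `val` and `is_missing` are assumptions for the example. Calling is.null() on the table tests the table itself, so the check has to be made per element instead.

```r
# Illustrative data frame with a list column; names are assumptions
df <- data.frame(id = 1:3)
df$val <- list("a", NULL, "c")

# FALSE: the data frame itself is not NULL
is.null(df)

# Per-element check inside the list column
is_missing <- vapply(df$val, is.null, logical(1))

# Replace the NULL entries with NA, then flatten to a plain character vector
val <- df$val
val[is_missing] <- list(NA_character_)
df$val <- unlist(val)
```

Flattening with unlist() only makes sense here because every remaining element has length one; a list column with longer elements would need different handling.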

Pius answered 2/11, 2016 at 11:57 Comment(3)
is_empty is not a function, but I used b[b==""] = NA and that worked. - Libidinous
@TravisHeeter I used is_empty as an arbitrary placeholder. - Pius
Adapted the above solution to df[df == "NULL"] <- NA for my problem, and it still works! +1 Thank you @KonradRudolph - Flyleaf
3

This worked for me

    df[df == 'NULL'] <- NA
Garber answered 22/6, 2019 at 23:56 Comment(0)
1

How about just:

df[apply(df, 2, function(x) x=="")] = NA

Works fine for me, at least on simple examples.

Edisonedit answered 2/11, 2016 at 11:52 Comment(4)
(1) "" ≠ NULL! (2) apply isn’t needed. - Pius
Agree with (2), I overcomplicated it :) But can you even have NULL values in R vectors? Anyway, OP's example function is looking for empty strings, so I figured that's what he wanted to replace. - Edisonedit
Admittedly, having NULL values in tables is rare. It only works if the underlying (column) vector is a list. - Pius
Not so weird, at least not anymore. The tidyverse function pivot_wider puts NULL in for missing values. - Eolic
0

This is the function I used to solve this issue.

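null_na <- function(vector) {
  # Start with all NA; only non-missing, non-empty values are copied over
  new_vector <- rep(NA, length(vector))
  for (i in seq_along(vector)) {
    # Test is.na() first: NA == "" evaluates to NA, which makes if() fail
    if (!is.na(vector[i]) && vector[i] != "") {
      new_vector[i] <- vector[i]
    }
  }
  return(new_vector)
}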

Just plug in the column or vector you are having an issue with.
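As a usage sketch, this is how a column could be passed through such a helper. The `blank_to_na` function below is a hypothetical vectorised variant, not the function from this answer, and the data frame is made up for illustration.

```r
# Hypothetical vectorised helper (an assumption, not the answer's null_na)
blank_to_na <- function(x) {
  # Guard with !is.na() so existing NA values stay as they are
  x[!is.na(x) & x == ""] <- NA
  x
}

# Illustrative data; column names are assumptions
df <- data.frame(grp = c("a", "", "b", NA), n = 1:4, stringsAsFactors = FALSE)

# Apply to the one column that has the problem
df$grp <- blank_to_na(df$grp)
df$grp
```

The logical index does the same job as the loop, and it keeps the input's type instead of building a new vector from scratch.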

Cromorne answered 11/9, 2019 at 0:12 Comment(0)
