Replace missing values (NA) with blank (empty string)

Asked 25/10, 2013 at 14:33 Answered 2/4, 2023 at 16:6

I have a dataframe with an NA row:

 df = data.frame(c("classA", NA, "classB"), t(data.frame(rep("A", 5), rep(NA, 5), rep("B", 5))))
 rownames(df) <- c(1,2,3)
 colnames(df) <- c("class", paste("Year", 1:5, sep = ""))

 > df
   class Year1 Year2 Year3 Year4 Year5
1 classA     A     A     A     A     A
2   <NA>  <NA>  <NA>  <NA>  <NA>  <NA>
3 classB     B     B     B     B     B

I introduced the empty row (NA row) on purpose because I wanted some space between the classA row and the classB row.

Now I would like to replace the <NA> values with blanks, so that the second row looks like an empty row.

I tried:

 df[is.na(df)] <- ""

and

 df[df == "NA"] <- ""

but neither worked.

Any ideas? Thanks!

Jacquelyn answered 25/10, 2013 at 14:33 Comment(13)

Your first attempt works just fine for me. What about it didn't work? – Willenewillet 25/10, 2013 at 14:38

I still see <NA> in the dataframe, the code doesn't seem to affect anything – Jacquelyn 25/10, 2013 at 14:38

It's to do with factors (of course!)... try str(df) (I jumped the gun on my answer!) – Therein 25/10, 2013 at 14:39

Gah. I basically have to forget I'm running stringsAsFactors = FALSE once every morning on SO. Listen to Simon. – Willenewillet 25/10, 2013 at 14:39

@SimonO101 Your answer is right on! Factors, I always forget about those.. Thanks! – Jacquelyn 25/10, 2013 at 14:41

By the way, never just say "it didn't work". You neglected to mention the six (!) warning messages you surely received upon running that code. The warning message should have been awfully suggestive, don't you think? – Willenewillet 25/10, 2013 at 14:42

@Jilber is right really. I typed up an embarrassingly wrong answer! Lucky SO doesn't keep the edit history, I deleted it so quick! (Hopefully) :-) – Therein 25/10, 2013 at 14:43

@Willenewillet I didn't receive a single error message... – Jacquelyn 25/10, 2013 at 14:45

The brackets around the <NA> indicate that they are not strings. Have a look HERE for more info. – Certify 25/10, 2013 at 14:45

@RicardoSaporta I should really remember it is that way round considering I upvoted that answer before. – Therein 25/10, 2013 at 14:46

@RicardoSaporta Thanks! Nice tip to remember! – Jacquelyn 25/10, 2013 at 14:48

I said warning, not error. They are different. And R 3.0.1 most definitely throws 6 warning messages upon running your code. – Willenewillet 25/10, 2013 at 14:50

Weird.. I didn't receive any warnings, nor errors. – Jacquelyn 25/10, 2013 at 14:54
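
The factor issue raised in the comments above can be reproduced with a small sketch. It assumes `stringsAsFactors = TRUE` to mimic the default before R 4.0.0 (newer versions default to `FALSE`, in which case the first attempt from the question works as is). The names `df_demo` and `df_try` are illustrative and not from the question.

```r
# Assumption: stringsAsFactors = TRUE mimics the default before R 4.0.0
df_demo <- data.frame(class = c("classA", NA, "classB"),
                      Year1 = c("A", NA, "B"),
                      stringsAsFactors = TRUE)
sapply(df_demo, class)            # every column is a factor

# "" is not an existing factor level, so the assignment warns and leaves NA
df_try <- df_demo
suppressWarnings(df_try[is.na(df_try)] <- "")
any(is.na(df_try))                # still TRUE

# Converting the columns to character first makes the replacement stick
df_demo[] <- lapply(df_demo, as.character)
df_demo[is.na(df_demo)] <- ""
```

Assigning through `df_demo[]` keeps the object a data.frame rather than turning it into a matrix or list.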

Another alternative:

df <- sapply(df, as.character) # your columns are factors; note that sapply() returns a character matrix
df[is.na(df)] <- 0

If you want blanks instead of zeroes:

> df <- sapply(df, as.character)
> df[is.na(df)] <- " "
> df
     class    Year1 Year2 Year3 Year4 Year5
[1,] "classA" "A"   "A"   "A"   "A"   "A"  
[2,] " "      " "   " "   " "   " "   " "  
[3,] "classB" "B"   "B"   "B"   "B"   "B"

If you want a data.frame, then just use as.data.frame:

> as.data.frame(df)
   class Year1 Year2 Year3 Year4 Year5
1 classA     A     A     A     A     A
2                                     
3 classB     B     B     B     B     B

Roede answered 25/10, 2013 at 14:38 Comment(2)

I thought " " is a space and "" is blank. Am I right? – Evulsion 29/4, 2020 at 17:31

Careful if you are replacing NAs with blanks (""): the conversion back to data.frame will introduce NAs again. I found that the safest is to replace NAs directly without converting the data frame to a character matrix. – Tauro 29/10, 2021 at 18:42
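
One way to replace the NAs directly, without going through a character matrix, could look like the sketch below. `blank_na` and `df_mixed` are hypothetical names introduced for illustration. The helper converts factor columns to character, blanks out their NAs, and leaves other column types (here a numeric column) untouched.

```r
# Hypothetical helper: blank out NAs in character/factor columns only
blank_na <- function(x) {
  if (is.factor(x)) x <- as.character(x)   # "" is not a valid factor level
  if (is.character(x)) x[is.na(x)] <- ""
  x
}

# Illustrative data with one factor column and one numeric column
df_mixed <- data.frame(class = c("classA", NA, "classB"),
                       score = c(1.5, NA, 2.5),
                       stringsAsFactors = TRUE)

# df_mixed[] <- ... replaces the columns but keeps the data.frame structure
df_mixed[] <- lapply(df_mixed, blank_na)
str(df_mixed)
```

Leaving numeric columns alone is a design choice: putting "" into them would coerce the whole column to character.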

This answer is more of an extended comment.

What you're trying to do isn't what I would consider good practice. R is not, say, Excel, so doing something like this just to create visual separation in your data is just going to give you a headache later on down the line.

If you really only cared about the visual output, I can offer two suggestions:

Use the na.print argument to print when you want to view the data with that visual separation.

print(df, na.print = "")
#    class Year1 Year2 Year3 Year4 Year5
# 1 classA     A     A     A     A     A
# 2                                     
# 3 classB     B     B     B     B     B

Realize that even the above is not the best suggestion. Get both visual and content separation by converting your data.frame to a list:

split(df, df$class)
# $classA
#    class Year1 Year2 Year3 Year4 Year5
# 1 classA     A     A     A     A     A
# 
# $classB
#    class Year1 Year2 Year3 Year4 Year5
# 3 classB     B     B     B     B     B

Chapen answered 25/10, 2013 at 17:55 Comment(1)

For na.print to work, the dataframe columns must be character. If they are not, convert the dataframe with dplyr::mutate(across(everything(), as.character)) – Transferor 9/12, 2021 at 12:52
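
A base R variant of the conversion mentioned in this comment might look like the following sketch. `df_num` and `df_chr` are illustrative names; the conversion is done on a copy so that the original data keeps its column types and only the printed view changes.

```r
# Illustrative data with a numeric column, whose NA would otherwise print as "NA"
df_num <- data.frame(class = c("classA", NA, "classB"),
                     score = c(1, NA, 3))

# Convert every column to character on a copy, then print with blank NAs
df_chr <- df_num
df_chr[] <- lapply(df_chr, as.character)
print(df_chr, na.print = "")
```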

Here is a dplyr option: mutate across all the columns (everything()) and, in each column (.x), replace the NA values with an empty string:

library(dplyr)
df %>%
  mutate(across(everything(), ~ replace(.x, is.na(.x), "")))
#>    class Year1 Year2 Year3 Year4 Year5
#> 1 classA     A     A     A     A     A
#> 2                                     
#> 3 classB     B     B     B     B     B

Created on 2023-04-02 with reprex v2.0.2

Roundish answered 2/4, 2023 at 16:6 Comment(0)
