I have a data frame containing entries; It appears that these values are not treated as NA since is.na returns FALSE. I would like to convert these values to NA but could not find the way.
The two classes where this is likely to be an issue are character and factor. This should loop over a dtaframe and convert the "NA" values into true <NA>
's but just for those two classes:
make.true.NA <- function(x) if(is.character(x)||is.factor(x)){
is.na(x) <- x=="NA"; x} else {
x}
df[] <- lapply(df, make.true.NA)
(Untested in the absence of a data example.) The use of the form: df_name[]
will attempt to retain the structure of the original dataframe which would otherwise lose its class attribute. I see that ujjwal thinks your spelling of NA has flanking "<>" characters so you might try this functions as more general:
make.true.NA <- function(x) if(is.character(x)||is.factor(x)){
is.na(x) <- x %in% c("NA", "<NA>"); x} else {
x}
dfrm[targets] <- lapply( dfrm[targets], make.true.NA) ; dfrm[targets] <- lapply( dfrm[targets], as.numeric)
–
Gottuard Use dfr[dfr=="<NA>"]=NA
where dfr
is your dataframe.
For example:
> dfr<-data.frame(A=c(1,2,"<NA>",3),B=c("a","b","c","d"))
> dfr
A B
1 1 a
2 2 b
3 <NA> c
4 3 d
> is.na(dfr)
A B
[1,] FALSE FALSE
[2,] FALSE FALSE
[3,] FALSE FALSE
[4,] FALSE FALSE
> dfr[dfr=="<NA>"] = NA **key step**
> is.na(dfr)
A B
[1,] FALSE FALSE
[2,] FALSE FALSE
[3,] TRUE FALSE
[4,] FALSE FALSE
The two classes where this is likely to be an issue are character and factor. This should loop over a dtaframe and convert the "NA" values into true <NA>
's but just for those two classes:
make.true.NA <- function(x) if(is.character(x)||is.factor(x)){
is.na(x) <- x=="NA"; x} else {
x}
df[] <- lapply(df, make.true.NA)
(Untested in the absence of a data example.) The use of the form: df_name[]
will attempt to retain the structure of the original dataframe which would otherwise lose its class attribute. I see that ujjwal thinks your spelling of NA has flanking "<>" characters so you might try this functions as more general:
make.true.NA <- function(x) if(is.character(x)||is.factor(x)){
is.na(x) <- x %in% c("NA", "<NA>"); x} else {
x}
dfrm[targets] <- lapply( dfrm[targets], make.true.NA) ; dfrm[targets] <- lapply( dfrm[targets], as.numeric)
–
Gottuard You can do this with the naniar package as well, using replace_with_na
and associated functions.
dfr <- data.frame(A = c(1, 2, "<NA>", 3), B = c("a", "b", "c", "d"))
library(naniar)
# dev version - devtools::install_github('njtierney/naniar')
is.na(dfr)
#> A B
#> [1,] FALSE FALSE
#> [2,] FALSE FALSE
#> [3,] FALSE FALSE
#> [4,] FALSE FALSE
dfr %>% replace_with_na(replace = list(A = "<NA>")) %>% is.na()
#> A B
#> [1,] FALSE FALSE
#> [2,] FALSE FALSE
#> [3,] TRUE FALSE
#> [4,] FALSE FALSE
# You can also specify how to do this for many variables
dfr %>% replace_with_na_all(~.x == "<NA>")
#> # A tibble: 4 x 2
#> A B
#> <int> <int>
#> 1 2 1
#> 2 3 2
#> 3 NA 3
#> 4 4 4
You can read more about using replace_with_na
here
© 2022 - 2024 — McMap. All rights reserved.
dput(your.data.frame[some.rows.that.contain.such.values,])
to your question. – Dumbstruckstr(your.data.frame)
would also be useful to let us see how the columns are stored. – Carbolated