In R, how can I replace NA with the previous character? [duplicate]
I have a data frame like the one below (with NA values):

md <- data.frame(cat=c('a','b','d',NA,'E',NA),
                subcat=c('A','C',NA,NA,NA,'D')) 

   cat subcat
1    a      A
2    b      C
3    d   <NA>
4 <NA>   <NA>
5    E   <NA>
6 <NA>      D

I want to replace each NA with the previous value in the same column, giving the result below.

A loop such as 'for ...' can do it, but that is not very efficient. Is there a function or package that can do this? Thanks!

  cat subcat
1   a      A
2   b      C
3   d      C
4   d      C
5   E      C
6   E      D
Slayton answered 12/6, 2021 at 0:29 Comment(2)
What happens if the first element of a column is NA? - Heterotypic
This question is only about the specific situation where the first row has no NA. - Slayton
S
6

You can use the na.locf function from the zoo package.

zoo::na.locf(md)
  cat subcat
1   a      A
2   b      C
3   d      C
4   d      C
5   E      C
6   E      D

Or use fill() from tidyr together with everything() from dplyr.

library(dplyr)
library(tidyr)

md %>% fill(everything())
#   cat subcat
# 1   a      A
# 2   b      C
# 3   d      C
# 4   d      C
# 5   E      C
# 6   E      D
Shortage answered 12/6, 2021 at 1:44 Comment(1)
This is great, thanks! - Slayton
V
1

One approach is to use run-length encoding with rle(). Because rle() does not group consecutive NAs into a single run (each NA becomes its own run), I first replace them with the string "NA".

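roll_na <- function(x) {
  # rle() would treat every NA as a separate run, so use a placeholder string
  x[is.na(x)] <- "NA"
  var <- rle(x)
  # positions of the runs made of the placeholder
  na_ind <- which(var$values == "NA")
  # value of the run immediately before each run (NA for the first run)
  var_lag <- c(NA, var$values[-length(var$values)])
  var$values[na_ind] <- var_lag[na_ind]

  # expand the runs back to the original length
  rep(var$values, times = var$lengths)
}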

library(dplyr)

md %>% 
  mutate(across(everything(), roll_na))

#   cat subcat
# 1   a      A
# 2   b      C
# 3   d      C
# 4   d      C
# 5   E      C
# 6   E      D
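
To see why the placeholder is needed, here is a small base R sketch of how rle() handles NAs with and without the replacement. The vector x and the variable names are made up for illustration; they are not part of the answer above.

```r
# Illustrative vector with three consecutive NAs (an assumption for this sketch).
x <- c("C", NA, NA, NA, "D")

# Without a placeholder, every NA ends up as its own run,
# because comparing NA with NA yields NA rather than FALSE.
raw_runs <- rle(x)
raw_runs$lengths
# [1] 1 1 1 1 1

# With a placeholder string, the three NAs collapse into a single run.
# Caveat: a genuine "NA" string in the data would be treated as missing too.
y <- x
y[is.na(y)] <- "NA"
placeholder_runs <- rle(y)
placeholder_runs$lengths
# [1] 1 3 1
placeholder_runs$values
# [1] "C"  "NA" "D"
```

Once the NAs form a single run, replacing that run's value with the value of the preceding run fills the whole gap in one step.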
Vaughnvaught answered 12/6, 2021 at 1:19 Comment(0)
H
0

Ignoring the case where NA is the initial value of a column, you can use the following function:

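# Replacement function: fill each NA with the value from the row above
func = function(DF){
    tmp = DF
    for(i in seq_len(ncol(tmp))){
        # start at row 2: a leading NA has no previous value and is left as is
        for(j in seq_len(nrow(tmp))[-1]){
            if (is.na(tmp[j,i])) {
                tmp[j,i] = tmp[j-1,i]
            }
        }
    }
    return(tmp)
}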

And doing

# data 
md = func(md)
print(md)

outputs

  cat subcat
1   a      A
2   b      C
3   d      C
4   d      C
5   E      C
6   E      D
Heterotypic answered 12/6, 2021 at 0:54 Comment(0)
T
0

This is not the way to go if you have lots of consecutive NAs in large column vectors, but it's fast if you only have a few:

no_NA <- function(x) {while(any(is.na(x))) x[is.na(x)] <- x[which(is.na(x))-1]; x}
as.data.frame(apply(md, 2, no_NA))

If you have a large dataset with lots of NAs, I would go with a simple while loop that fills the NAs in a single pass from the beginning of each vector:

no_NA <- function(x){
  len <- length(x); i <- 2
  while(i <= len){
    if (is.na(x[i])) x[i] <- x[i-1]
    i <- i + 1
  } 
  x
}
as.data.frame(apply(md, 2, no_NA))
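
For comparison, the same single pass can probably be written in base R without an explicit loop. This is only a sketch: fill_forward is a hypothetical helper name, not a function from any package, and it leaves leading NAs untouched, like the loop above.

```r
# Hypothetical helper (not from any package): forward-fill without an explicit loop.
fill_forward <- function(x) {
  # Position of the most recent non-NA value at or before each element (0 if none yet).
  idx <- cummax(seq_along(x) * (!is.na(x)))
  # Leading NAs have nothing to copy from, so they stay NA.
  idx[idx == 0] <- NA
  x[idx]
}

md <- data.frame(cat = c('a','b','d',NA,'E',NA),
                 subcat = c('A','C',NA,NA,NA,'D'),
                 stringsAsFactors = FALSE)

# Assigning into filled[] replaces every column but keeps the data.frame structure.
filled <- md
filled[] <- lapply(md, fill_forward)
filled
#   cat subcat
# 1   a      A
# 2   b      C
# 3   d      C
# 4   d      C
# 5   E      C
# 6   E      D
```

Using lapply() instead of apply() avoids the round trip through a character matrix, so column types are preserved.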
Telson answered 12/6, 2021 at 1:8 Comment(1)
Thanks for your help. - Slayton
S
0
library(tidyverse)
library(magrittr)
#> 
#> Attaching package: 'magrittr'
#> The following object is masked from 'package:purrr':
#> 
#>     set_names
#> The following object is masked from 'package:tidyr':
#> 
#>     extract

md <- data.frame(cat=c(NA,'b','d',NA,'E',NA),
                 subcat=c(NA,'C',NA,NA,NA,'D')) 

md
#>    cat subcat
#> 1 <NA>   <NA>
#> 2    b      C
#> 3    d   <NA>
#> 4 <NA>   <NA>
#> 5    E   <NA>
#> 6 <NA>      D

#if the first value is NA
value <- '0'

md <- 
    map(md, ~{
        if(is.na(.x[[1]])) {
            c(value, .x[-1])
        } else {
            .x
        }
    }) %>% bind_cols()

#while loop is needed for consecutive NA's
while (any(map_lgl(md, ~any(is.na(..1))))) { 
    md %<>% mutate(cat = if_else(is.na(cat), lag(cat), cat),
                   subcat = if_else(is.na(subcat), lag(subcat), subcat))
}

md
#> # A tibble: 6 x 2
#>   cat   subcat
#>   <chr> <chr> 
#> 1 0     0     
#> 2 b     C     
#> 3 d     C     
#> 4 d     C     
#> 5 E     C     
#> 6 E     D

Created on 2021-06-11 by the reprex package (v2.0.0)
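
As a rough base R alternative to the repeated lag() passes, the fill can also be done in one pass with Reduce(), seeding it with the same fallback value for a leading NA. The helper name fill_with_default is made up for this sketch, and the default of '0' simply mirrors the value used above.

```r
# Hypothetical helper: carry the last non-NA value forward in a single pass.
# `default` is used for any NAs at the start of the vector.
fill_with_default <- function(x, default = "0") {
  carried <- Reduce(function(prev, cur) if (is.na(cur)) prev else cur,
                    x, accumulate = TRUE, init = default)
  # The first element of the accumulated result is the seed itself, so drop it.
  carried[-1]
}

md <- data.frame(cat = c(NA,'b','d',NA,'E',NA),
                 subcat = c(NA,'C',NA,NA,NA,'D'),
                 stringsAsFactors = FALSE)

md[] <- lapply(md, fill_with_default)
md
#   cat subcat
# 1   0      0
# 2   b      C
# 3   d      C
# 4   d      C
# 5   E      C
# 6   E      D
```

Reduce() still iterates element by element internally, so this is mainly a readability choice rather than a speed-up.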

Selfabnegation answered 12/6, 2021 at 1:53 Comment(0)
