Why doesn't tidyr::fill() replace my NA values?

tidyr::fill() isn't filling values in my tibble. Here is a reprex:

library(tidyverse)
library(googlesheets4)

url <- "https://docs.google.com/spreadsheets/d/1q5gdePANXci8enuiS4oHUJxcxC13d6bjMRSicakychE/edit#gid=1437767505"

gd_orig <- read_sheet(url) 

gd_orig %>%
  select(State, Date, matches("^Tests")) %>% 
  group_by(State, Date) %>%
  arrange(State, Date) %>%
  fill(`Tests conducted (negative)`,
       `Tests conducted (total)`, .direction = "down") 

This produces:

# A tibble: 504 x 4
# Groups:   State, Date [455]
   State Date                `Tests conducted (negative)` `Tests conducted (total)`
   <chr> <dttm>                                     <dbl>                     <dbl>
 1 ACT   2020-03-12 00:00:00                           NA                        NA
 2 ACT   2020-03-13 00:00:00                           NA                        NA
 3 ACT   2020-03-14 00:00:00                           NA                        NA
 4 ACT   2020-03-16 00:00:00                           NA                        NA
 5 ACT   2020-03-18 00:00:00                           NA                        NA
 6 ACT   2020-03-18 00:00:00                           NA                        NA
 7 ACT   2020-03-19 00:00:00                         1853                      1857
 8 ACT   2020-03-20 00:00:00                         2056                        NA
 9 ACT   2020-03-21 00:00:00                         2212                      2221
10 ACT   2020-03-22 00:00:00                         2395                        NA

I expect the right-most column row 8 to have been replaced with 1857, and row 10 with 2221. What am I doing wrong that this isn't working?

Things I've tried that made no difference:

  • renaming the columns to legal data.frame names, e.g. test_neg and test_tot
  • explicitly setting .direction

session info:

> devtools::session_info()
- Session info ---------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 4.0.0 (2020-04-24)
 os       Windows >= 8 x64            
 system   x86_64, mingw32             
 ui       RStudio                     
 language (EN)                        
 collate  English_United States.1252  
 ctype    English_United States.1252  
 tz       Australia/Sydney            
 date     2020-05-02                  

- Packages -------------------------------------------------------------------------------------------------------
 package       * version date       lib source                               
 askpass         1.1     2019-01-13 [1] CRAN (R 4.0.0)                       
 assertthat      0.2.1   2019-03-21 [1] CRAN (R 4.0.0)                       
 audio           0.1-7   2020-03-09 [1] CRAN (R 4.0.0)                       
 backports       1.1.6   2020-04-05 [1] CRAN (R 4.0.0)                       
 beepr         * 1.3     2018-06-04 [1] CRAN (R 4.0.0)                       
 broom           0.5.6   2020-04-20 [1] CRAN (R 4.0.0)                       
 Cairo         * 1.5-12  2020-04-11 [1] CRAN (R 4.0.0)                       
 callr           3.4.3   2020-03-28 [1] CRAN (R 4.0.0)                       
 cellranger      1.1.0   2016-07-27 [1] CRAN (R 4.0.0)                       
 cli             2.0.2   2020-02-28 [1] CRAN (R 4.0.0)                       
 colorspace      1.4-1   2019-03-18 [1] CRAN (R 4.0.0)                       
 crayon          1.3.4   2017-09-16 [1] CRAN (R 4.0.0)                       
 curl            4.3     2019-12-02 [1] CRAN (R 4.0.0)                       
 DBI             1.1.0   2019-12-15 [1] CRAN (R 4.0.0)                       
 dbplyr          1.4.3   2020-04-19 [1] CRAN (R 4.0.0)                       
 desc            1.2.0   2018-05-01 [1] CRAN (R 4.0.0)                       
 devtools        2.3.0   2020-04-10 [1] CRAN (R 4.0.0)                       
 digest          0.6.25  2020-02-23 [1] CRAN (R 4.0.0)                       
 dplyr         * 0.8.5   2020-03-07 [1] CRAN (R 4.0.0)                       
 ellipsis        0.3.0   2019-09-20 [1] CRAN (R 4.0.0)                       
 evaluate        0.14    2019-05-28 [1] CRAN (R 4.0.0)                       
 extrafont     * 0.17    2014-12-08 [1] CRAN (R 4.0.0)                       
 extrafontdb     1.0     2012-06-11 [1] CRAN (R 4.0.0)                       
 fansi           0.4.1   2020-01-08 [1] CRAN (R 4.0.0)                       
 forcats       * 0.5.0   2020-03-01 [1] CRAN (R 4.0.0)                       
 frs           * 0.6.3   2020-04-25 [1] Github (ellisp/frs-r-package@6628329)
 fs              1.4.1   2020-04-04 [1] CRAN (R 4.0.0)                       
 gargle          0.4.0   2019-10-04 [1] CRAN (R 4.0.0)                       
 gdtools         0.2.2   2020-04-03 [1] CRAN (R 4.0.0)                       
 generics        0.0.2   2018-11-29 [1] CRAN (R 4.0.0)                       
 ggplot2       * 3.3.0   2020-03-05 [1] CRAN (R 4.0.0)                       
 ggrepel       * 0.8.2   2020-03-08 [1] CRAN (R 4.0.0)                       
 glue            1.4.0   2020-04-03 [1] CRAN (R 4.0.0)                       
 googlesheets4 * 0.1.1   2020-03-21 [1] CRAN (R 4.0.0)                       
 gtable          0.3.0   2019-03-25 [1] CRAN (R 4.0.0)                       
 haven           2.2.0   2019-11-08 [1] CRAN (R 4.0.0)                       
 hms             0.5.3   2020-01-08 [1] CRAN (R 4.0.0)                       
 htmltools       0.4.0   2019-10-04 [1] CRAN (R 4.0.0)                       
 httr            1.4.1   2019-08-05 [1] CRAN (R 4.0.0)                       
 jsonlite        1.6.1   2020-02-02 [1] CRAN (R 4.0.0)                       
 knitr           1.28    2020-02-06 [1] CRAN (R 4.0.0)                       
 lattice         0.20-41 2020-04-02 [2] CRAN (R 4.0.0)                       
 lifecycle       0.2.0   2020-03-06 [1] CRAN (R 4.0.0)                       
 lubridate       1.7.8   2020-04-06 [1] CRAN (R 4.0.0)                       
 magrittr        1.5     2014-11-22 [1] CRAN (R 4.0.0)                       
 memoise         1.1.0   2017-04-21 [1] CRAN (R 4.0.0)                       
 modelr          0.1.6   2020-02-22 [1] CRAN (R 4.0.0)                       
 munsell         0.5.0   2018-06-12 [1] CRAN (R 4.0.0)                       
 nlme            3.1-147 2020-04-13 [2] CRAN (R 4.0.0)                       
 openssl         1.4.1   2019-07-18 [1] CRAN (R 4.0.0)                       
 pillar          1.4.3   2019-12-20 [1] CRAN (R 4.0.0)                       
 pkgbuild        1.0.6   2019-10-09 [1] CRAN (R 4.0.0)                       
 pkgconfig       2.0.3   2019-09-22 [1] CRAN (R 4.0.0)                       
 pkgload         1.0.2   2018-10-29 [1] CRAN (R 4.0.0)                       
 prettyunits     1.1.1   2020-01-24 [1] CRAN (R 4.0.0)                       
 processx        3.4.2   2020-02-09 [1] CRAN (R 4.0.0)                       
 ps              1.3.2   2020-02-13 [1] CRAN (R 4.0.0)                       
 purrr         * 0.3.4   2020-04-17 [1] CRAN (R 4.0.0)                       
 R6              2.4.1   2019-11-12 [1] CRAN (R 4.0.0)                       
 Rcpp            1.0.4.6 2020-04-09 [1] CRAN (R 4.0.0)                       
 readr         * 1.3.1   2018-12-21 [1] CRAN (R 4.0.0)                       
 readxl          1.3.1   2019-03-13 [1] CRAN (R 4.0.0)                       
 remotes         2.1.1   2020-02-15 [1] CRAN (R 4.0.0)                       
 reprex          0.3.0   2019-05-16 [1] CRAN (R 4.0.0)                       
 rlang           0.4.5   2020-03-01 [1] CRAN (R 4.0.0)                       
 rmarkdown     * 2.1     2020-01-20 [1] CRAN (R 4.0.0)                       
 rprojroot       1.3-2   2018-01-03 [1] CRAN (R 4.0.0)                       
 rstudioapi      0.11    2020-02-07 [1] CRAN (R 4.0.0)                       
 Rttf2pt1        1.3.8   2020-01-10 [1] CRAN (R 4.0.0)                       
 rvest           0.3.5   2019-11-08 [1] CRAN (R 4.0.0)                       
 scales        * 1.1.0   2019-11-18 [1] CRAN (R 4.0.0)                       
 sessioninfo     1.1.1   2018-11-05 [1] CRAN (R 4.0.0)                       
 showtext      * 0.7-1   2020-01-27 [1] CRAN (R 4.0.0)                       
 showtextdb    * 2.0     2017-09-11 [1] CRAN (R 4.0.0)                       
 stringi         1.4.6   2020-02-17 [1] CRAN (R 4.0.0)                       
 stringr       * 1.4.0   2019-02-10 [1] CRAN (R 4.0.0)                       
 svglite       * 1.2.3   2020-02-07 [1] CRAN (R 4.0.0)                       
 sysfonts      * 0.8     2018-10-11 [1] CRAN (R 4.0.0)                       
 systemfonts     0.2.0   2020-04-16 [1] CRAN (R 4.0.0)                       
 testthat        2.3.2   2020-03-02 [1] CRAN (R 4.0.0)                       
 tibble        * 3.0.1   2020-04-20 [1] CRAN (R 4.0.0)                       
 tidyr         * 1.0.2   2020-01-24 [1] CRAN (R 4.0.0)                       
 tidyselect      1.0.0   2020-01-27 [1] CRAN (R 4.0.0)                       
 tidyverse     * 1.3.0   2019-11-21 [1] CRAN (R 4.0.0)                       
 usethis         1.6.0   2020-04-09 [1] CRAN (R 4.0.0)                       
 utf8            1.1.4   2018-05-24 [1] CRAN (R 4.0.0)                       
 vctrs           0.2.4   2020-03-10 [1] CRAN (R 4.0.0)                       
 withr           2.2.0   2020-04-20 [1] CRAN (R 4.0.0)                       
 xfun            0.13    2020-04-13 [1] CRAN (R 4.0.0)                       
 xml2            1.3.2   2020-04-23 [1] CRAN (R 4.0.0)                       
 yaml            2.2.1   2020-02-01 [1] CRAN (R 4.0.0)   
Chromato answered 1/5, 2020 at 23:1 Comment(3)
Not that it matters for the purpose of my question (now solved), but fill() is not a satisfactory way of replacing those NA values here anyway, as it results in total tests being less than negative tests in many instances. - Chromato
Just came upon this because I have the same problem and I am using fill(). I don't understand your rationale for saying that fill() is "not satisfactory here"? - Ethelynethene
I did get my script to work by removing the date from the group_by (from your answer below), but as above, I don't understand why that works. Care to comment? - Ethelynethene
O
1

The output shows that the NA values are at the beginning. With .direction = "down", fill() replaces each NA with the preceding non-NA value, so the ones at the top remain NA because nothing non-NA precedes them. We can change .direction to "downup" or "updown" instead. With "downup", it first fills downward (each NA takes the preceding non-NA value) and then upward (each remaining NA takes the succeeding non-NA value). Also, fill() works within groups, and with 'Date' as one of the grouping columns there are some groups with only NA, so fill() has nothing to fill with and returns NA. In this case we can group by 'State' only:

library(dplyr)
library(tidyr)
gd_orig %>%
    select(State, Date, matches("^Tests")) %>% 
    group_by(State) %>%
    arrange(State, Date) %>%
    fill(`Tests conducted (negative)`,
          `Tests conducted (total)`, .direction = "downup") 
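
To make the difference between the two directions concrete, here is a minimal sketch on a made-up column (the tibble `x` and its values are hypothetical, not taken from the sheet). It assumes tidyr 1.0.0 or later, where "downup" is available.

```r
library(tidyr)
library(tibble)

# Hypothetical column with leading NAs, like the top of the ACT rows
x <- tibble(value = c(NA, NA, 1857, NA, 2221, NA))

# "down": each NA takes the preceding non-NA value; leading NAs stay NA
down <- fill(x, value, .direction = "down")

# "downup": fill down first, then fill any remaining NA upward
downup <- fill(x, value, .direction = "downup")

down$value    # NA NA 1857 1857 2221 2221
downup$value  # 1857 1857 1857 1857 2221 2221
```

Whether filling the leading NAs upward is appropriate depends on the data; for cumulative test counts it may not be.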
Occur answered 1/5, 2020 at 23:1 Comment(6)
This doesn't work for me. Does it work for you? Perhaps it is a version thing. I will post my sessionInfo. - Chromato
@PeterEllis I noticed that you are also grouping by 'Date', and in your data some of those groups have only NA. What would you expect for those? I would remove 'Date' from the group_by. - Occur
@PeterEllis your versions are correct. It is the group_by and .direction. - Occur
Is there another circumstance in which this will not work? I have a plain dataframe, and want to fill down character values. No error, but also no filled values. Ugh. - Horrific
@Horrific please check whether you have NA or "NA". - Occur
@Occur That was it. Thanks! I set my column to NA like this: ifelse(column == "", NA, column) Thanks! - Horrific
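
The NA versus "NA" point in the comments above can be sketched as follows. The data frame `df` and its `label` column are hypothetical; `dplyr::na_if()` is used here as an alternative to the `ifelse()` call in the comment, which is an assumption about what suits the commenter's data.

```r
library(dplyr)
library(tidyr)

# Hypothetical character column: "" and "NA" are ordinary strings, not missing values
df <- tibble(id = 1:4, label = c("a", "", "NA", "b"))

# fill() only replaces real NA values, so nothing changes here
unchanged <- fill(df, label)

# Convert the placeholder strings to real NA first, then fill
cleaned <- df %>%
  mutate(label = na_if(label, ""),
         label = na_if(label, "NA")) %>%
  fill(label)

unchanged$label  # "a" ""  "NA" "b"
cleaned$label    # "a" "a" "a"  "b"
```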
C
0

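The error was that I was grouping by Date as well as State. With 455 State/Date groups across 504 rows, almost every group is a single row, so within its group the NA had nothing to be replaced with.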
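
A small sketch of why the grouping matters, using made-up data in place of the Google Sheet (the tibble `toy` and its `total` column are hypothetical):

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the sheet: one state, one row per date
toy <- tibble(
  State = c("ACT", "ACT", "ACT", "ACT"),
  Date  = as.Date(c("2020-03-19", "2020-03-20", "2020-03-21", "2020-03-22")),
  total = c(1857, NA, 2221, NA)
)

# Grouped by State and Date: every group is a single row,
# so there is no earlier value inside the group to carry down
by_state_date <- toy %>%
  arrange(State, Date) %>%
  group_by(State, Date) %>%
  fill(total, .direction = "down") %>%
  ungroup()

# Grouped by State only: each NA takes the previous value within the state
by_state <- toy %>%
  arrange(State, Date) %>%
  group_by(State) %>%
  fill(total, .direction = "down") %>%
  ungroup()

by_state_date$total  # 1857 NA 2221 NA
by_state$total       # 1857 1857 2221 2221
```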

Chromato answered 1/5, 2020 at 23:8 Comment(2)
It would have, but the .direction is a red herring. I will accept it as the answer now you've edited it to include the date. - Chromato
I do not think grouping matters. For instance, there is no (explicit) grouping in the example below, and still fill() doesn't work: test_list <- list(A = tibble(a1 = 1:2, a2 = c("alpha", NA)), B = tibble(b1 = c(1, NA), b2 = c(NA, "two"))); test_list %>% map(., ~ fill(., .direction = "downup")) - Danais
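
One possible reading of the example in the last comment: fill() is called there without naming any columns, and fill() only touches the columns selected in its `...` argument, so nothing is filled regardless of grouping. A minimal sketch on one of the commenter's tibbles, using the `everything()` selection helper to select all columns:

```r
library(tidyr)
library(tibble)

# Tibble "B" from the comment above
b <- tibble(b1 = c(1, NA), b2 = c(NA, "two"))

# No columns selected: the data comes back unchanged
no_cols <- fill(b, .direction = "downup")

# All columns selected: both NAs are filled
all_cols <- fill(b, everything(), .direction = "downup")

all_cols$b1  # 1 1
all_cols$b2  # "two" "two"
```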
C
0

I couldn't replicate your code without adding my email to the tidyverse API, but this should work. fill() takes the columns to fill as arguments. A subset of a dataframe wouldn't work. Instead, add the columns' indices. Here's an example:

library(tidyr)
a <- c(4, 4, 4, 4, 4, NA)
b <- c(2, 2, 2, 2, NA, 2)
c <- c(1, NA, 1, 1, 1, 1)
df <- data.frame(a, b, c)
tidyr::fill(df, 1:3, .direction = 'down')

Try something like this out!

Conley answered 23/10, 2022 at 23:58 Comment(1)
This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From Review - Cicelycicenia
