My data seems a bit different than other similar kind of posts.
box_num date x y
1-Q 2018-11-18 20.2 8
1-Q 2018-11-25 21.23 7.2
1-Q 2018-12-2 21.23 23
98-L 2018-11-25 0.134 9.3
98-L 2018-12-2 0.134 4
76-GI 2018-12-2 22.734 4.562
76-GI 2018-12-9 28 4.562
Here I would like to replace the repeated values with NA in both x and y columns. The code I have tried using dplyr :
(1)df <- df %>% group_by(box_num) %>% arrange(box_num,date) %>%
mutate(df$x[duplicated(df$x),] <- NA)
It creates a new column with all NA's instead of just replacing a repeated value with NA
(2)df <- df %>% group_by(box_num) %>% arrange(box_num,date) %>%
distinct(x,.keep_all = TRUE)
The second one just gives the rows that are not duplicated(we are missing the time series) Desired Output :
box_num date x y
1-Q 2018-11-18 20.2 8
1-Q 2018-11-25 21.23 7.2
1-Q 2018-12-2 NA 23
98-L 2018-11-25 0.134 9.3
98-L 2018-12-2 NA 4
76-GI 2018-12-2 22.734 4.562
76-GI 2018-12-9 28 NA