Spread an integer over several rows as many times as it is divided by a constant

I have a dataframe

       Date      repair     
 <date>           <dbl>        
 2018-07-01        4420    
 2018-07-02          NA   
 2018-07-03          NA
 2018-07-04          NA
 2018-07-05          NA

Where 4420 is time in minutes. I'm trying to get this:

       Date      repair     
 <date>           <dbl>        
 2018-07-01        1440    
 2018-07-02        1440   
 2018-07-03        1440
 2018-07-04         100
 2018-07-05          NA

Here 1440 is the number of minutes in one day and 100 is what is left over. I did it with a loop. Can this be achieved in a more elegant way?

Finney answered 6/2, 2019 at 13:52 Comment(4)
There could be many edge cases for this question, but for starters can you clarify: 1) What would your output be when repair = c(4420, NA, NA, 4420, NA)? 2) What would it be for repair = c(4420, 100, NA, 4420, NA)? Are these two inputs possible, or will they never occur? - Salena
They will never occur. Overlapping is impossible. - Finney
I'm sorry. This scenario repair = c(4420, 100, NA, 4420, NA) is possible. - Finney
@DmytroFedoriuk It might be best to ask a new question then. - Hunterhunting

With dplyr:

library(dplyr)

df %>%
  mutate(
    # full days of 1440 minutes, then the remainder, then NA padding up to n() rows
    repair = c(rep(1440, floor(repair[1] / 1440)),
               repair[1] %% 1440,
               rep(NA, n() - length(c(rep(1440, floor(repair[1] / 1440)), repair[1] %% 1440))))
  )

Output:

        Date repair
1 2018-07-01   1440
2 2018-07-02   1440
3 2018-07-03   1440
4 2018-07-04    100
5 2018-07-05     NA
Transcend answered 6/2, 2019 at 14:23 Comment(2)
It's a great solution! But if the first number of df is a multiple of 1440, such as 4320, your output will be 1440 1440 1440 0 NA, not 1440 1440 1440 NA NA. - Koniology
Thanks, you're completely right; however, I'm not sure what OP wants in this case, and from what I can see there are some additional requirements to the question (for which I feel additional elaboration is needed). - Transcend
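
If the trailing 0 pointed out in the comment above is not wanted, one possible variation is to drop a zero remainder before padding. This is only a sketch: the example data frame is rebuilt from the question with 4320 in place of 4420, the names `day`, `total`, `parts` and `out` are made up for illustration, and it assumes that NA (rather than 0) is the desired result for an exact multiple, which OP never confirmed.

```r
library(dplyr)

# Example data modelled on the question; 4320 is an exact multiple of 1440 (assumption)
df <- data.frame(
  Date   = as.Date("2018-07-01") + 0:4,
  repair = c(4320, NA, NA, NA, NA)
)

day <- 1440  # minutes in one day

out <- df %>%
  mutate(
    repair = {
      total <- repair[1]                                # only the first value is spread
      parts <- c(rep(day, total %/% day), total %% day) # full days, then the remainder
      parts <- parts[parts > 0]                         # drop a zero remainder
      parts[seq_len(n())]                               # pad with NA (or cut) to n() rows
    }
  )

out$repair
```

Indexing with seq_len(n()) also avoids the error that rep(NA, ...) would raise with a negative count when the value needs more rows than the data frame has; in that case the extra minutes are silently cut off, which may or may not be what you want.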

You could write a little function for that task

f <- function(x, y, length_out) {
  remainder <- x %% y 
  if(remainder == 0) {
    `length<-`(rep(y, x %/% y), length_out)
  } else {
    `length<-`(c(rep(y, x %/% y), remainder), length_out)
  }
}

Input

x <- 4420
y <- 24 * 60

Result

f(x, y, length_out = 10)
# [1] 1440 1440 1440  100   NA   NA   NA   NA   NA   NA

length_out should probably be equal to nrow(your_data)
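
As a rough illustration of that last point, this is how f() might be applied to the data frame from the question. The data frame is rebuilt here by hand, and the assumption is that only the first value of repair has to be spread.

```r
# f() as defined above, repeated so the block runs on its own
f <- function(x, y, length_out) {
  remainder <- x %% y
  if (remainder == 0) {
    `length<-`(rep(y, x %/% y), length_out)
  } else {
    `length<-`(c(rep(y, x %/% y), remainder), length_out)
  }
}

# Example data rebuilt from the question
df <- data.frame(
  Date   = as.Date("2018-07-01") + 0:4,
  repair = c(4420, NA, NA, NA, NA)
)

# Spread the first value over the column, padding with NA up to nrow(df)
df$repair <- f(df$repair[1], 24 * 60, length_out = nrow(df))
df
```

Note that `length<-` also shortens the vector, so a value that needs more days than there are rows would be truncated without a warning.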

Hunterhunting answered 6/2, 2019 at 14:5 Comment(0)

A recursive solution:

fun <- function(x, y, i = 0){
  if(x <= y) c(rep(y, i), x) else fun(x-y, y, i+1)
}

fun(4420, 1440)[1:nrow(df)]
# [1] 1440 1440 1440  100   NA
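
A quick check of two edge cases, as a sketch: because the stopping condition is x <= y, an exact multiple ends with a full 1440 rather than a 0, and a value below one day is returned as is. The name `n_rows` is a stand-in for nrow(df).

```r
# fun() as defined above, repeated so the block runs on its own
fun <- function(x, y, i = 0) {
  if (x <= y) c(rep(y, i), x) else fun(x - y, y, i + 1)
}

n_rows <- 5  # stands in for nrow(df) from the question

exact <- fun(4320, 1440)[seq_len(n_rows)]  # exact multiple of 1440
short <- fun(100, 1440)[seq_len(n_rows)]   # less than one day

exact
short
```

The recursion goes one level deeper per full day, so this is fine for values spanning a few days but is not a good fit for very large inputs.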
Koniology answered 6/2, 2019 at 15:11 Comment(0)
