Searching for linear interpolation of time series data in R, I often found recommendations to use na.approx() from the zoo package.
However, with irregular time series I ran into problems: the interpolated values are spaced evenly across the gaps by row position, without taking the timestamp associated with each value into account.
I found a workaround using approxfun(), but I wonder whether there is a cleaner solution, ideally based on tsibble objects with functions from the tidyverts package family?
Previous answers relied on expanding the irregular date grid to a regular grid by filling the gaps. However, this causes problems when the time of day should be taken into account during interpolation.
Here is a (revised) minimal example with POSIXct timestamps rather than Dates only:
library(tidyverse)
library(zoo)
df <- tibble(date = as.POSIXct(c("2000-01-01 00:00", "2000-01-02 02:00", "2000-01-05 00:00")),
             value = c(1, NA, 2))
df %>%
  mutate(value_int_wrong = na.approx(value),
         value_int_correct = approxfun(date, value)(date))
# A tibble: 3 x 4
  date                value value_int_wrong value_int_correct
  <dttm>              <dbl>           <dbl>             <dbl>
1 2000-01-01 00:00:00     1             1                1
2 2000-01-02 02:00:00    NA             1.5              1.27
3 2000-01-05 00:00:00     2             2                2
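
To make the difference between the two columns explicit, both interpolated values can be reproduced by hand in base R. This is only a sketch: the names secs, frac_pos and frac_time are made up for illustration, and I fix the time zone to UTC (an assumption, not part of the example above) so that the hour differences are unambiguous.

```r
# Timestamps and values from the example; tz = "UTC" is an assumption
date  <- as.POSIXct(c("2000-01-01 00:00", "2000-01-02 02:00", "2000-01-05 00:00"),
                    tz = "UTC")
value <- c(1, NA, 2)

# Position-based weight: the NA sits in row 2 of 3, i.e. halfway between rows 1 and 3
frac_pos <- (2 - 1) / (3 - 1)

# Time-based weight: elapsed time up to the NA relative to the whole gap
secs      <- as.numeric(date)                          # seconds since the epoch
frac_time <- (secs[2] - secs[1]) / (secs[3] - secs[1]) # 26 h out of 96 h

# Linear interpolation between the two known values with each weight
by_position <- value[1] + frac_pos  * (value[3] - value[1])
by_time     <- value[1] + frac_time * (value[3] - value[1])

by_position
by_time
```

The position-based weight ignores that the missing observation lies only 26 hours into a 96-hour gap, which is why it lands exactly halfway between the neighbouring values, while the time-based weight gives the value I would expect.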
Any ideas how to (efficiently) deal with this? Thanks for your support!