Generating a time series in 10 Hz

I would like to interpolate a time series so that the timestamps fall exactly on a 10 Hz grid (one sample every 0.1 s). A first attempt might look like this:

library(tidyverse) 
library(zoo)

options(digits.secs = 3, pillar.sigfig = 6)

data <- tribble(
  ~timestamp, ~value, 
  "09/12/2024 00:05:35.677", 139.664,
  "09/12/2024 00:05:35.776", 138.706,
  "09/12/2024 00:05:35.876", 143.348,
  "09/12/2024 00:05:35.975", 141.516,
  "09/12/2024 00:05:36.074", 136.731,
  "09/12/2024 00:05:36.174", 138.275,
  "09/12/2024 00:05:36.273", 143.015) %>%
  mutate(timestamp = mdy_hms(timestamp))

start <- min(data$timestamp) %>% round_date("0.1 sec")
end   <- max(data$timestamp) %>% round_date("0.1 sec")

data_10Hz <- data %>%
  complete(timestamp = seq.POSIXt(start, end, by = .100)) %>%
  arrange(timestamp) 

data_10Hz

#   timestamp                 value
#   <dttm>                    <dbl>
# 1 2024-09-12 00:05:35.677 139.664
# 2 2024-09-12 00:05:35.700  NA    
# 3 2024-09-12 00:05:35.776 138.706
# 4 2024-09-12 00:05:35.799  NA    
# 5 2024-09-12 00:05:35.875 143.348
# 6 2024-09-12 00:05:35.900  NA    
# 7 2024-09-12 00:05:35.974 141.516
# 8 2024-09-12 00:05:36.000  NA    
# 9 2024-09-12 00:05:36.073 136.731
#10 2024-09-12 00:05:36.100  NA    
#11 2024-09-12 00:05:36.174 138.275
#12 2024-09-12 00:05:36.200  NA    
#13 2024-09-12 00:05:36.273 143.015

data_10Hz <- data_10Hz  %>%
  mutate(value = na.approx(value)) %>%
  filter(timestamp == round_date(timestamp, "0.1 sec"))

data_10Hz

#   timestamp                 value
#   <dttm>                    <dbl>
# 1 2024-09-12 00:05:35.700 139.185
# 2 2024-09-12 00:05:35.799 141.027
# 3 2024-09-12 00:05:35.900 142.432
# 4 2024-09-12 00:05:36.000 139.123
# 5 2024-09-12 00:05:36.200 140.645

But the result is not strictly 10 Hz (because of how the numbers are represented internally), and the approach is probably slow for large data sets.

Do you know a cleaner or more efficient way to do this kind of interpolation?

Best wishes Christof

Osteoma answered 18/9 at 14:26 Comment(2)
There are packages that can deal with nanosecond precision - #64507445 - that may be helpful. - Bismuthinite
Could you explain a little more what you want the output to look like? Is the main issue with exact timing? Is linear interpolation an acceptable approximation? How big is your data and how do you weigh up speed and precision? - Ciel

Linear interpolation can be done with the approx function from the stats package, which accepts date-times as its x values.

A possible issue with your code is that you call zoo::na.approx without an x argument. In that case it interpolates against the index of the object rather than against the timestamps, as the function documentation says:

By default the index associated with object is used for interpolation.

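For a plain vector that index is just the row position, so the unevenly spaced rows are treated as if they were evenly spaced. My solution should give the same values as zoo::na.approx would if you passed the timestamps as x, but it should run faster because it needs neither complete() nor filter().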

library(tidyverse) 

options(digits.secs = 3, pillar.sigfig = 6)

data <- tribble(
  ~timestamp, ~value, 
  "09/12/2024 00:05:35.677", 139.664,
  "09/12/2024 00:05:35.776", 138.706,
  "09/12/2024 00:05:35.876", 143.348,
  "09/12/2024 00:05:35.975", 141.516,
  "09/12/2024 00:05:36.074", 136.731,
  "09/12/2024 00:05:36.174", 138.275,
  "09/12/2024 00:05:36.273", 143.015) %>%
  mutate(timestamp = mdy_hms(timestamp))

start <- min(data$timestamp) %>% round_date("0.1 sec")
end   <- max(data$timestamp) %>% round_date("0.1 sec")

data_10_hz <- tibble(
  timestamp = seq(start, end, by = 0.1),
  value = approx(data$timestamp, data$value, timestamp)$y # interpolation with `approx`
)

data_10_hz
#> # A tibble: 6 × 2
#>   timestamp                 value
#>   <dttm>                    <dbl>
#> 1 2024-09-12 00:05:35.700 139.441
#> 2 2024-09-12 00:05:35.799 139.820
#> 3 2024-09-12 00:05:35.900 142.904
#> 4 2024-09-12 00:05:36.000 140.308
#> 5 2024-09-12 00:05:36.100 137.132
#> 6 2024-09-12 00:05:36.200 139.520

Created on 2024-09-20 with reprex v2.1.0
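
The point about the x argument can be checked with a small sketch. It is not part of the original answer: the times are written as plain second offsets from 00:05:35 rather than as date-times, and all variable names are made up for illustration. It compares na.approx without x, na.approx with x, and approx on the same data.

```r
library(zoo)

# Seconds after 00:05:35 for the seven observations in the question
t_obs <- c(0.677, 0.776, 0.876, 0.975, 1.074, 1.174, 1.273)
v_obs <- c(139.664, 138.706, 143.348, 141.516, 136.731, 138.275, 143.015)

# Target 10 Hz grid, built from integer tenths
t_grid <- (7:12) / 10

# Merge observed and grid times, with NA values at the grid points
t_all <- c(t_obs, t_grid)
v_all <- c(v_obs, rep(NA_real_, length(t_grid)))
ord   <- order(t_all)
t_all <- t_all[ord]
v_all <- v_all[ord]

# Without x: interpolation by row position (neighbours are simply averaged)
by_index <- na.approx(v_all, na.rm = FALSE)
# With x: interpolation by the actual time of each row
by_time  <- na.approx(v_all, x = t_all, na.rm = FALSE)

is_grid <- is.na(v_all)
cmp <- data.frame(
  t         = t_all[is_grid],
  by_index  = by_index[is_grid],
  by_time   = by_time[is_grid],
  by_approx = approx(t_obs, v_obs, xout = t_grid)$y
)
cmp
```

In this sketch the by_time and by_approx columns agree, while by_index reproduces the midpoint values (139.185 and so on) from the question's output.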

Discretionary answered 20/9 at 18:32 Comment(1)
Updated with more detail :) - Discretionary

If the numeric representation really bothers you (for example, because the timestamps are not strictly on the 10 Hz grid), you can store the timestamps as chr instead of dttm. The DateTimeClasses page in R's manual says:

Classes "POSIXct" and "POSIXlt" are able to express fractions of a second. (Conversion of fractions between the two forms may not be exact, but will have better than microsecond accuracy.)

Fractional seconds are printed only if options("digits.secs") is set: see strftime.

That means you need to go beyond the built-in POSIXct and POSIXlt classes if you want exact timestamps. One option is to use Python as a helper.

For example, you can define a custom formatter in Python and then call it via reticulate::source_python:

library(reticulate)
source_python("FmtHelper.py")

# interpolation function with given `data`
fInterp <- with(data, approxfun(timestamp, value))
with(
    data,
    tibble(
        timestamp = seq(
            min(timestamp) %>% round_date("0.1 sec"),
            max(timestamp) %>% round_date("0.1 sec"),
            by = 0.1
        )
    )
) %>%
    mutate(value = fInterp(timestamp)) %>%
    mutate(timestamp = timeFormatter(as.list(timestamp))) %>% # format the `timestamp` column to `chr` type
    as_tibble()

where FmtHelper.py converts the datetime array to an array of strings in Python:

import pandas as pd

def timeFormatter(x):
    return list(
        pd.to_datetime(x).tz_localize(None).astype("datetime64[ms]").astype(str)
    )

This gives:

# A tibble: 6 × 2
  timestamp                 value
  <chr>                     <dbl>
1 2024-09-12 00:05:35.700 139.441
2 2024-09-12 00:05:35.800 139.820
3 2024-09-12 00:05:35.900 142.904
4 2024-09-12 00:05:36.000 140.308
5 2024-09-12 00:05:36.100 137.132
6 2024-09-12 00:05:36.200 139.520
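
If a Python dependency is not wanted, a similar chr column could probably be produced in R alone. The sketch below is an assumption-laden alternative, not part of the answer above: format_ms is a hypothetical helper name, and the grid is built in UTC for simplicity. It rounds each timestamp to whole milliseconds before building the string, so a value stored as 35.7999999 is printed as 35.800.

```r
# Hypothetical helper (not from the answer): format POSIXct values with
# exactly three decimals by rounding to whole milliseconds first.
format_ms <- function(x, tz = "UTC") {
  ms_total <- round(as.numeric(x) * 1000)  # whole milliseconds since the epoch
  secs <- ms_total %/% 1000                # whole-second part
  ms   <- ms_total %% 1000                 # millisecond remainder, 0..999
  base <- as.POSIXct(secs, origin = "1970-01-01", tz = tz)
  paste0(format(base, "%Y-%m-%d %H:%M:%S", tz = tz),
         sprintf(".%03d", as.integer(ms)))
}

# The same 10 Hz grid as in the examples, built here in UTC
grid   <- as.POSIXct("2024-09-12 00:05:35", tz = "UTC") + (7:12) / 10
labels <- format_ms(grid)
labels
```

The result is a character vector, so it is only meant for display or export; any further time arithmetic should still be done on the original dttm column.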
Suffer answered 23/9 at 10:48 Comment(0)
