tsibble -- how do you get around implicit gaps when there are none

I am new to the tsibble package. I have monthly data that I coerced to a tsibble so that I can use the fable package. I am having two issues:

  • The index variable does not appear (from my testing) to be of class Date, even though I applied lubridate's ymd() function to it.
  • has_gaps() returns FALSE, but when I fit a model to the data I get the error ".data contains implicit gaps in time".

library(dplyr)
library(fable)
library(lubridate)
library(tsibble)

test <- data.frame(
   YearMonth = c(20160101, 20160201, 20160301, 20160401, 20160501, 20160601,
                 20160701, 20160801, 20160901, 20161001, 20161101, 20161201),
      Claims = c(13032647, 1668005, 24473616, 13640769, 17891432, 11596556,
                 23176360, 7885872, 11948461, 16194792, 4971310, 18032363),
     Revenue = c(12603367, 18733242, 5862766, 3861877, 15407158, 24534258,
                 15633646, 13720258, 24944078, 13375742, 4537475, 22988443)
)

test_ts <- test %>% 
  mutate(YearMonth = ymd(YearMonth)) %>% 
  as_tsibble(
    index = YearMonth,
    regular = FALSE       #because it picks up gaps when I set it to TRUE
    )

# Are there any gaps?
has_gaps(test_ts, .full = TRUE)

model_new <- test_ts %>% 
  model(
  snaive = SNAIVE(Claims))
Warning messages:
1: 1 error encountered for snaive
[1] .data contains implicit gaps in time. You should check your data and convert implicit gaps into explicit missing values using `tsibble::fill_gaps()` if required.

Any help will be appreciated.

Domination answered 31/12, 2019 at 1:51 Comment(0)

Your index is a Date, which tsibble treats as a daily index, but you want a monthly index. The simplest fix is to use the tsibble::yearmonth() function, but you will need to convert the numeric YearMonth values to character first.

library(dplyr)
library(tsibble)

test <- data.frame(
  YearMonth = c(20160101, 20160201, 20160301, 20160401, 20160501, 20160601,
    20160701, 20160801, 20160901, 20161001, 20161101, 20161201),
  Claims = c(13032647, 1668005, 24473616, 13640769, 17891432, 11596556,
    23176360, 7885872, 11948461, 16194792, 4971310, 18032363),
  Revenue = c(12603367, 18733242, 5862766, 3861877, 15407158, 24534258,
    15633646, 13720258, 24944078, 13375742, 4537475, 22988443)
)

test_ts <- test %>%
  mutate(YearMonth = yearmonth(as.character(YearMonth))) %>%
  as_tsibble(index = YearMonth)
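
To see the difference between the two index types, here is a minimal sketch that builds the same data both ways and compares what tsibble infers. It uses a shortened stand-in for the question's data, and it assumes yearmonth() accepts a Date directly (so ymd() followed by yearmonth() is an alternative to the character conversion above). The names daily_ts and monthly_ts are only for illustration.

```r
library(dplyr)
library(lubridate)
library(tsibble)

# Shortened stand-in for the question's data (first three months only).
test_small <- data.frame(
  YearMonth = c(20160101, 20160201, 20160301),
  Claims    = c(13032647, 1668005, 24473616)
)

# Date index: tsibble infers a daily interval, so the days between
# the first of each month count as implicit gaps.
daily_ts <- test_small %>%
  mutate(YearMonth = ymd(YearMonth)) %>%
  as_tsibble(index = YearMonth)

# yearmonth index: tsibble infers a monthly interval, so consecutive
# months are adjacent observations.
monthly_ts <- test_small %>%
  mutate(YearMonth = yearmonth(ymd(YearMonth))) %>%
  as_tsibble(index = YearMonth)

interval(daily_ts)    # inferred interval for the Date index
interval(monthly_ts)  # inferred interval for the yearmonth index

has_gaps(daily_ts)    # reports gaps for the Date index
has_gaps(monthly_ts)  # reports no gaps for the yearmonth index
```

Printing the tsibble also shows the inferred interval in its header, which is a quick way to confirm the index is being read as monthly before modelling.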
Datcha answered 31/12, 2019 at 4:3 Comment(0)

It looks like as_tsibble() isn't able to recognize the interval properly in the YearMonth column because it is a Date class object. The 'Index' section of the help page hints that this might be the problem:

For a tbl_ts of regular interval, a choice of index representation has to be made. For example, a monthly data should correspond to time index created by yearmonth or zoo::yearmon, instead of Date or POSIXct.

As that excerpt suggests, you can get around the problem with yearmonth(). But that requires a little string manipulation first to get the values into a format that will parse properly.

test_ts <- test %>% 
  mutate(YearMonth = gsub("(.{2})01$", "-\\1", YearMonth) %>% 
           yearmonth()
         ) %>%
  as_tsibble(
    index = YearMonth
  )

Now the model should run error free! Not sure why the has_gaps() test is saying everything is okay in your example...
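
One possible explanation for the has_gaps() result, offered as an assumption rather than a confirmed diagnosis: the tsibble in the question was built with regular = FALSE, and my understanding is that tsibble's gap helpers only look for gaps in regular tsibbles. The sketch below, using a shortened stand-in for the question's data and illustrative object names, lets you check this on your own installation by comparing the irregular and regular versions of the same Date-indexed data.

```r
library(dplyr)
library(lubridate)
library(tsibble)

# Shortened stand-in for the question's data (first three months only).
test_small <- data.frame(
  YearMonth = c(20160101, 20160201, 20160301),
  Claims    = c(13032647, 1668005, 24473616)
)

# Same Date index, built once as irregular (as in the question)
# and once as regular (the default).
irregular_ts <- test_small %>%
  mutate(YearMonth = ymd(YearMonth)) %>%
  as_tsibble(index = YearMonth, regular = FALSE)

regular_ts <- test_small %>%
  mutate(YearMonth = ymd(YearMonth)) %>%
  as_tsibble(index = YearMonth)

is_regular(irregular_ts)  # FALSE: no fixed interval is recorded
is_regular(regular_ts)    # TRUE: a daily interval is inferred

# Compare the gap checks for the two versions.
has_gaps(irregular_ts)
has_gaps(regular_ts)
```

If the two has_gaps() calls disagree, the FALSE in the question reflects the regular = FALSE setting rather than the absence of gaps at the daily interval.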

Cattish answered 31/12, 2019 at 2:27 Comment(1)
Small note: the interval of one day is appropriate for monthly data represented with date classes. This is because months have an irregular number of days, so when the data are represented as daily measurements the greatest common divisor is one day. – Chloral
