tidyverts/fasster

stream()/forecast() fails for short new_data

vshulyak opened this issue · 2 comments

Hi @mitchelloharawild,

When using a switching model, both stream() and forecast() parse the new_data in a way that is inconsistent with the already trained model.

Consider the following model:

library(dplyr)
library(lubridate)
library(tsibble)
library(fasster)

elec_tr <- tsibbledata::vic_elec %>%
  filter(
    yearmonth(Time) == yearmonth("2012 Mar")
  ) %>% 
  mutate(WorkDay = wday(Time) %in% 2:6 & !Holiday)

elec_fit <- elec_tr %>%
  model(
    fasster = fasster(Demand ~ 
      WorkDay %S% (trig(48, 16) + poly(1)) + Temperature + I(Temperature^2)
    )
  )

elec_fit

The model is trained on the boolean switch WorkDay. However, if the new_data input is too short to cover every value of the switch, as in this case:

elec_update <- tsibbledata::vic_elec %>%
  filter(
    yearmonth(Time) == yearmonth("2012 Apr")
  ) %>% 
  head(2) %>%
  mutate(WorkDay = wday(Time) %in% 2:6 & !Holiday)

elec_fit_upd <- elec_fit %>% stream(elec_update)
elec_fit_upd

then the internal DLM states and X have the wrong dimensions, because they don't account for switch values that are absent from new_data (say, holidays/weekends when the data contains only weekdays):

Error in rbind(object$dlm$X, X) : 
  number of columns of matrices must match (see arg 2)
Calls: <Anonymous> ... .f -> stream.mdl_ts -> stream -> stream.FASSTER -> rbind

Just to try out the streaming functionality, I worked around it by patching stream() to always use a longer new_data and then tail()'ing X/est/residuals where needed. Obviously, this is extremely ugly.

Not sure how to fix it at the moment. Am I missing something? If you have any ideas to share, I'd love to try fixing it and submitting a PR.

Thanks!

This seems to be due to the generation of specials on short datasets that don't include all switching terms.
If WorkDay is always TRUE, then the varied states for FALSE will not be created. However, forecast() and stream() expect them to exist, so the dimensions do not match.
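
To illustrate the mismatch outside of fasster, here is a minimal base R sketch. `switch_design()` is a hypothetical stand-in for how a switching term expands into one column per observed level; it is not fasster's actual code, and the column names are made up for the example.

```r
# Hypothetical helper (not part of fasster): build one indicator column
# per switch level that is actually observed in the data.
switch_design <- function(switch) {
  switch <- as.character(switch)
  lvls <- sort(unique(switch))
  X <- vapply(lvls, function(l) as.numeric(switch == l), numeric(length(switch)))
  matrix(X, nrow = length(switch),
         dimnames = list(NULL, paste0("WorkDay_", lvls)))
}

# Training data sees both levels; the short update sees only one.
X_train <- switch_design(c(TRUE, TRUE, FALSE, FALSE, TRUE))
X_new   <- switch_design(c(TRUE, TRUE))

ncol(X_train)  # two columns: WorkDay_FALSE and WorkDay_TRUE
ncol(X_new)    # one column: WorkDay_TRUE only

# Stacking the two reproduces the same kind of failure as in stream().
res <- tryCatch(rbind(X_train, X_new), error = function(e) conditionMessage(e))
print(res)
```

The design matrix built from the short update has fewer columns than the one stored in the fitted model, so `rbind()` refuses to stack them.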

I think the required change is to write a function that compares the generated specials with the states of a fitted model and fills in the missing states. Methods that use new_data, such as stream() and forecast(), can then use this function to repair the incomplete structure of the specials.
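
A rough sketch of that repair step, on plain matrices rather than fasster's internal objects. `fill_missing_states()` and the column names are hypothetical, and zero-filling the absent columns is an assumption about what a "missing" switched state should contribute; the actual fix in the package may look quite different.

```r
# Hypothetical helper (not fasster's API): align a design matrix built
# from new_data with the columns (states) known to the fitted model.
fill_missing_states <- function(X_new, fitted_cols) {
  unknown <- setdiff(colnames(X_new), fitted_cols)
  if (length(unknown) > 0) {
    stop("new_data produced states not in the fitted model: ",
         paste(unknown, collapse = ", "))
  }
  # Start from zeros so that states absent from new_data contribute nothing
  # (an assumption made for this sketch).
  X_full <- matrix(0, nrow = nrow(X_new), ncol = length(fitted_cols),
                   dimnames = list(NULL, fitted_cols))
  X_full[, colnames(X_new)] <- X_new
  X_full
}

fitted_cols <- c("WorkDay_FALSE", "WorkDay_TRUE")
X_train <- matrix(c(0, 0, 1, 1,
                    1, 1, 0, 0),
                  ncol = 2, dimnames = list(NULL, fitted_cols))
X_new <- matrix(c(1, 1), ncol = 1, dimnames = list(NULL, "WorkDay_TRUE"))

X_fixed <- fill_missing_states(X_new, fitted_cols)
X_all <- rbind(X_train, X_fixed)  # now stacks without error
dim(X_all)
```

Matching by column name rather than position keeps the repair independent of the order in which the specials happen to be generated.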

All fixed, thanks!

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(fasster)
#> Loading required package: fabletools
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tsibble)
#> 
#> Attaching package: 'tsibble'
#> The following object is masked from 'package:lubridate':
#> 
#>     interval
elec_tr <- tsibbledata::vic_elec %>%
  filter(
    yearmonth(Time) == yearmonth("2012 Mar")
  ) %>% 
  mutate(WorkDay = wday(Time) %in% 2:6 & !Holiday)

elec_fit <- elec_tr %>%
  model(
    fasster = fasster(Demand ~ 
                        WorkDay %S% (trig(48, 16) + poly(1)) + Temperature + I(Temperature^2)
    )
  )
#> Warning: 'poly' is deprecated.
#> Use 'trend' instead.
#> See help("Deprecated")
#> Warning: 'trig' is deprecated.
#> Use 'fourier' instead.
#> See help("Deprecated")

elec_fit %>% 
  forecast(tsibbledata::vic_elec %>%
             filter(
               yearmonth(Time) == yearmonth("2012 Apr")
             ) %>% 
             mutate(WorkDay = wday(Time) %in% 2:6 & !Holiday)) %>% 
  autoplot(tsibbledata::vic_elec %>%
             filter(
               yearmonth(Time) == yearmonth("2012 Mar")
             ) )
#> Warning: Problem with `mutate()` input `fasster`.
#> x 'poly' is deprecated.
#> Use 'trend' instead.
#> See help("Deprecated")
#> ℹ Input `fasster` is `(function (object, ...) ...`.
#> Warning: 'poly' is deprecated.
#> Use 'trend' instead.
#> See help("Deprecated")
#> Warning: Problem with `mutate()` input `fasster`.
#> x 'trig' is deprecated.
#> Use 'fourier' instead.
#> See help("Deprecated")
#> ℹ Input `fasster` is `(function (object, ...) ...`.
#> Warning: 'trig' is deprecated.
#> Use 'fourier' instead.
#> See help("Deprecated")

elec_update <- tsibbledata::vic_elec %>%
  filter(
    yearmonth(Time) == yearmonth("2012 Apr")
  ) %>% 
  head(2) %>%
  mutate(WorkDay = wday(Time) %in% 2:6 & !Holiday)

elec_fit_upd <- elec_fit %>% stream(elec_update)
#> Warning: Problem with `mutate()` input `fasster`.
#> x 'poly' is deprecated.
#> Use 'trend' instead.
#> See help("Deprecated")
#> ℹ Input `fasster` is `(function (object, ...) ...`.
#> Warning: 'poly' is deprecated.
#> Use 'trend' instead.
#> See help("Deprecated")
#> Warning: Problem with `mutate()` input `fasster`.
#> x 'trig' is deprecated.
#> Use 'fourier' instead.
#> See help("Deprecated")
#> ℹ Input `fasster` is `(function (object, ...) ...`.
#> Warning: 'trig' is deprecated.
#> Use 'fourier' instead.
#> See help("Deprecated")
elec_fit_upd
#> # A mable: 1 x 1
#>     fasster
#>     <model>
#> 1 <FASSTER>

Created on 2020-08-18 by the reprex package (v0.3.0)