MSSE definition inconsistent with Hyndman and Koehler (2006)
nickto opened this issue · 6 comments
Summary
It seems that the implementation of `msse()` may be inconsistent with other widespread definitions of the RMSSE, such as those by Hyndman and Koehler (2006) or Makridakis et al. (2022). In `msse()`, the MSE of the forecast is scaled by the MSE of a naive out-of-sample forecast, whereas these authors use the error of an in-sample naive forecast in the denominator.
Would it not be more consistent with Hyndman and Koehler (2006) to call the current implementation of `msse()` the "relative mean squared error" (RelMSE)?
Scaled errors by Hyndman and Koehler (2006)
For the mean absolute scaled error (MASE), Hyndman and Koehler (2006) propose [...] scaling the error based on the in-sample MAE from the naive (random walk) forecast method.
They suggest applying similar logic to the RMSSE:
If the RMSSE is used, it may be preferable to use the in-sample RMSE from the naive method in the denominator of $q_t$.
Scaled errors by Makridakis et al. (2002)
Makridakis et al. (2022) also seem to define the RMSSE in a similar way:
$$
RMSSE = \sqrt{
\frac{
\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2
}{
\frac{1}{n-1} \sum_{t=2}^{n} (y_t - y_{t-1})^2
}
},
$$
where $n$ is the number of in-sample (training) observations and $h$ is the forecast horizon.
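For concreteness, here is a minimal NumPy sketch of this formula (the function name, argument layout, and data are illustrative, not from any package):

```python
import numpy as np

def rmsse(y_train, y_test, y_hat):
    """RMSSE per the formula above: out-of-sample MSE scaled by
    the in-sample MSE of the one-step naive forecast."""
    num = np.mean((y_test - y_hat) ** 2)      # (1/h) * sum of squared forecast errors
    den = np.mean(np.diff(y_train) ** 2)      # (1/(n-1)) * sum of (y_t - y_{t-1})^2
    return np.sqrt(num / den)

y_train = np.array([10.0, 12.0, 11.0, 13.0])  # in-sample observations (n = 4)
y_test = np.array([14.0, 15.0])               # out-of-sample truth (h = 2)
y_hat = np.array([13.0, 14.0])                # model forecasts

print(rmsse(y_train, y_test, y_hat))          # sqrt(1 / 3) ≈ 0.5774
```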
Scaled errors in Hierarchical Forecast
The HierarchicalForecast package (and, I think, Olivares et al. (2023) too) defines `msse()` as

$$
MSSE = \frac{\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - \hat{y}_t)^2}{\frac{1}{h} \sum_{t=n+1}^{n+h} (y_t - y_n)^2},
$$

where $y_n$ is the last in-sample observation, repeated over the forecast horizon as a naive benchmark.
However, Hyndman and Koehler (2006) also define relative measures and provide an example with the relative MAE (RelMAE):

$$
RelMAE = \frac{MAE}{MAE_b},
$$

where $MAE_b$ denotes the MAE from the benchmark method.
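As an illustration, a relative measure of this kind is just a ratio of two error metrics on the same test data (names and numbers below are made up for the example):

```python
import numpy as np

def rel_mae(y, y_hat, y_hat_bench):
    """Relative MAE: model MAE divided by the benchmark's MAE,
    both computed on the same (out-of-sample) observations."""
    mae = np.mean(np.abs(y - y_hat))
    mae_b = np.mean(np.abs(y - y_hat_bench))
    return mae / mae_b

y = np.array([1.0, 2.0, 3.0, 4.0])            # true values
y_hat = np.array([1.0, 2.0, 3.0, 6.0])        # model forecast, MAE = 0.5
y_hat_bench = np.array([2.0, 2.0, 2.0, 2.0])  # benchmark forecast, MAE = 1.0

print(rel_mae(y, y_hat, y_hat_bench))         # 0.5 -> model halves the benchmark's error
```

Values below 1 mean the model beats the benchmark; values above 1 mean it does worse.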
Hence, would it not be more correct to call the current implementation of `msse()` a "relative MSE" rather than a "mean squared scaled error"? Or am I missing something?
Hey @nickto,
Thanks for the comment.
I think the RMSSE equation is consistent with HierarchicalForecast's definition of MSSE, because the MSSE is simultaneously both a scaled and a relative metric.
Hi @kdgutier ,
I think the inconsistency is only in the period on which the scaling factor is computed.
Hyndman and Koehler (2006) and Makridakis et al. (2022) compute the scaling factor on the training (in-sample) data only.
However, in HierarchicalForecast the scaling factor is computed on the out-of-sample data:
y_naive = np.repeat(y_train[:, [-1]], horizon, axis=1)  # naive forecast: last training value repeated over the whole horizon
norm = mse(y=y, y_hat=y_naive)                          # compared against the true out-of-sample values
If I understand correctly, this metric would be called the "relative MSE" according to Hyndman and Koehler (2006): the ratio between our model's MSE and some benchmark's MSE (here, the naive forecast that repeats the last observed value).
In order to make msse() more consistent with Hyndman and Koehler (2006) and Makridakis et al. (2022), I guess it should be something along the lines of
y_naive = y_train[:, :-1] # make 1 step ahead naive forecast of the in-sample values
y_naive_true = y_train[:, 1:] # in-sample true values
norm = mse(y=y_naive_true, y_hat=y_naive) # compare to true in-sample values
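To make the difference between the two denominators concrete, here is a self-contained toy comparison; the `mse` helper is written inline as a stand-in for the package's, and the data are made up:

```python
import numpy as np

def mse(y, y_hat):
    # simple stand-in for a per-series MSE helper: mean over the time axis
    return np.mean((y - y_hat) ** 2, axis=1)

y_train = np.array([[10.0, 12.0, 11.0, 13.0]])  # one series, 4 in-sample points
y = np.array([[14.0, 15.0]])                    # true out-of-sample values
horizon = y.shape[1]

# Current implementation: benchmark is the out-of-sample naive forecast (RelMSE-style).
y_naive_oos = np.repeat(y_train[:, [-1]], horizon, axis=1)  # [[13., 13.]]
norm_oos = mse(y=y, y_hat=y_naive_oos)                      # [2.5]

# Proposed: benchmark is the in-sample one-step naive error (scaled-error-style).
norm_ins = mse(y=y_train[:, 1:], y_hat=y_train[:, :-1])     # [3.0]

print(norm_oos, norm_ins)  # [2.5] [3.]
```

The two normalizations generally differ, so the resulting "MSSE" values differ too; only the second matches the in-sample scaling of Hyndman and Koehler (2006).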
If my reasoning is wrong (which can very well be), I would be thankful if you could point out what I have overlooked or misunderstood.
@nickto
Would you suggest that going beyond a forecast of
Is that the difference?
Thanks for the comment,
I will incorporate this into the library, renaming MSSE to RelMSE.