Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting (NeurIPS 2019)
This implementation is able to match the results reported in the paper on the synthetic dataset, as shown in the table below.
The synthetic dataset was constructed as shown below.
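As a rough illustration of how such a benchmark can be generated, here is a minimal sketch of a sinusoidal-signal generator with per-series random amplitude plus Gaussian noise. Note this is an assumed placeholder construction, not the paper's exact recipe: the period, amplitude range, noise level, and the conditioning-window length `t0` are all illustrative choices.

```python
import numpy as np

def make_synthetic_series(n_series=4500, t0=96, horizon=24, seed=0):
    """Sketch of a synthetic time-series benchmark (assumed parameters).

    Each series is a sine wave with a randomly drawn amplitude plus
    Gaussian observation noise, split into a conditioning window of
    length t0 and a `horizon`-step forecasting target.
    """
    rng = np.random.default_rng(seed)
    length = t0 + horizon
    t = np.arange(length)
    # Per-series random amplitude and additive noise (illustrative values).
    amp = rng.uniform(1.0, 10.0, size=(n_series, 1))
    noise = rng.normal(0.0, 0.1, size=(n_series, length))
    series = amp * np.sin(2 * np.pi * t / 24) + noise
    # Split into model input (history) and forecast target.
    x, y = series[:, :t0], series[:, t0:]
    return x, y
```

With `horizon=24`, the last target index corresponds to the timestep t = t0 + 24 - 1 referenced in the attention visualization below.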
Below is a visualization of how the attention layers attend to the input signal when predicting the last timestep, t = t0 + 24 - 1.