This is the official PyTorch implementation of FPPformer in the following paper: [Take an Irregular Route: Enhance the Decoder of Time-Series Forecasting Transformer] (accepted by IEEE Internet of Things Journal).
It has two versions. The first, default version uses the channel-independent multivariate forecasting formulation. The second version, requested by one of the reviewers, adds cross-variable modules. The two versions are distinguished by the argument '--Cross' in main.py.
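As a rough illustration of what the channel-independent formulation means (a minimal sketch, not the repository's actual code), each of the C variables is treated as a separate univariate series, e.g. by folding the channel dimension into the batch:

```python
import torch

def channel_independent_forward(model, x):
    """Sketch of channel-independent multivariate forecasting.

    x: tensor of shape (batch, input_len, n_vars).
    Each variable is forecast independently by the same model, so the
    channel dimension is folded into the batch dimension.
    """
    b, l, c = x.shape
    x = x.permute(0, 2, 1).reshape(b * c, l, 1)   # (b * c, input_len, 1)
    y = model(x)                                  # (b * c, pred_len, 1)
    return y.reshape(b, c, -1).permute(0, 2, 1)   # (batch, pred_len, n_vars)
```

The cross-variable version instead adds modules that let the variables attend to each other; see main.py for how '--Cross' switches between the two.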
Starting from this work, we also provide the response letters written during the revision. If you have any questions, you may first refer to the response letters for possible answers.
The overview of our proposed FPPformer is illustrated in Figure 1. Its major enhancements over the vanilla TSFT (time-series forecasting Transformer) concentrate on addressing the aforementioned two problems of the decoder.
Figure 1. An overview of FPPformer's hierarchical architecture with a two-stage encoder and a two-stage decoder. Different from the vanilla one, the encoder has a bottom-up structure while the decoder has a top-down structure. Note that the direction of the propagation flow in the decoder is drawn opposite to the vanilla one to highlight the top-down structure. 'DM' in the encoder stages stands for 'Diagonal-Masked'.
- Python 3.8.8
- matplotlib == 3.3.4
- numpy == 1.20.1
- pandas == 1.2.4
- scipy == 1.9.0
- scikit_learn == 0.24.1
- torch == 1.11.0
Dependencies can be installed using the following command:
pip install -r requirements.txt
The ETT, ECL, Traffic and weather datasets were acquired at: here. The Solar dataset was acquired at: Solar. The M4 dataset was acquired at: M4.
After you acquire the raw data of all datasets, please place them in the corresponding folders at ./FPPformer/data. We move the folders ./ETT-data, ./electricity, ./traffic and ./weather of here (the folder tree at the link is shown below) into the folder ./data and rename them to ./ETT, ./ECL, ./Traffic and ./weather respectively. We rename the ECL/Traffic file from electricity.csv/traffic.csv to ECL.csv/Traffic.csv and rename its last variable from OT back to the original MT_321/Sensor_861 respectively.
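If you prefer to script the file/column renaming described above, a few lines of pandas suffice. This is a minimal sketch under the assumption that the folders have already been moved into ./data and renamed as described:

```python
import pandas as pd

# ECL: electricity.csv -> ECL.csv, last variable 'OT' -> original 'MT_321'.
ecl = pd.read_csv('./data/ECL/electricity.csv')
ecl.rename(columns={'OT': 'MT_321'}).to_csv('./data/ECL/ECL.csv', index=False)

# Traffic: traffic.csv -> Traffic.csv, last variable 'OT' -> original 'Sensor_861'.
traffic = pd.read_csv('./data/Traffic/traffic.csv')
traffic.rename(columns={'OT': 'Sensor_861'}).to_csv('./data/Traffic/Traffic.csv', index=False)
```

Afterwards the original electricity.csv and traffic.csv files can be deleted.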
The folder tree in https://drive.google.com/drive/folders/1ZOYpTUa82_jCcxIdTmyr0LXQfvaM9vIy?usp=sharing:
|-autoformer
| |-ETT-data
| | |-ETTh1.csv
| | |-ETTh2.csv
| | |-ETTm1.csv
| | |-ETTm2.csv
| |
| |-electricity
| | |-electricity.csv
| |
| |-traffic
| | |-traffic.csv
| |
| |-weather
| | |-weather.csv
We move the Solar data in the folder ./financial of here (the folder tree at the link is shown below) into the folder ./data and rename it to ./Solar.
The folder tree in https://drive.google.com/drive/folders/1Gv1MXjLo5bLGep4bsqDyaNMI2oQC9GH2?usp=sharing:
|-dataset
| |-financial
| | |-solar_AL.txt
As for the M4 dataset, we place the folders ./Dataset and ./Point Forecasts of M4 (the folder tree at the link is shown below) into the folder ./data/M4. Moreover, we unzip the file ./Point Forecasts/submission-Naive2.rar in the same directory.
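This unzipping step can also be scripted. A minimal sketch, assuming the optional rarfile package (plus an unrar backend on the PATH) is installed:

```python
import rarfile  # pip install rarfile; requires an unrar backend

# Extract submission-Naive2.csv next to the archive.
with rarfile.RarFile('./data/M4/Point Forecasts/submission-Naive2.rar') as rf:
    rf.extractall('./data/M4/Point Forecasts')
```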
The folder tree in https://drive.google.com/drive/folders/1Gv1MXjLo5bLGep4bsqDyaNMI2oQC9GH2?usp=sharing:
|-M4-methods
| |-Dataset
| | |-Test
| | | |-Daily-test.csv
| | | |-Hourly-test.csv
| | | |-Monthly-test.csv
| | | |-Quarterly-test.csv
| | | |-Weekly-test.csv
| | | |-Yearly-test.csv
| | |-Train
| | | |-Daily-train.csv
| | | |-Hourly-train.csv
| | | |-Monthly-train.csv
| | | |-Quarterly-train.csv
| | | |-Weekly-train.csv
| | | |-Yearly-train.csv
| | |-M4-info.csv
| |-Point Forecasts
| | |-submission-Naive2.rar
You will then obtain the following folder tree:
|-data
| |-ECL
| | |-ECL.csv
| |
| |-ETT
| | |-ETTh1.csv
| | |-ETTh2.csv
| | |-ETTm1.csv
| | |-ETTm2.csv
| |
| |-M4
| | |-Dataset
| | | |-Test
| | | | |-Daily-test.csv
| | | | |-Hourly-test.csv
| | | | |-Monthly-test.csv
| | | | |-Quarterly-test.csv
| | | | |-Weekly-test.csv
| | | | |-Yearly-test.csv
| | | |-Train
| | | | |-Daily-train.csv
| | | | |-Hourly-train.csv
| | | | |-Monthly-train.csv
| | | | |-Quarterly-train.csv
| | | | |-Weekly-train.csv
| | | | |-Yearly-train.csv
| | | |-M4-info.csv
| | |-Point Forecasts
| | | |-submission-Naive2.csv
| |
| |-Solar
| | |-solar_AL.txt
| |
| |-Traffic
| | |-Traffic.csv
| |
| |-weather
| | |-weather.csv
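Before training, you may want to verify the layout. The following optional snippet (not part of the repository) checks that the expected files from the tree above are present:

```python
from pathlib import Path

# Files expected after all datasets are placed and renamed as described above.
EXPECTED = [
    'data/ECL/ECL.csv',
    'data/ETT/ETTh1.csv', 'data/ETT/ETTh2.csv',
    'data/ETT/ETTm1.csv', 'data/ETT/ETTm2.csv',
    'data/M4/Dataset/M4-info.csv',
    'data/M4/Point Forecasts/submission-Naive2.csv',
    'data/Solar/solar_AL.txt',
    'data/Traffic/Traffic.csv',
    'data/weather/weather.csv',
]

missing = [p for p in EXPECTED if not Path(p).exists()]
print('All datasets in place.' if not missing else f'Missing files: {missing}')
```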
We select six typical deep time-series forecasting models, i.e., Triformer, Crossformer, Scaleformer, PatchTST, FiLM and TSMixer, as baselines in the multivariate/univariate forecasting experiments. The sources of their code are given below:
Baseline | Source Code |
---|---|
Triformer | https://github.com/razvanc92/triformer |
Crossformer | https://github.com/Thinklab-SJTU/Crossformer |
Scaleformer | https://github.com/BorealisAI/scaleformer |
PatchTST | https://github.com/yuqinie98/PatchTST |
FiLM | https://github.com/tianzhou2011/FiLM |
TSMixer | https://github.com/google-research/google-research/tree/master/tsmixer |
Moreover, the default experiment settings/parameters of the aforementioned six baselines are given below:
| Baselines | Settings/Parameters name | Descriptions | Default mechanisms/values |
|---|---|---|---|
| Triformer | num_nodes | The number of nodes | 4 |
| | patch_sizes | The patch size | 4 |
| | d_model | The number of hidden dimensions | 32 |
| | mem_dim | The dimension of the memory vector | 5 |
| | e_layers | The number of encoder layers | 2 |
| | d_layers | The number of decoder layers | 1 |
| Crossformer | seg_len | Segment length (L_seg) | 6 |
| | d_model | The number of hidden dimensions | 64 |
| | d_ff | The dimension of FCN | 128 |
| | n_heads | The number of heads in the multi-head attention mechanism | 2 |
| | e_layers | The number of encoder layers | 2 |
| Scaleformer | Basic model | The basic model | FEDformer-f |
| | scales | Scales in multi-scale | [16, 8, 4, 2, 1] |
| | scale_factor | Scale factor for upsampling | 2 |
| | mode_select | The mode selection method | random |
| | modes | The number of modes | 2 |
| | L | Ignore level | 3 |
| PatchTST | patch_len | Patch length | 16 |
| | stride | The stride length | 8 |
| | n_head | The number of heads in the multi-head attention mechanism | 4 |
| | d_model | The hidden feature dimension | 16 |
| | d_ff | The dimension of FCN | 128 |
| FiLM | d_model | The number of hidden dimensions | 512 |
| | d_ff | The dimension of FCN | 2048 |
| | n_heads | The number of heads in the multi-head attention mechanism | 8 |
| | e_layers | The number of encoder layers | 2 |
| | d_layers | The number of decoder layers | 1 |
| | modes1 | The number of Fourier modes to multiply | 32 |
| TSMixer | n_block | The number of blocks for the deep architecture | 2 |
| | d_model | The hidden feature dimension | 64 |
Commands for training and testing FPPformer on all datasets are given in ./scripts/Main.sh. For more parameter information, please refer to main.py.
We provide a complete command for training and testing FPPformer:
For multivariate forecasting:
python -u main.py --data <data> --features <features> --input_len <input_len> --pred_len <pred_len> --encoder_layer <encoder_layer> --patch_size <patch_size> --d_model <d_model> --Cross <Cross> --learning_rate <learning_rate> --dropout <dropout> --batch_size <batch_size> --train_epochs <train_epochs> --patience <patience> --itr <itr> --train
For univariate forecasting:
python -u main_M4.py --data <data> --freq <freq> --input_len <input_len> --pred_len <pred_len> --encoder_layer <encoder_layer> --patch_size <patch_size> --d_model <d_model> --learning_rate <learning_rate> --dropout <dropout> --batch_size <batch_size> --train_epochs <train_epochs> --patience <patience> --itr <itr> --train
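For instance, a multivariate run on ETTh1 might look like the command below. The hyper-parameter values here are purely illustrative assumptions; the exact values used in the paper are listed in ./scripts/Main.sh:
python -u main.py --data ETTh1 --features M --input_len 96 --pred_len 96 --encoder_layer 3 --patch_size 6 --d_model 16 --Cross 0 --learning_rate 0.0001 --dropout 0.1 --batch_size 16 --train_epochs 20 --patience 5 --itr 1 --train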
Here we provide a more detailed and complete description of the command parameters for training and testing the model:
Parameter name | Description of parameter |
---|---|
data | The dataset name |
root_path | The root path of the data file |
data_path | The data file name |
features | The forecasting task. This can be set to M or S (M: multivariate forecasting, S: univariate forecasting) |
target | Target feature in S task |
freq | Sampling frequency for M4 sub-datasets |
checkpoints | Location of model checkpoints |
input_len | Input sequence length |
pred_len | Prediction sequence length |
enc_in | Input size |
dec_out | Output size |
d_model | Dimension of model |
representation | Representation dimension at the end of the intra-reconstruction phase |
dropout | Dropout |
encoder_layer | The number of encoder layers |
patch_size | The size of each patch |
Cross | Whether to use cross-variable attention |
itr | The number of experiment repetitions |
train_epochs | Train epochs of the second stage |
batch_size | The batch size of training input data in the second stage |
patience | Early stopping patience |
learning_rate | Optimizer learning rate |
The experiment parameters for each dataset are formatted in the Main.sh file in the directory ./scripts/. You can refer to these parameters for experiments, and you can also adjust them to obtain better MSE and MAE results or to draw better prediction figures. We provide the commands for obtaining the results of FPPformer-Cross in ./scripts/Cross.sh, those of FPPformer with longer input sequence lengths in ./scripts/LongInput.sh, and those of FPPformer with different numbers of encoder layers in ./scripts/ParaSen.sh.
Figure 2. Multivariate forecasting results
Figure 3. Univariate forecasting results
Moreover, we present the full multivariate forecasting results with long input sequence lengths in Figure 4, those of the ablation study in Figure 5 and those of the parameter sensitivity analysis in Figure 6.
Figure 4. Multivariate forecasting results with long input lengths
Figure 5. Ablation results with the prediction length of 720
Figure 6. Results of parameter sensitivity on stage numbers
If you have any questions, feel free to contact Li Shen via email (shenli@buaa.edu.cn) or GitHub issues. Pull requests are highly welcome!