The attached `requirements.txt` file lists the required packages:
- Python 3.9.16
- torch==2.0.0
- scikit_learn==0.24.2
- pywavelets==1.4.1
- pandas
- scipy
- statsmodels
- matplotlib
- Bottleneck
The dependencies can be installed with this single-line command:
```bash
pip install -r requirements.txt
```
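To avoid conflicts with other projects, the pinned dependencies can optionally be installed into an isolated virtual environment. A minimal sketch, assuming Python 3.9 is available on the system (the environment name `coinception-env` is illustrative):

```bash
# Create and activate an isolated environment (name is illustrative)
python3.9 -m venv coinception-env
source coinception-env/bin/activate

# Install the pinned dependencies from the repository root
pip install -r requirements.txt
```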
The datasets are all publicly available online. Put them into the `data/` folder, created as follows:

```bash
cd ..
mkdir data/
cd data
```
- 128 UCR datasets: After downloading and unzipping the compressed file, rename the folder to `UCR`.
- 30 UEA datasets: After downloading and unzipping the compressed file, rename the folder to `UEA`.
- 3 ETT datasets: Download the 3 files `ETTh1.csv`, `ETTh2.csv` and `ETTm1.csv`.
- Electricity dataset: After downloading and unzipping the compressed file, run the preprocessing script at `CoInception/preprocessing/preprocess_electricity.py`; the result is placed at `../data/electricity.csv`.
- Yahoo dataset: First register to use the dataset, then download and unzip the compressed file and run the preprocessing script at `CoInception/preprocessing/preprocess_yahoo.py`; the result is placed at `../data/yahoo.pkl`.
- KPI dataset: After downloading and unzipping the compressed file, run the preprocessing script at `CoInception/preprocessing/preprocess_kpi.py`; the result is placed at `../data/kpi.pkl`.

(A sketch of the preprocessing commands and the resulting `data/` layout is shown below.)
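The following sketch summarizes the preprocessing step and the expected `data/` layout, assuming the commands are run from the `CoInception/` repository root and each script writes its output into `../data/` as described above (check each script for any extra arguments before running):

```bash
# Run the preprocessing scripts (paths as given above); each is
# assumed to write its output into ../data/
python preprocessing/preprocess_electricity.py
python preprocessing/preprocess_yahoo.py
python preprocessing/preprocess_kpi.py

# Expected layout of ../data/ after all datasets are prepared:
# data/
# ├── UCR/              # 128 UCR datasets
# ├── UEA/              # 30 UEA datasets
# ├── ETTh1.csv
# ├── ETTh2.csv
# ├── ETTm1.csv
# ├── electricity.csv
# ├── yahoo.pkl
# └── kpi.pkl
```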
Run this one-line command for both training and evaluation:

```bash
python train.py <dataset_name> <run_name> --loader <loader> --batch-size <batch_size> --repr-dims <repr_dims> --gpu <gpu> --eval --save_ckpt
```

Example:

```bash
python -u train.py Chinatown UCR --loader UCR --batch-size 8 --repr-dims 320 --max-threads 8 --seed 42 --eval
```
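To reproduce several runs in sequence, the same command can be wrapped in a small shell loop. A minimal sketch (the dataset names besides `Chinatown` are ordinary UCR archive names used only for illustration):

```bash
# Loop over a few UCR datasets with the same hyperparameters;
# dataset names other than Chinatown are illustrative examples
for ds in Chinatown Coffee GunPoint; do
    python -u train.py "$ds" UCR --loader UCR --batch-size 8 --repr-dims 320 --seed 42 --eval
done
```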
Detailed descriptions of the arguments are as follows:
Parameter name | Description |
---|---|
`dataset_name` (required) | The dataset name |
`run_name` (required) | The folder name used to save the model, outputs, and evaluation metrics. This can be set to any word |
`loader` | The data loader used to load the experimental data. This can be set to `UCR`, `UEA`, `forecast_csv`, `forecast_csv_univar`, `anomaly`, or `anomaly_coldstart` |
`batch_size` | The batch size (defaults to 8) |
`repr_dims` | The representation dimensions (defaults to 320) |
`gpu` | The GPU number used for training and inference (defaults to 0) |
`eval` | Whether to perform evaluation after training |
`save_ckpt` | Whether to save a checkpoint (default: False) |

(For descriptions of more arguments, run `python train.py -h`.)
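For the other loaders listed above, the invocation follows the same pattern. These are hedged sketches only (the run names are arbitrary, and the exact dataset names each loader accepts should be confirmed with `python train.py -h` or the loader code):

```bash
# Forecasting on an ETT dataset (run name "forecast_etth1" is arbitrary)
python -u train.py ETTh1 forecast_etth1 --loader forecast_csv --batch-size 8 --repr-dims 320 --eval

# Anomaly detection on the preprocessed Yahoo dataset (run name is arbitrary)
python -u train.py yahoo anomaly_yahoo --loader anomaly --batch-size 8 --repr-dims 320 --eval
```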
Scripts: The scripts for reproduction are provided in the `scripts/` folder.
This codebase is partially inherited from the repositories below; we want to express our thanks to their authors: