Multivariate time series modeling is a long-standing research topic that attracts scholars from many fields, including economics, finance, and traffic. As the number of models grows, a unified framework for implementing and evaluating them becomes increasingly important.
MvTS is a systematic, comprehensive, extensible, and easy-to-use multivariate time series forecasting library. It divides the whole process into four parts: data processing, model implementation, training, and performance evaluation. Four modules, DataLoader, Model, Executor, and Evaluator, handle these parts respectively, while one Config loads the parameters and one Pipeline passes messages between the modules.
We believe MvTS will contribute to research on multivariate time series forecasting.
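The division of labor described above can be pictured with a toy sketch. Every class, method, and parameter name here is hypothetical and chosen only for illustration; it is not MvTS's actual API, and the naive last-value model merely stands in for a real forecaster.

```python
from dataclasses import dataclass, field


@dataclass
class Config:
    """Hypothetical parameter container (not the MvTS Config class)."""
    params: dict = field(default_factory=dict)


class DataLoader:
    """Hypothetical data stage: builds a toy series and splits it."""
    def __init__(self, config):
        self.horizon = config.params["horizon"]

    def load(self):
        series = [float(i) for i in range(10)]  # toy data: 0.0 ... 9.0
        split = len(series) - self.horizon
        return series[:split], series[split:]


class Model:
    """Hypothetical model: repeats the last observed value."""
    def fit(self, train):
        self.last = train[-1]

    def predict(self, steps):
        return [self.last] * steps


class Executor:
    """Hypothetical training stage."""
    def train(self, model, train):
        model.fit(train)
        return model


class Evaluator:
    """Hypothetical evaluation stage: mean absolute error."""
    def evaluate(self, pred, true):
        return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def pipeline(config):
    """Pass each stage's output to the next one."""
    train, test = DataLoader(config).load()
    model = Executor().train(Model(), train)
    return Evaluator().evaluate(model.predict(len(test)), test)


mae = pipeline(Config({"horizon": 2}))
print(mae)
```

The point of the sketch is only that each stage sees the output of the previous one, so a model or metric can be swapped without touching the other stages.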
It is generally accepted that multivariate time series forecasting problems can be divided into single-step forecasting and multi-step forecasting.
The time series with
The goal of single-step forecasting is to obtain the future value
The goal of multi-step forecasting is to predict a sequence of future values
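As a rough illustration of the two settings, the sketch below builds training samples from a series of shape `(time steps, nodes)`. It assumes a common convention that may differ from MvTS's own definitions: the single-step target is the one value exactly `horizon` steps after the input window, and the multi-step target is the next `horizon` values. The function name `make_windows` and its parameters are hypothetical.

```python
import numpy as np


def make_windows(series, input_len, horizon, multi_step):
    """Slice a (T, N) series into input windows and forecasting targets.

    single-step: target is the value `horizon` steps after the window, shape (S, N)
    multi-step:  target is the next `horizon` values, shape (S, horizon, N)
    """
    num_steps = series.shape[0]
    inputs, targets = [], []
    for start in range(num_steps - input_len - horizon + 1):
        end = start + input_len
        inputs.append(series[start:end])
        if multi_step:
            targets.append(series[end:end + horizon])
        else:
            targets.append(series[end + horizon - 1])
    return np.stack(inputs), np.stack(targets)


# Toy series with 10 time steps and 2 nodes.
series = np.arange(20, dtype=float).reshape(10, 2)
x_single, y_single = make_windows(series, input_len=4, horizon=3, multi_step=False)
x_multi, y_multi = make_windows(series, input_len=4, horizon=3, multi_step=True)
print(x_single.shape, y_single.shape, y_multi.shape)
```

Under this convention the single-step target coincides with the last row of the corresponding multi-step target.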
Put the h5 file (described in Dataset) into ./MTS-Library/mvts/raw_data/ and execute the command that corresponds to your prediction task.
cd ./MTS-Library
python run_model.py --task single_step --model ××× --dataset ×××
cd ./MTS-Library
python run_model.py --task multi_step --model ××× --dataset ×××
MvTS records the raw data, adjacency matrix, and time information of each dataset in a single h5 file. To read these three types of information, users can run the following snippets.
import h5py
import numpy as np
file = "./raw_data/" + filename  # filename: name of the dataset's h5 file
f = h5py.File(file, "r")
RawData = np.array(f["raw_data"])
file = "./raw_data/" + filename
f = h5py.File(file, "r")
adj = np.array(f["adjacency_matrix"])  # For datasets without an adjacency matrix, MvTS stores an all-zero matrix, which carries no information.
import pandas as pd
file = "./raw_data/" + filename
f = h5py.File(file, "r")
time = np.array(f["time"])
t = []
for i in range(time.shape[0]):
    t.append(time[i].decode())  # Each timestamp was encoded with str.encode() before being stored in the h5 file.
time = np.stack(t, axis=0)
time = pd.to_datetime(time)
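The three reads above can be combined into one helper. This is a minimal sketch, not part of MvTS: the function name `load_mvts_h5` and the `has_graph` flag are hypothetical, while the dataset keys `raw_data`, `adjacency_matrix`, and `time` are the ones used in the snippets above.

```python
import h5py
import numpy as np
import pandas as pd


def load_mvts_h5(path):
    """Read raw data, adjacency matrix, and timestamps from one h5 file."""
    # The context manager closes the file even if a key is missing.
    with h5py.File(path, "r") as f:
        raw_data = np.array(f["raw_data"])
        adj = np.array(f["adjacency_matrix"])
        time_bytes = np.array(f["time"])
    # Timestamps are stored as bytes, so decode them before parsing.
    time = pd.to_datetime([t.decode() for t in time_bytes])
    # An all-zero adjacency matrix means no graph information is available.
    has_graph = bool(np.any(adj != 0))
    return raw_data, adj, time, has_graph
```

A call such as `raw_data, adj, time, has_graph = load_mvts_h5("./raw_data/" + filename)` then returns everything at once, and `has_graph` tells a caller whether the adjacency matrix is worth using.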
The following table lists the datasets supported by MvTS. Users can download the datasets from Google Drive or Baidu Yun.
Click here to get the original, unprocessed data.
Datasets | Nodes | TimeSteps | Granularity | StartTime |
---|---|---|---|---|
AirQualityUCI (link) | 12 | 9357 | 1hour | 03/10/2004 |
covid-19 (paper) | 284 | 816 | 1day | 01/22/2020 |
ECG (paper) | 140 | 5000 | - | - |
electricity (paper) | 321 | 26304 | 1hour | 07/01/2016 |
ETTh1 (paper) | 7 | 17420 | 1hour | 07/01/2016 |
ETTh2 (paper) | 7 | 17420 | 1hour | 07/01/2016 |
ETTm1 (paper) | 7 | 69680 | 15min | 07/01/2016 |
ETTm2 (paper) | 7 | 69680 | 15min | 07/01/2016 |
exchange-rate (paper) | 8 | 7588 | 1day | 01/01/1990 |
illness (paper) | 7 | 966 | 7day | 01/01/2002 |
metr-la (paper) | 207 | 34272 | 5min | 03/01/2012 |
nyc-bike (paper) | 250 | 4368 | 30min | 04/01/2016 |
nyc-taxi (paper) | 266 | 4368 | 30min | 04/01/2016 |
pems-bay (paper) | 325 | 52116 | 5min | 01/01/2017 |
PEMS03 (paper) | 358 | 26208 | 5min | 05/01/2012 |
PEMS04 (paper) | 307 | 16992 | 5min | 07/01/2017 |
PEMS07 (paper) | 883 | 28224 | 5min | 05/01/2017 |
PEMS08 (paper) | 170 | 17856 | 5min | 07/01/2016 |
solar (paper) | 137 | 52560 | 10min | 01/01/2006 |
traffic (paper) | 862 | 17544 | 1hour | 07/01/2016 |
weather (paper) | 21 | 52696 | 10min | 01/01/2020 |
WTH (paper) | 12 | 35064 | 1hour | 01/01/2010 |
wind (paper) | 28 | 10957 | 1day | 01/01/1986 |
To add a new dataset to MvTS, organize it in the same format and store it in an h5 file. The time information can be processed as follows. (For convenience, we use a commonly used csv file as an example.)
import numpy as np
import pandas as pd

def handle_csv(filename, outname):
    df = pd.read_csv(filename)
    columns = df.columns
    # data = df[columns[1:]]
    # data = np.array(data)
    time = df[columns[0]]  # the first column holds the timestamps
    time = pd.to_datetime(time)
    mid = np.array(time.values)
    res = []
    for i in range(mid.shape[0]):
        res.append(str(mid[i]).encode())  # encode each timestamp as bytes for h5 storage
    time = np.stack(res, axis=0)  ### result
    # print(time.shape)
    # print(time)
    return time  # outname is not used in this snippet
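Once the timestamps are encoded, the whole dataset can be written in the layout that the reading snippets expect. The following is a sketch under assumptions rather than the MvTS conversion script: the function name `save_mvts_h5` is hypothetical, the raw data is assumed to have shape `(time steps, nodes)`, and the all-zero default follows the convention described for datasets without an adjacency matrix.

```python
import h5py
import numpy as np
import pandas as pd


def save_mvts_h5(outname, raw_data, timestamps, adjacency=None):
    """Write raw data, adjacency matrix, and encoded timestamps to an h5 file."""
    raw_data = np.asarray(raw_data, dtype=float)
    num_nodes = raw_data.shape[1]  # assumption: axis 1 indexes the nodes
    if adjacency is None:
        # No graph available: store an all-zero matrix as a placeholder.
        adjacency = np.zeros((num_nodes, num_nodes))
    # Encode each timestamp as bytes, mirroring handle_csv above.
    stamps = pd.to_datetime(timestamps).values
    time = np.array([str(t).encode() for t in stamps])
    with h5py.File(outname, "w") as f:
        f.create_dataset("raw_data", data=raw_data)
        f.create_dataset("adjacency_matrix", data=np.asarray(adjacency))
        f.create_dataset("time", data=time)
```

For example, `save_mvts_h5("toy.h5", np.random.rand(3, 2), ["2018-01-01 00:00", "2018-01-01 00:15", "2018-01-01 00:30"])` produces a file with the three keys `raw_data`, `adjacency_matrix`, and `time`.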
If the time information is not directly available, users can generate it as follows.
# parameters
"timeList_gene": {
    "unit": "m",
    "origin": "2018-01-01",
    "time_step": 15
},
"//this is notes for timeList gene": {
    "unit": "the time unit. d: daily; h: hour; m: minute; s: second; ms, us, ns...",
    "origin": "the base-time for gene",
    "time_step": "the length of each time hop"
}
def geneTimeList(self):
    time = []
    for i in range(self.seq_len):
        time.append(i * self.timeList_gene["time_step"])  # parameters
    res = pd.to_datetime(
        time,
        unit=self.timeList_gene["unit"],
        origin=self.timeList_gene["origin"]
    )
    ################time##################
    time = res
    mid = np.array(time.values)
    res = []
    for i in range(mid.shape[0]):
        res.append(str(mid[i]).encode())
    time = np.stack(res, axis=0)  ### result
    #### result
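Because `geneTimeList` is a method that depends on `self`, a standalone version can be handy for trying the parameters out. The sketch below follows the same logic with plain arguments; the function name `generate_time_list` and its signature are hypothetical, and the defaults simply repeat the example configuration above.

```python
import numpy as np
import pandas as pd


def generate_time_list(seq_len, unit="m", origin="2018-01-01", time_step=15):
    """Generate seq_len evenly spaced timestamps, encoded as bytes."""
    # Offsets from the origin, measured in `unit`: 0, time_step, 2 * time_step, ...
    offsets = [i * time_step for i in range(seq_len)]
    stamps = pd.to_datetime(offsets, unit=unit, origin=origin)
    # Encode each timestamp as bytes, as done before storing it in the h5 file.
    return np.array([str(t).encode() for t in stamps.values])


time = generate_time_list(4)
print(time)
```

With the default parameters this yields four timestamps, 15 minutes apart, starting at 2018-01-01 00:00.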
Currently, MvTS supports the following models, and we will continue to add new ones.
- LSTNET (Lai, Guokun, 2017) (paper)
- TPA-LSTM (Shih, Shun-Yao, 2018) (paper)
- StemGNN (Cao, Defu, 2021) (paper)
- MTGNN (Wu, Zonghan, 2020) (paper)
- ESG (Ye, Junchen, 2022) (paper)
- DARNN (Qin, Yao, 2017) (paper)
- AutoFormer (Wu, Haixu, 2021) (paper)
- BHTARIMA (Shi, Qiquan, 2020) (paper)
- AdaRNN (Du, Yuntao, 2021) (paper)
- DCRNN (Li, Yaguang, 2017) (paper)
- STGCN (Bing Yu, 2017) (paper)
- GWNET (Zonghan Wu, 2019) (paper)
- STSGCN (Chao Song, 2020) (paper)
- STFGNN (Li, Mengzhang, 2020) (paper)
- DGCRN (Li, Fuxian, 2021) (paper)
- StemGNN (Cao, Defu, 2021) (paper)
- GTS (Shang, Chao, 2021) (paper)
- CCRNN (Ye, Junchen, 2020) (paper)
- MTGNN (Wu, Zonghan, 2020) (paper)
- AGCRN (Bai, Lei, 2020) (paper)
- ESG (Ye, Junchen, 2022) (paper)
- STGNN (Wang, Xiaoyang, 2020) (paper)
- STODE (Zheng Fang, 2021) (paper)
- ASTGCN (Shengnan Guo, 2019) (paper)
- GMAN (Zheng, Chuanpan, 2019) (paper)
- Informer (Zhou, Haoyi, 2020) (paper)
- AutoFormer (Wu, Haixu, 2021) (paper)
- SAAM (Moreno-Pino, 2021) (paper)
- SCINET (Liu, Minhao, 2021) (paper)
- NBeats (Oreshkin, Boris N., 2019) (paper)
- FC-GAGA (Oreshkin, Boris N., 2020) (paper)
- ST-Norm (Jinliang Deng, 2021) (paper)
- DeepAR (Salinas, 2017) (paper)
While building the MvTS framework, we referred to several classic code libraries. We have listed the referenced code libraries below, and we greatly appreciate the help of these libraries and their designers.
MvTS will continue to learn from excellent code libraries to improve its design. Once again, we thank the many scholars whose work has helped us.
We would be honored by your interest in MvTS, and contributions of any kind are welcome, for example, new datasets, new models, and bug reports.
We have created a guide to help you understand the workflow of MvTS and develop the library. You can read it here.
For any questions, you are welcome to contact us via lwm568@buaa.edu.cn