antoinecarme/pyaf

Investigate PyTorch-based LSTM and MLP models

antoinecarme opened this issue · 12 comments

PyAF uses Google Keras/Tensorflow to implement LSTM and MLP models.

It would be useful to be able to use Facebook PyTorch when available. PyTorch is more widely available, and more "open-source".

https://pytorch.org/

No GPU/TPU support is needed. PyAF does not need that much computing power.

Impact : Keras/Tensorflow will be used only when PyTorch is not available. This impacts only LSTM and MLP models, which are not enabled by default.
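A minimal sketch of this fallback rule, not PyAF's actual code: the helper name `pick_nn_backend` is hypothetical, and the check only tests whether the `torch` or `tensorflow` package can be located.

```python
import importlib.util


def pick_nn_backend(preferred=("torch", "tensorflow")):
    """Return the first backend module name that can be located, or None.

    find_spec only locates the package; it does not import it,
    so probing stays cheap even for large frameworks.
    """
    for module_name in preferred:
        if importlib.util.find_spec(module_name) is not None:
            return module_name
    return None


# PyTorch is tried first; Keras/Tensorflow is only the fallback.
backend = pick_nn_backend()
print("neural network backend:", backend)
```

Since LSTM and MLP models are not enabled by default, a `None` result would simply mean these models stay disabled.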

Easy to fix.

Target Release : 2022-07-14

Models need to be reproducible (parameter/optimizer choices, determinism, random seeds).
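One possible way to cover the random-seed and determinism part is sketched below. The helper `seed_everything` is a hypothetical name, not something PyAF is known to provide, and it only touches the libraries that are installed. For the Keras/Tensorflow side, `tf.random.set_seed` plays the same role.

```python
import importlib.util
import random


def seed_everything(seed):
    """Seed every random number generator that is available (assumed helper)."""
    random.seed(seed)
    if importlib.util.find_spec("numpy") is not None:
        import numpy as np
        np.random.seed(seed)
    if importlib.util.find_spec("torch") is not None:
        import torch
        torch.manual_seed(seed)
        # Make PyTorch raise an error when an operation has no
        # deterministic implementation, instead of silently varying.
        torch.use_deterministic_algorithms(True)
```

Calling `seed_everything(some_seed)` once before building and training the model should make repeated runs start from the same initial weights and the same dropout masks.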

logs/pytorch_test_ozone_exogenous_LSTMX_keras.log-INFO:pyaf.std:CYCLE_DETAIL '_Ozone_LSTM_Keras_LinearTrend_residue_Seasonal_MonthOfYear' [Seasonal_MonthOfYear]
logs/pytorch_test_ozone_exogenous_LSTMX_keras.log-INFO:pyaf.std:AUTOREG_DETAIL '_Ozone_LSTM_Keras_LinearTrend_residue_Seasonal_MonthOfYear_residue_LSTM(51)' [LSTM(51)]
logs/pytorch_test_ozone_exogenous_LSTMX_keras.log:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.0472 MAPE_Forecast=0.178 MAPE_Test=0.2098
logs/pytorch_test_ozone_exogenous_LSTMX_keras.log-INFO:pyaf.std:MODEL_SMAPE SMAPE_Fit=0.0472 SMAPE_Forecast=0.1967 SMAPE_Test=0.2505
logs/pytorch_test_ozone_exogenous_LSTMX_keras.log-INFO:pyaf.std:MODEL_MASE MASE_Fit=0.1958 MASE_Forecast=0.6944 MASE_Test=1.0433

logs/pytorch_test_ozone_exogenous_LSTMX_pytorch.log-INFO:pyaf.std:CYCLE_DETAIL '_Ozone_LSTM_PyTorch_LinearTrend_residue_bestCycle_byMAPE' [Cycle_None]
logs/pytorch_test_ozone_exogenous_LSTMX_pytorch.log-INFO:pyaf.std:AUTOREG_DETAIL '_Ozone_LSTM_PyTorch_LinearTrend_residue_bestCycle_byMAPE_residue_LSTM(51)' [LSTM(51)]
logs/pytorch_test_ozone_exogenous_LSTMX_pytorch.log:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1985 MAPE_Forecast=0.158 MAPE_Test=0.1456
logs/pytorch_test_ozone_exogenous_LSTMX_pytorch.log-INFO:pyaf.std:MODEL_SMAPE SMAPE_Fit=0.1896 SMAPE_Forecast=0.1701 SMAPE_Test=0.1438
logs/pytorch_test_ozone_exogenous_LSTMX_pytorch.log-INFO:pyaf.std:MODEL_MASE MASE_Fit=0.8683 MASE_Forecast=0.6579 MASE_Test=0.7264

logs/pytorch_test_ozone_exogenous_MLPX_keras.log-INFO:pyaf.std:CYCLE_DETAIL '_Ozone_MLP_Keras_LinearTrend_residue_bestCycle_byMAPE' [Cycle_None]
logs/pytorch_test_ozone_exogenous_MLPX_keras.log-INFO:pyaf.std:AUTOREG_DETAIL '_Ozone_MLP_Keras_LinearTrend_residue_bestCycle_byMAPE_residue_MLP(51)' [MLP(51)]
logs/pytorch_test_ozone_exogenous_MLPX_keras.log:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1645 MAPE_Forecast=0.1277 MAPE_Test=0.1618
logs/pytorch_test_ozone_exogenous_MLPX_keras.log-INFO:pyaf.std:MODEL_SMAPE SMAPE_Fit=0.1553 SMAPE_Forecast=0.1383 SMAPE_Test=0.1641
logs/pytorch_test_ozone_exogenous_MLPX_keras.log-INFO:pyaf.std:MODEL_MASE MASE_Fit=0.7073 MASE_Forecast=0.5761 MASE_Test=0.7713

logs/pytorch_test_ozone_exogenous_MLPX_pytorch.log-INFO:pyaf.std:CYCLE_DETAIL '_Ozone_MLP_PyTorch_ConstantTrend_residue_zeroCycle[0.0]' [NoCycle]
logs/pytorch_test_ozone_exogenous_MLPX_pytorch.log-INFO:pyaf.std:AUTOREG_DETAIL '_Ozone_MLP_PyTorch_ConstantTrend_residue_zeroCycle[0.0]_residue_MLP(51)' [MLP(51)]
logs/pytorch_test_ozone_exogenous_MLPX_pytorch.log:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1881 MAPE_Forecast=0.1657 MAPE_Test=0.3164
logs/pytorch_test_ozone_exogenous_MLPX_pytorch.log-INFO:pyaf.std:MODEL_SMAPE SMAPE_Fit=0.1884 SMAPE_Forecast=0.1489 SMAPE_Test=0.2636
logs/pytorch_test_ozone_exogenous_MLPX_pytorch.log-INFO:pyaf.std:MODEL_MASE MASE_Fit=0.8738 MASE_Forecast=0.6175 MASE_Test=1.5465

PyAF multithreading experiment.

System used :

https://github.com/antoinecarme/xeon-phi-data

Keras vs PyTorch on a 256-thread machine, using the same PyAF model. PyAF runs in slow mode (all models enabled, including scikit-learn, xgboost, lightgbm and either keras or pytorch, ...).

The keras python script is pinned to threads 0-45 with taskset -c 0-45, while the pytorch python script is pinned to threads 100-145.
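The same pinning can also be done from inside a Python script. This is only a sketch: it is Linux-specific, the helper names are hypothetical, and `parse_cpu_list` handles just a simple subset of the taskset list syntax (single CPUs and ranges).

```python
import os


def parse_cpu_list(spec):
    """Expand a CPU list such as "0-45" or "0,2,4-6" into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pin_current_process(spec):
    """Restrict this process to the requested CPUs it is already allowed to use."""
    allowed = os.sched_getaffinity(0)        # Linux only
    wanted = parse_cpu_list(spec) & allowed  # ignore CPUs that are not available
    if wanted:
        os.sched_setaffinity(0, wanted)
    return os.sched_getaffinity(0)
```

With this, `pin_current_process("0-45")` at the top of the keras script and `pin_current_process("100-145")` at the top of the pytorch script would keep the two runs on disjoint thread ranges.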

Reproducibility tests:

We ran the same model 5 times with each of the keras and pytorch backends. We give here the MAPE of each of the 5 runs, and the training times.

python3 tests/pytorch/test_ozone_exogenous_MLPX_keras.py

a1:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1859 MAPE_Forecast=0.1516 MAPE_Test=0.1823
a2:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1859 MAPE_Forecast=0.1516 MAPE_Test=0.1823
a3:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1859 MAPE_Forecast=0.1516 MAPE_Test=0.1823
a4:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1859 MAPE_Forecast=0.1516 MAPE_Test=0.1823
a5:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1859 MAPE_Forecast=0.1516 MAPE_Test=0.1823

Training Times :

a1:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 200.244, 
a2:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 201.606, 
a3:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 202.996, 
a4:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 203.699, 
a5:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 202.158, 

python3 tests/pytorch/test_ozone_exogenous_MLPX_pytorch.py

b1:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1993 MAPE_Forecast=0.1773 MAPE_Test=0.1195
b2:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1993 MAPE_Forecast=0.1773 MAPE_Test=0.1195
b3:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1993 MAPE_Forecast=0.1773 MAPE_Test=0.1195
b4:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1993 MAPE_Forecast=0.1773 MAPE_Test=0.1195
b5:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1993 MAPE_Forecast=0.1773 MAPE_Test=0.1195

Training Times :

b1:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 38.01, 
b2:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 35.522,
b3:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 35.258,
b4:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 35.254,
b5:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 35.355,


python3 tests/pytorch/test_ozone_exogenous_LSTMX_keras.py

d1:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.0703 MAPE_Forecast=0.1908 MAPE_Test=0.5673
d2:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.0703 MAPE_Forecast=0.1908 MAPE_Test=0.5673
d3:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.0703 MAPE_Forecast=0.1908 MAPE_Test=0.5673
d4:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.0703 MAPE_Forecast=0.1908 MAPE_Test=0.5673
d5:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.0703 MAPE_Forecast=0.1908 MAPE_Test=0.5673


Training Times :

d1:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 1275.13
d2:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 1276.12
d3:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 1284.37
d4:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 1280.74
d5:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 1268.05


python3 tests/pytorch/test_ozone_exogenous_LSTMX_pytorch.py

c1:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1768 MAPE_Forecast=0.1655 MAPE_Test=0.218
c2:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1768 MAPE_Forecast=0.1655 MAPE_Test=0.218
c3:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1768 MAPE_Forecast=0.1655 MAPE_Test=0.218
c4:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1768 MAPE_Forecast=0.1655 MAPE_Test=0.218
c5:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1768 MAPE_Forecast=0.1655 MAPE_Test=0.218

Training Times :

c1:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 82.162,
c2:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 86.103,
c3:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 81.996,
c4:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 81.996,
c5:INFO:pyaf.timing:('OPERATION_END_ELAPSED', 86.413,
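The "same MAPE on every run" check above can be automated. The sketch below parses MODEL_MAPE log lines like the ones listed and compares the runs; the function names are hypothetical, and only two of the five keras MLP lines are repeated for brevity.

```python
import re

# \b keeps SMAPE_* fields from being picked up as MAPE_* fields.
MAPE_PATTERN = re.compile(r"\bMAPE_(Fit|Forecast|Test)=([0-9.]+)")


def parse_mape(line):
    """Extract the three MAPE values from a MODEL_MAPE log line."""
    return {name: float(value) for name, value in MAPE_PATTERN.findall(line)}


def is_reproducible(lines):
    """True when every run reports exactly the same MAPE values."""
    runs = [parse_mape(line) for line in lines]
    return all(run == runs[0] for run in runs)


keras_mlp_runs = [
    "a1:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1859 MAPE_Forecast=0.1516 MAPE_Test=0.1823",
    "a2:INFO:pyaf.std:MODEL_MAPE MAPE_Fit=0.1859 MAPE_Forecast=0.1516 MAPE_Test=0.1823",
]
print(parse_mape(keras_mlp_runs[0]))
print(is_reproducible(keras_mlp_runs))
```

Comparing the logged values for equality is enough here because the logs are rounded to four decimals; a stricter test would compare the forecasts themselves.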


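TODO : PyAF+Keras is very slow. In the runs above it is roughly 5 to 6 times slower than PyAF+PyTorch for the MLP and about 15 times slower for the LSTM. The PyAF implementation of keras models can be improved.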
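To put numbers on the gap, the elapsed times listed above can be averaged per backend. This is plain arithmetic on the logged values; the dictionary layout and the `slowdown` helper are only for illustration.

```python
from statistics import mean

# OPERATION_END_ELAPSED values (seconds) copied from the five runs above.
elapsed_seconds = {
    ("MLP", "keras"): [200.244, 201.606, 202.996, 203.699, 202.158],
    ("MLP", "pytorch"): [38.01, 35.522, 35.258, 35.254, 35.355],
    ("LSTM", "keras"): [1275.13, 1276.12, 1284.37, 1280.74, 1268.05],
    ("LSTM", "pytorch"): [82.162, 86.103, 81.996, 81.996, 86.413],
}


def slowdown(model):
    """Mean keras elapsed time divided by mean pytorch elapsed time."""
    keras_time = mean(elapsed_seconds[(model, "keras")])
    pytorch_time = mean(elapsed_seconds[(model, "pytorch")])
    return keras_time / pytorch_time


for model in ("MLP", "LSTM"):
    print(f"{model}: keras is about {slowdown(model):.1f}x slower than pytorch")
```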

Model summary for pytorch :

INFO:pyaf.std:AR_MODEL_DETAIL_START
INFO:pyaf.std:MODEL_TYPE PYTORCH
INFO:pyaf.std:PYTORCH_MODEL_ARCHITECTURE [Sequential(
  (0): Linear(in_features=51, out_features=51, bias=True)
  (1): Dropout(p=0.5, inplace=False)
  (2): Linear(in_features=51, out_features=1, bias=True)
)]
INFO:pyaf.std:AR_MODEL_DETAIL_END
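For reference, the logged architecture can be rebuilt with a few lines of PyTorch. This is a reconstruction from the log only (layer sizes and dropout rate as printed), not PyAF's own model-building code; the function names are hypothetical. The parameter count is computed by hand so that it can be checked without PyTorch installed.

```python
def build_mlp(n_inputs=51, dropout=0.5):
    """Rebuild the logged Sequential: Linear -> Dropout -> Linear."""
    # Deferred import: PyTorch is treated as an optional dependency here.
    from torch import nn

    return nn.Sequential(
        nn.Linear(n_inputs, n_inputs),
        nn.Dropout(p=dropout),
        nn.Linear(n_inputs, 1),
    )


def count_parameters(n_inputs=51):
    """Weights plus biases of the two Linear layers (Dropout has none)."""
    hidden = n_inputs * n_inputs + n_inputs  # Linear(51, 51)
    output = n_inputs * 1 + 1                # Linear(51, 1)
    return hidden + output
```

With PyTorch installed, `print(build_mlp())` should show the same layer listing as the log, and `sum(p.numel() for p in build_mlp().parameters())` should agree with `count_parameters()`.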

Model summary for keras :

INFO:pyaf.std:AR_MODEL_DETAIL_START
INFO:pyaf.std:MODEL_TYPE KERAS
INFO:pyaf.std:KERAS_MODEL_ARCHITECTURE {"class_name": "Sequential", "config": {"name": "PyAF_cMLP_Model", "layers": [{"class_name": "InputLayer", "config": {"batch_input_shape": [null, 51], "dtype": "float64", "sparse": false, "ragged": false, "name": "dense_14_input"}}, {"class_name": "Dense", "config": {"name": "dense_14", "trainable": true, "batch_input_shape": [null, 51], "dtype": "float64", "units": 51, "activation": "linear", "use_bias": true, "kernel_initializer": {"class_name": "GlorotUniform", "config": {"seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}, {"class_name": "Dropout", "config": {"name": "dropout_7", "trainable": true, "dtype": "float64", "rate": 0.1, "noise_shape": null, "seed": null}}, {"class_name": "Dense", "config": {"name": "dense_15", "trainable": true, "dtype": "float64", "units": 1, "activation": "linear", "use_bias": true, "kernel_initializer": {"class_name": "GlorotUniform", "config": {"seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}]}, "keras_version": "2.8.0", "backend": "tensorflow"}
INFO:pyaf.std:AR_MODEL_DETAIL_END
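The JSON dump is hard to read as a single line. A short sketch that lists the layers follows; the JSON here is an abbreviated excerpt of the logged architecture, keeping only the fields that are read, and `summarize_layers` is a hypothetical helper.

```python
import json

# Abbreviated excerpt of the KERAS_MODEL_ARCHITECTURE JSON logged above;
# only the fields read below are kept.
architecture_json = """
{"class_name": "Sequential", "config": {"name": "PyAF_cMLP_Model", "layers": [
  {"class_name": "InputLayer", "config": {"batch_input_shape": [null, 51], "name": "dense_14_input"}},
  {"class_name": "Dense", "config": {"name": "dense_14", "units": 51, "activation": "linear"}},
  {"class_name": "Dropout", "config": {"name": "dropout_7", "rate": 0.1}},
  {"class_name": "Dense", "config": {"name": "dense_15", "units": 1, "activation": "linear"}}
]}, "keras_version": "2.8.0", "backend": "tensorflow"}
"""


def summarize_layers(text):
    """Return (class_name, size) pairs; size is the unit count or dropout rate."""
    layers = json.loads(text)["config"]["layers"]
    summary = []
    for layer in layers:
        config = layer["config"]
        size = config.get("units", config.get("rate"))
        summary.append((layer["class_name"], size))
    return summary


print(summarize_layers(architecture_json))
```

Both summaries describe the same 51 -> 51 -> 1 layout, but the logged dropout differs (rate 0.1 for keras, p=0.5 for pytorch), which may explain part of the MAPE differences between the two backends.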

FIXED