Need help reproducing the F1-scores on the SWaT dataset
Hi @zhhlee, thank you so much for this codebase and for your earlier help in getting the code working on the SWaT dataset. I am now trying to reproduce the results from your paper for the SWaT dataset, but I am running into some problems. Hope you can help!
I have run training and prediction twice (with --mcmc_track=False for prediction), using the flags you provided for the SWaT dataset. The first run gave a best f1-score of 0.844 and the second gave 0.864, whereas the paper reports a best f1-score of 0.928. I am also trying to run prediction with mcmc_tracker=True, but that would take around two days to produce results, compared with about 8 hours when mcmc_tracker=False.
Could you please check whether the flags and hyperparameters in your GitHub code are set correctly? Many thanks for your help and awesome work!
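For reference, a "best f1-score" is commonly obtained by sweeping a threshold over the anomaly scores and keeping the highest F1. The sketch below shows that idea only. It assumes a higher score means more anomalous (negate the scores first if the model outputs log-likelihoods), and `best_f1` is a hypothetical helper, not a function from this repository. The repository's own evaluation may differ, for example by adjusting predictions over whole anomaly segments, which this sketch does not do.

```python
def best_f1(scores, labels):
    """Return (best F1, threshold), predicting anomaly when score >= threshold.

    scores: per-point anomaly scores (assumption: higher = more anomalous).
    labels: per-point ground truth, 1 for anomaly and 0 for normal.
    """
    total_pos = sum(labels)
    # Visit points from the highest score down, so that lowering the
    # threshold only ever adds new predicted anomalies.
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    best_score, best_threshold = 0.0, None
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every point tied at this score before evaluating F1.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        fn = total_pos - tp
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_score:
            best_score, best_threshold = f1, threshold
    return best_score, best_threshold
```

Because the threshold is chosen on the test labels, the best F1 is an upper bound for a given set of scores, so differences between runs reflect differences in the scores themselves.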
I have done the following for the data processing:
1. Download the SWaT datasets.
2. Use xlsx2csv to convert SWaT_Dataset_Attack_v0.xlsx to SWAT_Dataset_Attack_v0.csv, and do the same for SWAT_Dataset_Normal_v0.xlsx.
3. Use explib/raw_data_converter to convert the respective csv files to pkl files.
I use the following command for training as you suggested:
python stack_train.py --dataset=SWaT --train.train_start=21600 --train.valid_portion=0.1 --model.window_length=30 '--model.output_shape=[15, 15, 30]' --model.z2_dim=8 --output-dir=/tmp/output/interfusion/SWAT/train_1
I use the following command for prediction as you suggested:
python stack_predict.py --load_model_dir=/tmp/output/interfusion/SWAT/train_1 --output-dir=/tmp/output/interfusion/SWAT/pred_1 --mcmc_track=False
- It seems that the code is all right. For the SWaT dataset, some hyperparameters are set in ExpConfig (managed by the mltk package). You can check the config.json file generated in output_dir to see whether they are set correctly.
- Model training may not be very stable on some of the datasets, mainly because of the training of the flexible RNVP posterior. This can lead to a slight performance degradation when the model happens not to be well trained.
- For the SWaT dataset, we use mcmc_tracker=False and use_mcmc=True for testing. By the way, to reduce the testing time, you can also set plot_recons_results=False and save_results=False when testing.
- To help you reproduce the results, we have put our model for SWaT at https://1drv.ms/f/s!AsTNHlSUTQHXg3s-uVG9HQ23BO28, which achieves a best f1-score of 0.928 (with a fluctuation of around 0.003 due to the stochasticity of MCMC imputation in the inference phase).
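The config.json check suggested above could be done by hand, or with a small script along these lines. This is only a sketch: the expected values are copied from the training command in this thread, while the layout of config.json (nested sections or flat dotted keys) is an assumption, so the helper accepts either. `flatten` and `check_config` are hypothetical helpers, not part of mltk or this repository.

```python
import json

# Values taken from the training command quoted in this thread.
EXPECTED = {
    "dataset": "SWaT",
    "train.train_start": 21600,
    "train.valid_portion": 0.1,
    "model.window_length": 30,
    "model.output_shape": [15, 15, 30],
    "model.z2_dim": 8,
}


def flatten(config, prefix=""):
    """Turn nested dicts into dotted keys: {"model": {"z2_dim": 8}} -> {"model.z2_dim": 8}."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def check_config(config_path, expected):
    """Return {key: (expected, found)} for each expected key that is missing or differs."""
    with open(config_path) as f:
        found = flatten(json.load(f))
    mismatches = {}
    for key, want in expected.items():
        # "<missing>" marks a key that does not appear in the file at all.
        have = found.get(key, "<missing>")
        if have != want:
            mismatches[key] = (want, have)
    return mismatches
```

Calling `check_config` with the path to the generated config.json and `EXPECTED` returns an empty dict when every listed value matches. Keys reported as `<missing>` may simply be stored under different names, so they are worth checking by eye.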