EnyanDai/GANF

The dataset in train_water.py


Hi, the dataset used to train GANF in train_water.py is SWaT_Dataset_Attack_v0.csv. When train_water.py runs, SWaT_Dataset_Attack_v0.csv is split into train/val/test dataloaders. I don't understand why the model is trained on SWaT_Dataset_Attack_v0.csv. I think it would be more reasonable to train on SWaT_Dataset_Normal_v1.csv, which contains no attacks, and to test on SWaT_Dataset_Attack_v0.csv. I think training this way would make the attacked points more likely to fall in areas of low probability density. Thank you very much!

Hi,
In real-world applications, anomalies are generally mixed in with the normal points. Thus, it is more realistic to use a dataset that contains both normal points and a small fraction of anomalies as the training set.
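
To make the setup concrete, here is a minimal sketch of splitting one contaminated, time-ordered log into train/val/test sets and checking how many anomalies land in each part. It is not the code from train_water.py: the `chronological_split` and `anomaly_fraction` helpers, the 60/20/20 ratios, the 10% contamination rate, and the synthetic data are all assumptions made for illustration. It also assumes the labels are used only for evaluation, so the model never sees them during training.

```python
import random


def chronological_split(rows, train_frac=0.6, val_frac=0.2):
    """Split time-ordered rows into train/val/test without shuffling.

    The fractions are illustrative assumptions, not values from the repo.
    """
    n = len(rows)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]


def anomaly_fraction(rows):
    """Share of rows whose label is 1 (attack)."""
    if not rows:
        return 0.0
    return sum(label for _, label in rows) / len(rows)


# Synthetic stand-in for a contaminated log: (sensor_value, label) pairs,
# where roughly 10% of the points are marked as attacks (hypothetical rate).
rng = random.Random(0)
rows = [(rng.gauss(0.0, 1.0), 1 if rng.random() < 0.1 else 0)
        for _ in range(1000)]

train, val, test = chronological_split(rows)

# The labels are only inspected here; an unsupervised density model would be
# fit on the sensor values of `train` alone.
for name, part in (("train", train), ("val", val), ("test", test)):
    print(f"{name}: {len(part)} rows, "
          f"anomaly fraction {anomaly_fraction(part):.3f}")
```

With this kind of split, the training portion contains a small share of attacked points, which matches the contaminated-training scenario described in the reply. Training on SWaT_Dataset_Normal_v1.csv instead would correspond to passing a training set whose anomaly fraction is zero.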