Network anomaly detection on the NSL-KDD, Kyoto University, and MAWILab datasets
This project has been conducted under the supervision of Dr. Jinoh Kim and Dr. Donghwoon Kwon at Texas A&M University-Commerce. The research outcome was published in the proceedings of IEEE ICNC 2018 (http://www.conf-icnc.org/2018/) under the title “An Empirical Evaluation of Deep Learning for Network Anomaly Detection”.
The results below are for the NSL-KDD dataset only. The master branch contains the code for NSL-KDD; separate dev branches hold the code for the Kyoto University and MAWILab datasets. The implemented networks are the same for all datasets.
Exploratory Data Analysis
Andrews Curves (high-dimensional data plots)
t-SNE (data dimensionality reduction)
Pattern evolving across epochs
Pattern at the final (4000th) epoch
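A minimal sketch of how such plots can be produced with pandas and scikit-learn is shown below; the file name, column names, and the preprocessed DataFrame layout (scaled numeric features plus a `label` column) are assumptions for illustration, not the exact code in this repository.

```python
# Sketch: Andrews curves and t-SNE on a preprocessed NSL-KDD frame.
# Assumes a CSV with scaled numeric features and a "label" column
# holding "normal"/"anomaly"; adapt the names to your own data.
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import andrews_curves
from sklearn.manifold import TSNE

df = pd.read_csv("nsl_kdd_preprocessed.csv")      # hypothetical file name
sample = df.sample(n=2000, random_state=0)        # plotting every row is slow

# Andrews curves: each record becomes a Fourier-series curve, colored by label.
andrews_curves(sample, class_column="label")
plt.title("Andrews curves")
plt.show()

# t-SNE: project the high-dimensional features to 2-D for visual inspection.
features = sample.drop(columns=["label"]).values
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)
plt.scatter(embedding[:, 0], embedding[:, 1],
            c=(sample["label"] == "anomaly").astype(int), cmap="coolwarm", s=5)
plt.title("t-SNE projection")
plt.show()
```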
Results of Train/Test cycles
Fully Connected Neural Network
| Model | Scenario | Number of Features | Accuracy | F1 Score | Precision | Recall |
|-------|----------|--------------------|----------|----------|-----------|--------|
| Fully Connected | Train+_Test+ | 48 | 0.8670 | 0.8739 | 0.9490 | 0.8098 |
| Fully Connected | Train+_Test- | 48 | 0.7576 | 0.8350 | 0.9424 | 0.7495 |
| Fully Connected | Train-_Test+ | 48 | 0.8561 | 0.8695 | 0.8988 | 0.8420 |
| Fully Connected | Train-_Test- | 48 | 0.7504 | 0.8396 | 0.8856 | 0.7981 |
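For reference, a minimal Keras sketch of a fully connected binary classifier over the 48 selected features is given below; the layer widths, optimizer, and training settings are illustrative assumptions, not the exact architecture evaluated in the paper.

```python
# Sketch: fully connected (dense) binary classifier for 48 NSL-KDD features.
# Layer widths and training settings are illustrative, not the paper's exact ones.
from tensorflow import keras
from tensorflow.keras import layers

def build_fcn(num_features=48):
    model = keras.Sequential([
        layers.Input(shape=(num_features,)),
        layers.Dense(128, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),   # normal vs. anomaly
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy",
                           keras.metrics.Precision(),
                           keras.metrics.Recall()])
    return model

# Usage sketch with preprocessed arrays X_train, y_train, X_test, y_test:
# model = build_fcn()
# model.fit(X_train, y_train, epochs=20, batch_size=256, validation_split=0.1)
# model.evaluate(X_test, y_test)
```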
Variational Autoencoder
The latent variables are used for prediction.
| Model | Scenario | Number of Features | Accuracy | F1 Score | Precision | Recall |
|-------|----------|--------------------|----------|----------|-----------|--------|
| VAE-Softmax | Train+_Test+ | 122 | 0.8948 | 0.9036 | 0.9441 | 0.8665 |
| VAE-Softmax | Train+_Test- | 122 | 0.8173 | 0.8814 | 0.9402 | 0.8296 |
| VAE-Softmax | Train-_Test+ | 48 | 0.7195 | 0.6942 | 0.9151 | 0.5592 |
| VAE-Softmax | Train-_Test- | 48 | 0.8015 | 0.8700 | 0.9373 | 0.8118 |
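The idea is to train a VAE on the input features and then feed the learned latent variables to a softmax classifier. Below is a minimal Keras sketch of that two-stage setup; the latent dimension, layer widths, and training settings are illustrative assumptions rather than the configuration used for the results above.

```python
# Sketch: VAE whose latent code feeds a softmax classifier (the VAE-Softmax idea).
# Latent size, layer widths, and two-stage training schedule are assumptions.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

NUM_FEATURES, LATENT_DIM = 122, 10

class Sampling(layers.Layer):
    """Reparameterization trick (z = mu + sigma * eps) plus the KL penalty."""
    def call(self, inputs):
        z_mean, z_log_var = inputs
        kl = -0.5 * tf.reduce_mean(
            tf.reduce_sum(1.0 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=1))
        self.add_loss(kl)
        eps = tf.random.normal(shape=tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * eps

# Encoder: features -> latent distribution parameters -> sampled z
enc_in = keras.Input(shape=(NUM_FEATURES,))
h = layers.Dense(64, activation="relu")(enc_in)
z_mean = layers.Dense(LATENT_DIM)(h)
z_log_var = layers.Dense(LATENT_DIM)(h)
z = Sampling()([z_mean, z_log_var])
encoder = keras.Model(enc_in, z_mean)

# Decoder: z -> reconstructed features
d = layers.Dense(64, activation="relu")(z)
recon = layers.Dense(NUM_FEATURES, activation="sigmoid")(d)

# VAE trained unsupervised: reconstruction (MSE) + KL (added by the Sampling layer)
vae = keras.Model(enc_in, recon)
vae.compile(optimizer="adam", loss="mse")

# Softmax classifier on the latent representation (normal vs. anomaly)
clf = keras.Sequential([
    layers.Input(shape=(LATENT_DIM,)),
    layers.Dense(2, activation="softmax"),
])
clf.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])

# Usage sketch with preprocessed arrays X_train, y_train:
# vae.fit(X_train, X_train, epochs=20, batch_size=256)
# clf.fit(encoder.predict(X_train), y_train, epochs=20, batch_size=256)
```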
Variational Autoencoder
Anomaly labels are treated as part of the actual data.
The network learns to regenerate the labels, treating them as missing data during testing.
| Model | Scenario | Number of Features | Accuracy | F1 Score | Precision | Recall |
|-------|----------|--------------------|----------|----------|-----------|--------|
| VAE-GenerateLabels | Train+_Test+ | 1 | 0.5692 | 0.7255 | 0.5692 | 1.0 |
| VAE-GenerateLabels | Train+_Test- | 1 | 0.8184 | 0.9001 | 0.8184 | 1.0 |
| VAE-GenerateLabels | Train-_Test+ | 1 | 0.5692 | 0.7255 | 0.5692 | 1.0 |
| VAE-GenerateLabels | Train-_Test- | 1 | 0.8184 | 0.9001 | 0.8184 | 1.0 |
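In this variant the label is appended to the feature vector during training, and at test time it is treated as a missing value that the network regenerates. The sketch below illustrates that test-time imputation step, assuming a trained reconstruction model `vae` over the features plus one trailing label column; the placeholder value, label coding, and decision threshold are assumptions.

```python
# Sketch of the test-time imputation step for VAE-GenerateLabels.
# Assumes `vae` was trained to reconstruct [features, label] vectors
# (NUM_FEATURES + 1 columns) and that `X_test` holds the features only.
import numpy as np

def predict_labels(vae, X_test, missing_value=0.5, threshold=0.5):
    # Append a neutral placeholder where the unknown label would go.
    placeholder = np.full((X_test.shape[0], 1), missing_value)
    x_with_missing_label = np.concatenate([X_test, placeholder], axis=1)

    # The VAE regenerates the whole vector; the last column is the label estimate.
    reconstructed = vae.predict(x_with_missing_label)
    label_scores = reconstructed[:, -1]
    return (label_scores >= threshold).astype(int)   # 1 = anomaly (assumed coding)

# Usage sketch:
# y_pred = predict_labels(vae, X_test)
```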
LSTM Seq2Seq
A softmax layer is used to convert the output sequence into a Normal/Anomaly prediction.
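A minimal Keras sketch of an LSTM encoder-decoder with a softmax head is shown below; the sequence length, unit counts, and the way records are grouped into sequences are illustrative assumptions, not the exact configuration used here.

```python
# Sketch: LSTM encoder-decoder whose decoded sequence is reduced to a
# softmax Normal/Anomaly prediction. Sequence length, units, and the
# grouping of records into sequences are illustrative assumptions.
from tensorflow import keras
from tensorflow.keras import layers

SEQ_LEN, NUM_FEATURES, UNITS = 10, 122, 64

inputs = keras.Input(shape=(SEQ_LEN, NUM_FEATURES))

# Encoder: compress the input sequence into a fixed-size state.
_, state_h, state_c = layers.LSTM(UNITS, return_state=True)(inputs)

# Decoder: unroll the encoded state back into an output sequence.
decoded = layers.RepeatVector(SEQ_LEN)(state_h)
decoded = layers.LSTM(UNITS, return_sequences=True)(
    decoded, initial_state=[state_h, state_c])

# Softmax head: pool the decoded sequence and classify Normal vs. Anomaly.
pooled = layers.GlobalAveragePooling1D()(decoded)
outputs = layers.Dense(2, activation="softmax")(pooled)

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Usage sketch with X_seq of shape (num_windows, SEQ_LEN, NUM_FEATURES):
# model.fit(X_seq, y_seq, epochs=20, batch_size=128)
```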