
PyTorch implementation of Deep Packet: A Novel Approach for Encrypted Traffic Classification Using Deep Learning

Primary language: Jupyter Notebook. License: MIT.

Deep Packet

Details in blog post: https://blog.munhou.com/2020/04/05/Pytorch-Implementation-of-Deep-Packet-A-Novel-Approach-For-Encrypted-Tra%EF%AC%83c-Classi%EF%AC%81cation-Using-Deep-Learning/

Changelog

EDIT: 2022-11-30

  • Add the ResNet model. Kudos to Taehyun for implementing ResNet.

EDIT: 2022-09-27

  • Update dataset and model
  • Update dependencies
  • Add more data to chat, file_transfer, voip, streaming and vpn_voip
  • Remove tor and torrent related data as they are no longer available

EDIT: 2022-01-18

  • Update dataset and model

EDIT: 2022-01-17

  • Update code and model
  • Drop petastorm; use Hugging Face's datasets library for data loading instead

How to Use

  • Clone the project
  • Create environment via conda
    • For Mac
      conda env create -f env_mac.yaml
    • For Linux (CPU only)
      conda env create -f env_linux_cpu.yaml
    • For Linux (CUDA 10.2)
      conda env create -f env_linux_cuda102.yaml
    • For Linux (CUDA 11.3)
      conda env create -f env_linux_cuda113.yaml
  • Download the train and test sets I created here, or download the full dataset if you want to process the data from scratch.
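
If you script the setup, the choice of environment file listed above can be captured in a small helper. This is only a sketch: the function name pick_env_file and its arguments are hypothetical and not part of this repository; only the four YAML file names come from the list above.

```python
from typing import Optional

# Environment files listed in "How to Use", keyed by CUDA version for Linux.
LINUX_CUDA_ENVS = {
    "10.2": "env_linux_cuda102.yaml",
    "11.3": "env_linux_cuda113.yaml",
}


def pick_env_file(system: str, cuda: Optional[str] = None) -> str:
    """Return the conda environment file for a platform.

    `system` is the value of platform.system() ("Darwin" or "Linux").
    `cuda` is a CUDA version string, or None for a CPU-only Linux setup.
    Hypothetical helper, not part of the repository.
    """
    if system == "Darwin":
        return "env_mac.yaml"
    if system == "Linux":
        if cuda is None:
            return "env_linux_cpu.yaml"
        if cuda in LINUX_CUDA_ENVS:
            return LINUX_CUDA_ENVS[cuda]
        raise ValueError(f"no environment file listed for CUDA {cuda}")
    raise ValueError(f"no environment file listed for {system}")
```

For example, pick_env_file(platform.system()) gives the file name to pass to conda env create -f on a Mac or a CPU-only Linux machine.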

Data Pre-processing

python preprocessing.py -s /path/to/CompletePcap/ -t processed_data
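
This README does not describe what preprocessing.py does internally. As a rough illustration of the idea in the Deep Packet paper, where each packet becomes a fixed-length vector of normalized byte values, here is a minimal sketch of that last step only. The function name packet_to_vector is hypothetical, the default length of 1500 bytes is an assumption taken from the paper rather than from this repository's code, and header handling (removing the Ethernet header, masking IP addresses) is left out entirely.

```python
from typing import List


def packet_to_vector(raw: bytes, max_length: int = 1500) -> List[float]:
    """Truncate or zero-pad raw packet bytes, then scale each byte to [0, 1].

    Hypothetical helper; max_length=1500 is an assumption, not a value
    read from preprocessing.py.
    """
    clipped = raw[:max_length]                            # drop bytes beyond max_length
    padded = clipped + bytes(max_length - len(clipped))   # zero-pad short packets
    return [b / 255 for b in padded]                      # byte value 0..255 -> 0.0..1.0


# Every packet ends up with the same length, whatever its original size.
vector = packet_to_vector(b"\x00\x7f\xff")
print(len(vector), vector[:3])
```

A fixed length matters because the convolutional models expect every input to have the same shape.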

Create Train and Test

python create_train_test_set.py -s processed_data -t train_test_data

Train Model

Application Classification

For CNN model

python train_cnn.py -d train_test_data/application_classification/train.parquet -m model/application_classification.cnn.model -t app

For ResNet model

python train_resnet.py -d train_test_data/application_classification/train.parquet -m model/application_classification.resnet.model -t app

Traffic Classification

For CNN model

python train_cnn.py -d train_test_data/traffic_classification/train.parquet -m model/traffic_classification.cnn.model -t traffic

For ResNet model

python train_resnet.py -d train_test_data/traffic_classification/train.parquet -m model/traffic_classification.resnet.model -t traffic
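
The four training commands above differ only in the script, the task directory and the -t value, so they can be generated from one place. The sketch below is a hypothetical helper, not part of the repository: the flags -d, -m and -t and the directory layout are copied from the commands above, while giving each architecture its own output file name is an assumption made so that a ResNet run does not overwrite a CNN model.

```python
from typing import List

# Value of -t mapped to the directory name used under train_test_data/.
TASKS = {
    "app": "application_classification",
    "traffic": "traffic_classification",
}


def build_train_command(arch: str, task: str,
                        data_root: str = "train_test_data",
                        model_dir: str = "model") -> List[str]:
    """Build the argument list for one training run (hypothetical helper)."""
    if arch not in ("cnn", "resnet"):
        raise ValueError(f"unknown architecture: {arch}")
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    name = TASKS[task]
    return [
        "python", f"train_{arch}.py",
        "-d", f"{data_root}/{name}/train.parquet",
        # Assumption: one output file per architecture.
        "-m", f"{model_dir}/{name}.{arch}.model",
        "-t", task,
    ]


# Print all four combinations as shell commands.
for arch in ("cnn", "resnet"):
    for task in TASKS:
        print(" ".join(build_train_command(arch, task)))
```

Each returned list can be passed to subprocess.run(..., check=True) from the project root to launch the run.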

Evaluation Result (CNN)

Application Classification

Traffic Classification

Model Files

Download the pre-trained CNN models here.

Elapsed Time

Preprocessing

Code ran on AWS c5.4xlarge

7:01:32 elapsed

Train and Test Creation

Code ran on AWS c5.4xlarge

2:55:46 elapsed

Traffic Classification Model Training (CNN)

Code ran on AWS g5.xlarge

24:41 elapsed

Application Classification Model Training (CNN)

Code ran on AWS g5.xlarge

7:55 elapsed
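
To estimate how long a full run takes from raw pcap files to both CNN models, the four timings above can be added up. The snippet below is a small sketch of that arithmetic; the helper names are hypothetical. Keep in mind that the stages ran on different instance types, so the sum is only a rough end-to-end figure.

```python
def to_seconds(stamp: str) -> int:
    """Convert "H:MM:SS" or "MM:SS" to a number of seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def to_stamp(seconds: int) -> str:
    """Convert a number of seconds back to "H:MM:SS"."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Timings reported in the "Elapsed Time" section.
ELAPSED = {
    "preprocessing": "7:01:32",
    "train and test creation": "2:55:46",
    "traffic classification training (CNN)": "24:41",
    "application classification training (CNN)": "7:55",
}

total = sum(to_seconds(stamp) for stamp in ELAPSED.values())
print(to_stamp(total))  # prints 10:29:54
```

Data preparation accounts for nearly all of the total; the two training runs together take about half an hour.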