DBN-based NIDS on the CICIDS2017 Dataset

Introduction

This is the source code for the paper entitled "An Intrusion Detection System based on Deep Belief Networks", accepted at the 4th International Conference on Science of Cyber Security (SciSec).

In this repository, we propose a multi-class classification NIDS based on Deep Belief Networks (DBNs). A DBN is a generative graphical model formed by stacking multiple Restricted Boltzmann Machines (RBMs), where the hidden layer of one RBM serves as the visible layer of the next. Its deep architecture allows it to identify and learn high-dimensional representations. We conducted multiple experiments using the CICIDS2017 dataset [1] with various class-balancing techniques.
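To make the idea of stacking concrete, here is a minimal pure-Python sketch of the upward pass through a stack of RBMs. It is an illustration only, not the code in models/RBM.py or models/DBN.py: the names `TinyRBM`, `build_stack` and `propagate_up` are hypothetical, the weights are random, and training (contrastive divergence) is left out entirely.

```python
import math
import random
from typing import List


class TinyRBM:
    """A toy RBM that only stores weights and hidden biases (illustrative)."""

    def __init__(self, n_visible: int, n_hidden: int, rng: random.Random) -> None:
        # Small random weights, one row per hidden unit.
        self.weights = [[rng.gauss(0.0, 0.01) for _ in range(n_visible)]
                        for _ in range(n_hidden)]
        self.hidden_bias = [0.0] * n_hidden

    def hidden_probs(self, visible: List[float]) -> List[float]:
        """Return p(h_j = 1 | v) = sigmoid(b_j + sum_i W_ji * v_i)."""
        return [
            1.0 / (1.0 + math.exp(-(b + sum(w * v for w, v in zip(row, visible)))))
            for row, b in zip(self.weights, self.hidden_bias)
        ]


def build_stack(n_visible: int, hidden_sizes: List[int], seed: int = 0) -> List[TinyRBM]:
    """Chain RBMs so that each hidden layer feeds the next visible layer."""
    rng = random.Random(seed)
    sizes = [n_visible, *hidden_sizes]
    return [TinyRBM(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def propagate_up(stack: List[TinyRBM], visible: List[float]) -> List[float]:
    """Pass an input through every RBM in turn and return the top-level features."""
    activations = visible
    for rbm in stack:
        activations = rbm.hidden_probs(activations)
    return activations


stack = build_stack(n_visible=4, hidden_sizes=[3, 2])
features = propagate_up(stack, [0.0, 1.0, 0.5, 0.25])
print(len(features))  # 2, the size of the last hidden layer
```

In a full DBN, each RBM would first be pre-trained on the activations of the layer below it before the whole stack is fine-tuned for classification.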

Installation

  • To run the scripts, first make sure Python is installed globally on your computer. If it is not, you can get Python here.

  • Then, clone the repo to your PC and change the branch:

        $ git clone https://github.com/othmbela/dbn-based-nids.git
  • Dependencies

    1. Change into the cloned repository:
          $ cd dbn-based-nids
    2. Initialise the project:
          $ make init

    This command first creates a virtual environment and installs the dependencies needed to run the app, then creates the data folders.

Data Preparation

  • Download the dataset from here.
  • Move the CSV files to the ./data/raw/ directory.
  • Afterwards, pre-process the dataset using the following command:
    $ make dataset

This generates multiple pickle files that we will use to train and evaluate our models. More details about the pre-processing can be found here.
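Before running the pre-processing step, it can help to confirm that the CSV files actually landed in the raw data directory. The helper below is a small sketch for that purpose; `find_raw_csvs` is a hypothetical name and is not part of the repository.

```python
from pathlib import Path
from typing import List


def find_raw_csvs(raw_dir: str = "./data/raw") -> List[Path]:
    """Return the CSV files in raw_dir, sorted by name; raise if none are found."""
    directory = Path(raw_dir)
    if not directory.is_dir():
        raise FileNotFoundError(f"{directory} does not exist; run `make init` first")
    # Match the extension case-insensitively so that *.CSV files are also found.
    csv_files = sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"
    )
    if not csv_files:
        raise FileNotFoundError(f"no CSV files found in {directory}")
    return csv_files
```

Calling `find_raw_csvs()` from the repository root either lists the files that are ready for pre-processing or fails early with a readable message.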

Usage

Once the data is ready, you can train the models using config files, which are in .json format:

    {
        "name": "deep_belief_network",
        "model": {                                       
            "type": "DBN",
            "args": {                                       // model parameters
                "n_visible": 49,
                "n_hidden": [128, 256, 128, 128, 64],
                "n_classes": 6,
                "learning_rate": [0.1, 0.1, 0.1, 0.1, 0.1],
                "momentum": [0.9, 0.9, 0.9, 0.9, 0.9],
                "decay": [0, 0, 0, 0, 0],
                "batch_size": [64, 64, 64, 64, 64],
                "num_epochs": [10, 10, 10, 10, 10],
                "k": [1, 1, 1, 1, 1]
            }
        },
        "data_loader": {
            "type": "InstacartDataLoader",                  // selecting data loader
            "args": {
                "batch_size": 128                           // batch size
            }
        },
        "optimizer": {
            "type": "Adam",
            "args": {
                "lr": 0.001,                                // learning rate
                "weight_decay": 0,                          // weight decay
                "amsgrad": false,
                "balanced": false
            }
        },
        "loss": {
            "type": "CrossEntropyLoss",                     // loss function
            "args": {
                "reduction": "mean"
            }
        },
        "trainer": {
            "num_epochs": 30                                // number of training epochs
        }
    }
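Two details of the config above are worth illustrating. First, standard JSON has no comment syntax, so a file containing `//` annotations like the ones shown would need them stripped before `json.loads` can parse it. Second, every per-layer list in `model.args` has five entries, one per entry of `n_hidden`; the sketch below assumes this means one value per RBM and checks it. The function names are hypothetical and this is not the repository's own loader.

```python
import json
from typing import Dict, List, Tuple

PER_LAYER_KEYS = ("learning_rate", "momentum", "decay", "batch_size", "num_epochs", "k")


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside of JSON string literals."""
    cleaned = []
    for line in text.splitlines():
        in_string = False
        escaped = False
        cut = len(line)
        for i, ch in enumerate(line):
            if escaped:
                escaped = False          # this character was escaped, skip it
            elif ch == "\\":
                escaped = in_string      # backslashes only escape inside strings
            elif ch == '"':
                in_string = not in_string
            elif ch == "/" and not in_string and line[i:i + 2] == "//":
                cut = i
                break
        cleaned.append(line[:cut])
    return "\n".join(cleaned)


def load_config(text: str) -> Dict:
    """Parse config text after dropping // comments."""
    return json.loads(strip_line_comments(text))


def rbm_shapes(model_args: Dict) -> List[Tuple[int, int]]:
    """Check the per-layer lists and return (n_visible, n_hidden) for each RBM."""
    hidden = model_args["n_hidden"]
    for key in PER_LAYER_KEYS:
        if len(model_args[key]) != len(hidden):
            raise ValueError(
                f"'{key}' has {len(model_args[key])} entries, expected {len(hidden)}"
            )
    sizes = [model_args["n_visible"], *hidden]
    return list(zip(sizes[:-1], sizes[1:]))


EXAMPLE = """
{
    "model": {
        "type": "DBN",
        "args": {                                  // model parameters
            "n_visible": 49,
            "n_hidden": [128, 256, 128, 128, 64],
            "learning_rate": [0.1, 0.1, 0.1, 0.1, 0.1],
            "momentum": [0.9, 0.9, 0.9, 0.9, 0.9],
            "decay": [0, 0, 0, 0, 0],
            "batch_size": [64, 64, 64, 64, 64],
            "num_epochs": [10, 10, 10, 10, 10],
            "k": [1, 1, 1, 1, 1]
        }
    }
}
"""

config = load_config(EXAMPLE)
shapes = rbm_shapes(config["model"]["args"])
print(shapes)  # [(49, 128), (128, 256), (256, 128), (128, 128), (128, 64)]
```

Under this reading, the 49 input features feed the first RBM, and the final 64-unit layer feeds the 6-class output.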

Additional configurations can be added in the future. For now, to train the DBN and MLP models, run the following commands:

    # train the deep belief network
    $ python main.py --config ./configs/deepBeliefNetwork.json

    # train the multi-layer perceptron
    $ python main.py --config ./configs/multilayerPerceptron.json
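For readers unfamiliar with this calling convention, the snippet below sketches how a `--config` flag can be read with the standard `argparse` module. It is an assumption about the general pattern, not a copy of the repository's main.py, which may parse its arguments differently.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the command line; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Train a model from a JSON config file.")
    parser.add_argument("--config", required=True, help="path to a JSON config file")
    return parser.parse_args(argv)


# Passing the arguments explicitly mirrors the first command above.
args = parse_args(["--config", "./configs/deepBeliefNetwork.json"])
print(args.config)  # ./configs/deepBeliefNetwork.json
```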

Files and Folders structure

    ├── checkpoints/                                        # store the trained models as *.pt file.
    │
    ├── configs/
    │
    ├── data/                                               # default directory for storing input data.
    │   ├── processed                                       # final data for modelling.
    │   └── raw                                             # original data.
    │
    ├── images/                                             # store images
    │
    ├── logger/                                             # setup the logger using logger_config.json
    ├── logs/                                               # store *.logs
    │
    ├── models/                                             # pytorch models.
    │   ├── __init__.py
    │   ├── DBN.py
    │   ├── MLP.py
    │   └── RBM.py
    │
    ├── notebooks/                                          # jupyter notebooks.
    │
    ├── preprocessing/                                      # scripts for preprocessing the dataset.
    │
    ├── utils/
    │   ├── __init__.py
    │   ├── datasets.py
    │   ├── models.py
    │   ├── test.py                                         # evaluation of trained model.
    │   ├── train.py                                        # main script to start training.
    │   ├── utils.py                                        # small utility functions.
    │   └── visualisation.py                                # functions to visualise the results.
    │
    ├── venv/                                               # virtual environment.
    │
    ├── .gitignore
    ├── LICENSE
    ├── main.py
    ├── Makefile
    ├── README.md                                           # top-level README for this project.
    └── requirements.txt                                    # requirements.txt file for reproducing the experiments.

Requirements

All the experiments were conducted on a 64-bit Intel(R) Core(TM) i7-7500U CPU with 16GB RAM in a Windows 10 environment. The models were implemented in Python v3.8.2 using the PyTorch v1.9.0 library.

References

[1] Iman Sharafaldin, Arash Habibi Lashkari, and Ali A. Ghorbani, “Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization”, 4th International Conference on Information Systems Security and Privacy (ICISSP), Portugal, January 2018

License

This project is released under the Apache 2.0 license.

Authors

Othmane Belarbi, Aftab Khan, Pietro Carnelli and Theodoros Spyridopoulos

Citation

If you find this code useful in your research, please cite this article as:

@misc{belarbi2022intrusion,
    doi = {10.48550/ARXIV.2207.02117},
    url = {https://arxiv.org/abs/2207.02117},
    author = {Belarbi, Othmane and Khan, Aftab and Carnelli, Pietro and Spyridopoulos, Theodoros},
    keywords = {Cryptography and Security (cs.CR), Machine Learning (cs.LG), FOS: Computer and information sciences},
    title = {An Intrusion Detection System based on Deep Belief Networks},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}