Malware Byteplot Image Classification using Machine Learning and Deep Learning

Malware Byteplot Image Classification

Aim

A comparison of how model convergence and performance vary with class imbalance. To assess this variation, three datasets with differing degrees of class imbalance are created: the Malimg dataset, the Malevis dataset, and the Blended dataset. The comparison covers the following state-of-the-art CNNs from the Keras library: InceptionNet, ResNet50, DenseNet169, EfficientNetB4, InceptionResNetV2, and VGG16.
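Since the comparison hinges on class imbalance, it can help to quantify it for each dataset. The following is a minimal sketch, not taken from the notebooks: the function name `imbalance_summary`, the example labels, and the inverse-frequency ("balanced") weighting are all assumptions made for illustration.

```python
from collections import Counter


def imbalance_summary(labels):
    """Return per-class counts, the max/min imbalance ratio,
    and inverse-frequency class weights for a list of labels."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # Ratio of the largest class to the smallest class
    ratio = max(counts.values()) / min(counts.values())
    # "Balanced" heuristic: weight_c = n_samples / (n_classes * count_c)
    weights = {c: n_samples / (n_classes * k) for c, k in counts.items()}
    return counts, ratio, weights


# Hypothetical labels, not drawn from Malimg, Malevis, or the Blended dataset
labels = ["family_a"] * 8 + ["family_b"] * 2
counts, ratio, weights = imbalance_summary(labels)
print(counts, ratio, weights)
```

A larger ratio indicates a more skewed dataset. Weights of this kind could be passed to a training loop as per-class loss weights, although whether the notebooks do so is not stated here.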

Datasets

  • The Malimg dataset is available here.
  • The Malevis dataset is available here and its website is here.
  • The Blended Malware dataset is available here.
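For readers unfamiliar with byteplots, the sketch below shows the commonly described construction, in which each byte of a binary becomes one grayscale pixel and the bytes are laid out in fixed-width rows. It is an illustration under stated assumptions, not the procedure used to build the datasets above: the function name `bytes_to_byteplot`, the default `width`, and the zero-padding of the last row are hypothetical choices.

```python
def bytes_to_byteplot(data: bytes, width: int = 256):
    """Arrange raw bytes into rows of `width` grayscale pixels (0-255)."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Zero-pad the tail so that the last row is complete (an assumption;
    # truncating the partial row would be another option)
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Tiny example: 10 bytes laid out 4 pixels per row
image = bytes_to_byteplot(bytes(range(10)), width=4)
for row in image:
    print(row)
```

The result is a plain list of rows, which can be converted to an array or image object by whichever imaging library is in use.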

Kaggle Notebooks

Malware classification, and a minor fix to it, were carried out on Kaggle using a P100 GPU. The code can be found in the respective notebooks:

Results

The tabulated results for the comparison are shown in the figure below:

Conference Paper Citation

@misc{m2023comparative,
      title={Comparative Analysis of Imbalanced Malware Byteplot Image Classification using Transfer Learning}, 
      author={Jayasudha M and Ayesha Shaik and Gaurav Pendharkar and Soham Kumar and Muhesh Kumar B and Sudharshanan Balaji},
      year={2023},
      eprint={2310.02742},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}