This work compares how model convergence and performance vary as class imbalance changes. To assess this variation, three datasets with differing degrees of class imbalance are considered: the Malimg dataset, the Malevis dataset, and a Blended dataset. The comparison covers the following state-of-the-art CNNs from the Keras library: InceptionNet, ResNet50, DenseNet169, EfficientNetB4, InceptionResNetV2, and VGG16.
- The Malimg dataset is available here.
- The Malevis dataset is available here and its website is here.
- The Blended Malware dataset is available here.
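As a rough illustration of what "class imbalance" means for datasets like these, the sketch below computes a simple imbalance ratio (largest class size divided by smallest) and inverse-frequency class weights from a list of labels. The function name `imbalance_summary`, the family names, and the label counts are all hypothetical and chosen only for the example. They are not taken from Malimg, Malevis, or the Blended dataset, and the weighting heuristic is not claimed to be what the notebooks use.

```python
from collections import Counter


def imbalance_summary(labels):
    """Summarize class imbalance for a list of class labels (hypothetical helper)."""
    counts = Counter(labels)
    total = sum(counts.values())
    n_classes = len(counts)

    # Ratio of the largest class to the smallest; 1.0 means perfectly balanced.
    ratio = max(counts.values()) / min(counts.values())

    # Common "balanced" heuristic: weight = total / (n_classes * class_count),
    # so rarer classes receive proportionally larger weights.
    class_weights = {c: total / (n_classes * n) for c, n in counts.items()}

    return {
        "counts": dict(counts),
        "imbalance_ratio": ratio,
        "class_weights": class_weights,
    }


# Toy labels (made up for illustration): 80, 15, and 5 samples per family.
toy_labels = ["family_a"] * 80 + ["family_b"] * 15 + ["family_c"] * 5
summary = imbalance_summary(toy_labels)
print(summary["imbalance_ratio"])
print(summary["class_weights"])
```

A dictionary like `class_weights` could, after mapping class names to integer indices, be passed to the `class_weight` argument of Keras `Model.fit`. Whether that helps on a given dataset is an empirical question.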
Malware classification, along with a minor fix to it, was carried out on Kaggle using a P100 GPU. The code can be found in the respective notebooks:
The tabulated results for the comparison are shown in the figure below:
@misc{m2023comparative,
title={Comparative Analysis of Imbalanced Malware Byteplot Image Classification using Transfer Learning},
author={Jayasudha M and Ayesha Shaik and Gaurav Pendharkar and Soham Kumar and Muhesh Kumar B and Sudharshanan Balaji},
year={2023},
eprint={2310.02742},
archivePrefix={arXiv},
primaryClass={cs.LG}
}