Efficient-Cloud-Integrated-Distributed-Deep-Neural-Network-Framework-for-IoT-Malware-Classification



Note: This repository is still being updated to make our work clearer. The code accompanies our paper of the same name. We recommend reading the paper alongside this repository to better understand the code.

Malware Detection and Classification Using Deep Learning-based Image Processing Approaches

In this project, we use image-processing techniques to classify malware into 9 classes. Our goal is a model that combines high accuracy with small size and low computational cost. We introduce the HierarchicalCloudDNN model for effective and efficient malware detection in real-world scenarios.

Dataset

We used the BIG2015 dataset, available at this link.
The dataset provides several kinds of information about the 9 malware classes. Here, we used the .bytes files and converted them to images suitable for image classification methods. The images are 256 × 256 pixels and single-channel.

Requirements

To install the project requirements, run the command below in your command line:

pip install -r requirements.txt

Code Description

This repository contains a subdirectory named byte_to_image, which holds the code that converts your .bytes data into images of a specified size (256 × 256 in our case). Some of the code handles data preparation, which applies to both the local and on-board settings.
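As a rough illustration of the .bytes-to-image idea, here is a minimal sketch in plain Python. It is not the code in byte_to_image. It assumes each line of a .bytes file starts with an address followed by two-character hex bytes, maps the `??` placeholder to 0, lays the bytes out on a square grid, and resizes with nearest-neighbour sampling. The function names `parse_bytes_text` and `bytes_to_image` and the padding and resizing choices are all hypothetical.

```python
import math


def parse_bytes_text(text):
    """Parse hex-dump text into a list of integers in the range 0..255."""
    values = []
    for line in text.splitlines():
        tokens = line.split()
        # The first token on each line is the address; the rest are hex bytes.
        for tok in tokens[1:]:
            # "??" marks unknown content; mapping it to 0 is an assumption.
            values.append(0 if tok == "??" else int(tok, 16))
    return values


def bytes_to_image(values, size=256):
    """Return a size x size grayscale image as a list of rows of ints."""
    if not values:
        return [[0] * size for _ in range(size)]
    # Smallest square grid that holds every byte, zero-padded at the end.
    side = math.ceil(math.sqrt(len(values)))
    padded = values + [0] * (side * side - len(values))
    image = []
    for r in range(size):
        src_r = r * side // size  # nearest-neighbour row lookup
        row = [padded[src_r * side + (c * side // size)] for c in range(size)]
        image.append(row)
    return image


# Tiny made-up sample in the assumed format (address, then hex bytes).
sample = "00401000 56 8D ?? 24\n00401004 08 50 8B F1\n"
pixels = bytes_to_image(parse_bytes_text(sample), size=4)
```

In practice a library such as NumPy or Pillow would handle the reshaping and resizing more efficiently; the pure-Python version is only meant to make the steps explicit.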

The code covers two kinds of models: original models, some with small modifications so that they fit our pipeline, and improved models, which are better in statistical, resource, and time measures. Improved models are mostly identified by their titles. The models in our study include ResNet-101, ResNet-50, ResNet-18, MobileNetV2, MobileViT, and SqueezeNet, along with their improved and adapted variants. The repository also contains several implementations of our Hierarchical model, which we refer to as the pipeline. These use the models listed above to achieve the best runtime and resource utilization while maintaining accuracy.

Note: We recommend using CUDA to speed up training. Without it, training may take significantly longer.

Files with the word "pipeline" in their names contain different implementations of our proposed HierarchicalCloudDNN model. Files with the word "board" in their names work with the smaller dataset we used to obtain some of our results on the NVIDIA Jetson Nano board. There are also two kinds of scripts that reorder and split our data (both the main dataset and the smaller on-board dataset) into the proper sets and associate them with .csv files for later use.
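The split-and-index step could look roughly like the sketch below. It is an illustration under assumptions, not the repository's script: the 80/20 ratio, the per-class (stratified) split, the `file_name` and `label` column names, and the example file names are all hypothetical.

```python
import csv
import io
import random
from collections import defaultdict


def stratified_split(samples, train_frac=0.8, seed=0):
    """Split (file_name, label) pairs into train and test, class by class."""
    by_label = defaultdict(list)
    for name, label in samples:
        by_label[label].append(name)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        names = sorted(by_label[label])
        rng.shuffle(names)
        cut = int(len(names) * train_frac)
        train += [(n, label) for n in names[:cut]]
        test += [(n, label) for n in names[cut:]]
    return train, test


def write_csv(rows, handle):
    """Write the (file_name, label) rows with a header line."""
    writer = csv.writer(handle)
    writer.writerow(["file_name", "label"])
    writer.writerows(rows)


# Made-up index: 9 classes with 10 images each.
samples = [(f"img_{label}_{i}.png", label) for label in range(1, 10) for i in range(10)]
train_rows, test_rows = stratified_split(samples)

# An in-memory buffer stands in for a real .csv file here.
buffer = io.StringIO()
write_csv(train_rows, buffer)
```

Splitting per class keeps the class proportions similar across the sets, which tends to matter when some malware families have far fewer samples than others.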