/deep-learning-malware-detection

Final year project for my BSc Computer Science degree, on using Deep Learning for classifying Malware and creating Adversarial Examples.


Deep Learning Image-based Malware Detection and Adversarial Variants

My dissertation for my undergraduate degree. It researches Deep Learning Malware Detection by representing files as images, and how to create Adversarial Variants of Malware. The research looks into classifying files by breaking down the binary structure into individual sections and evaluating these in unique ways. The Adversarial Variants are created using File Infection techniques.
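To illustrate the idea of representing a file as an image, here is a minimal sketch that maps each byte of a file to one grayscale pixel. It is not the project's own conversion code: the function names, the fixed row width of 256, the zero padding of the last row and the PGM output format are all assumptions made for this example, and it uses only the standard library rather than the Pillow and numpy packages listed in the requirements.

```python
import math


def bytes_to_grayscale_rows(data: bytes, width: int = 256) -> list:
    """Map each byte (0-255) to one grayscale pixel, row by row.

    Hypothetical helper: the width and the zero padding are assumptions,
    not the layout used by the project's scripts.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Zero-pad the tail so the final row is complete.
    padded = data.ljust(height * width, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def write_pgm(rows: list, path: str) -> None:
    """Save the rows as a binary PGM (P5) grayscale image."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    with open(path, "wb") as fh:
        fh.write(f"P5\n{width} {height}\n255\n".encode("ascii"))
        for row in rows:
            fh.write(bytes(row))
```

For example, `write_pgm(bytes_to_grayscale_rows(open("example.exe", "rb").read()), "example.pgm")` would produce an image that most image viewers can open. The same approach could be applied to the bytes of a single section instead of the whole file.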

Requirements

The provided code requires Python 2 and Python 3. It has been tested with Python 3.6.9 and Python 2.7.17. Python 2 is used for generating Adversarial Variants with the files fileInjection.py and sectionInjection.py. Python 3 is used for all other files.

Python 2

  • pefile==2019.4.18

Python 3

  • pefile==2019.4.18
  • Keras==2.3.1
  • scipy==1.4.1
  • numpy==1.18.3
  • Pillow==7.1.1
  • statistics==1.0.3.5
  • sklearn==0.0
  • tensorflow==1.14.0
  • matplotlib==3.2.1

Usage

The Appendix in the report contains more information on how each file can be used.

File Classifier

python3 src/fileClassifier.py -g data/goodwareFileImages/ -m data/malwareFileImages/ -a data/adversarialFileImages/ -e 20

Section Classifier

python3 src/sectionClassifier.py -g data/goodwareSectionImages/ -m data/malwareSectionImages/ -a data/adversarialSectionImages/ -e 20

Creating a new dataset

Two scripts, createsectiondata and createfiledata, help to create a new dataset. Before running them, you need to gather your own goodware and malware.

# create directories for the files
mkdir /path/goodwareFiles
mkdir /path/malwareFiles
# copy the dataset you are using into goodwareFiles and
# malwareFiles before carrying on

# copy malware into adversarial variants, note that the
# adversarial variants will be altered.
cp -R /path/malwareFiles /path/adversarialFiles

Then make the following changes to the scripts. Set datadir to the path that contains the directories just created, and set projectpath to the path of the project root. Set infectionFilename to the name of the file that will be used to infect the Malware to create Adversarial Variants.

datadir="/path/"
projectpath="/home/user/malwaredetection/"
infectionFilename="example.exe"
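Before running the dataset scripts, it can help to confirm that datadir points at the expected layout. The sketch below is a hypothetical helper, not part of the project: it assumes the three directory names used in the setup commands above and reports any that are missing or empty.

```python
from pathlib import Path

# Directory names taken from the setup commands in this README.
EXPECTED_DIRS = ("goodwareFiles", "malwareFiles", "adversarialFiles")


def missing_dataset_dirs(datadir: str) -> list:
    """Return the expected subdirectories that are missing or empty.

    Hypothetical check: the project's scripts may need more than this.
    """
    base = Path(datadir)
    problems = []
    for name in EXPECTED_DIRS:
        folder = base / name
        # A directory counts as a problem if it is absent or has no entries.
        if not folder.is_dir() or not any(folder.iterdir()):
            problems.append(name)
    return problems
```

Calling `missing_dataset_dirs("/path/")` returns an empty list when all three directories exist and contain at least one entry.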