My dissertation for my undergraduate degree. Research into Deep Learning Malware Detection by representing files as images, and into how to create Adversarial Variants of Malware. The research looks into classifying files by breaking down the binary structure into individual sections and evaluating these sections in different ways. The Adversarial Variants are created using File Infection techniques.
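To make the idea of representing a file as an image concrete, here is a minimal sketch that treats each byte of a file as one grayscale pixel. It is not the code used in the dissertation: the function names `bytes_to_rows` and `write_pgm`, the fixed width of 256 pixels, the zero padding of the last row, and the PGM output format are all assumptions chosen to keep the example dependency-free.

```python
import math


def bytes_to_rows(data: bytes, width: int = 256) -> list:
    """Split raw bytes into rows of `width` pixels (one byte = one pixel).

    Assumption: the final row is padded with zero bytes so every row
    has the same length. The project itself may handle this differently.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # At least one row, even for an empty input.
    height = max(1, math.ceil(len(data) / width))
    padded = data.ljust(width * height, b"\x00")
    return [padded[i * width:(i + 1) * width] for i in range(height)]


def write_pgm(rows: list, path: str) -> None:
    """Write rows of bytes as a binary PGM (P5) 8-bit grayscale image."""
    width, height = len(rows[0]), len(rows)
    with open(path, "wb") as fh:
        fh.write(f"P5\n{width} {height}\n255\n".encode("ascii"))
        for row in rows:
            fh.write(row)
```

With a hypothetical input file, `write_pgm(bytes_to_rows(open("sample.exe", "rb").read()), "sample.pgm")` would produce an image that Pillow can open. The same approach could be applied to the bytes of a single section rather than the whole file.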
The provided code requires Python 2 and Python 3. It has been tested with Python 3.6.9 and Python 2.7.17. Python 2 is used for generating Adversarial Variants with the files fileInjection.py and sectionInjection.py. Python 3 is used for all other files.
- pefile==2019.4.18
- Keras==2.3.1
- scipy==1.4.1
- numpy==1.18.3
- Pillow==7.1.1
- statistics==1.0.3.5
- sklearn==0.0
- tensorflow==1.14.0
- matplotlib==3.2.1
The Appendix in the report contains more information on how each file can be used.
python3 src/fileClassifier.py -g data/goodwareFileImages/ -m data/malwareFileImages/ -a data/adversarialFileImages/ -e 20
python3 src/sectionClassifier.py -g data/goodwareSectionImages/ -m data/malwareSectionImages/ -a data/adversarialSectionImages/ -e 20
Two scripts, createsectiondata and createfiledata, help to create a new dataset. This first requires gathering your own goodware and malware.
# create directories for the files
mkdir /path/goodwareFiles
mkdir /path/malwareFiles
# copy the dataset you are using into goodwareFiles and malwareFiles before carrying on
# copy the malware into adversarialFiles; note that these copies will be altered
cp -R /path/malwareFiles /path/adversarialFiles
Then make the following changes to the scripts. The datadir variable needs to contain the path to the directories just created. The projectpath variable needs to contain the path to the project root. The infectionFilename variable needs to be the name of the file that will be used to infect the Malware to create Adversarial Variants.
datadir="/path/"
projectpath="/home/user/malwaredetection/"
infectionFilename="example.exe"