Download the malware and normal datasets using:
cd downloads
python wgetall.py
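Unzip all files inside the "./downloads/malware_data" directory with the password 'infected' (this path, like the ones below, appears to be relative to the repository root, so return there first):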
cd ./downloads/malware_data
unzip -P infected \*.zip
Next, remove everything except the pcap files from the "./downloads/malware_data" directory:
cd ./downloads/malware_data
find . -not -name '*.pcap' -delete
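Because the find command above deletes without asking, it can be worth previewing what would go first. The following is a minimal sketch, not part of this repository: the helper name `non_pcap_files` is made up for illustration, and it lists regular files only, whereas find also considers directories.

```python
from pathlib import Path


def non_pcap_files(root):
    """Return every regular file under root whose name does not end in '.pcap'.

    Hypothetical helper for previewing a cleanup; it deletes nothing.
    """
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and not p.name.endswith(".pcap")
    )


if __name__ == "__main__":
    # Print the files that are NOT packet captures, one per line.
    for path in non_pcap_files("./downloads/malware_data"):
        print(path)
```

If the printed list contains anything you want to keep, move it elsewhere before running the find command.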
Move all of the malware pcap files into the "./malware_pcap/train_source/" directory
and all of the normal pcap files into the "./normal_pcap/train/" directory.
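The move can also be scripted. Below is a rough sketch under a few assumptions: the function name `move_pcaps` is hypothetical, the pcap files may sit in nested sub-directories after unzipping, and no two captures are expected to share a file name (the sketch stops rather than overwrite if they do).

```python
import shutil
from pathlib import Path


def move_pcaps(src_dir, dest_dir):
    """Move every .pcap file found under src_dir (recursively) into dest_dir.

    Hypothetical helper. Returns the list of destination paths and raises
    FileExistsError instead of overwriting an existing file.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materialises the listing first, so moving files
    # does not disturb the directory walk.
    for pcap in sorted(Path(src_dir).rglob("*.pcap")):
        if not pcap.is_file():
            continue
        target = dest / pcap.name
        if target.exists():
            raise FileExistsError(f"{target} already exists; refusing to overwrite")
        shutil.move(str(pcap), str(target))
        moved.append(target)
    return moved
```

For the malware set this would be called as `move_pcaps("./downloads/malware_data", "./malware_pcap/train_source")`. The location of the downloaded normal pcap files is not stated here, so the source directory for that call has to be filled in by hand.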
Now, to parse HTTP headers from the pcap files, run:
./extract_http.sh
If you want to parse TCP headers instead of HTTP headers, run this instead:
./extract_tcp.sh
Use PCA to reduce the dimensionality of the initial payloads:
python ./visual/pca.py ./dataset/train
It will produce a pickle file.
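For readers unfamiliar with the PCA step, here is a minimal sketch of the general technique using NumPy's SVD. It is not the code in ./visual/pca.py: the function names, the number of components, and the layout of the pickled dictionary are all assumptions made for illustration.

```python
import pickle

import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top n_components principal axes.

    Illustrative only; returns (reduced, components, mean).
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, components, mean


def save_reduced(path, reduced, components, mean):
    """Pickle the projection; the dictionary keys are an assumed layout."""
    with open(path, "wb") as fh:
        pickle.dump({"reduced": reduced, "components": components, "mean": mean}, fh)
```

Each payload would be one row of `X` (for example, its first bytes as integers), and `pca_reduce(X, 2)` would give a two-column array suitable for a scatter plot.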
Finally, run the deep neural network model:
python simple_dnn.py