Behavior-based malware detection using branch data
- Copyright (c) 2019 YoungJoong Kim. tf-branch-malware is licensed under the MIT license.
- This repository provides proof-of-concept (PoC) code for branch-based malware detection.
Tested environments
- Windows 10
- Python 3.7
Tested requirements
- numpy==1.16.2
- SQLAlchemy==1.3.1
- tensorflow-gpu==1.14.0
Download the sample log files from the releases page and unzip them.
Move the log directory into the cloned repository.
mv log tf-branch-malware\log
Preprocess the raw log files. This step generates the data directory.
cd tf-branch-malware
python -m utils.preprocess
Train the model.
python -m classifier.train
Monitor training with TensorBoard.
tensorboard --logdir=.\summary
Run inference on the sample dataset.
python -m classifier.inference
To use the SQLite database, generate the malware DB first.
python -m utils.malwaredb --debug
Start SQLite and query the database.
sqlite3 ./data/malware.db
select name from malware where id=4;
select symbol from branch where malware_id=4;
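The same lookup can be done from Python with the standard-library `sqlite3` module. The following is a minimal sketch, not code from this repository: the helper name `symbols_for_malware` is hypothetical, and it assumes the `branch` table has the columns listed in the DB schema of this README. Note that `order` is a reserved SQL keyword, so it has to be quoted whenever it is used as a column name.

```python
import sqlite3


def symbols_for_malware(conn, malware_id):
    """Return the symbols recorded for one malware sample, sorted by branch order.

    Hypothetical helper: assumes a `branch` table with `symbol`,
    `malware_id`, and `order` columns, as described in this README.
    """
    # "order" is a reserved SQL keyword, so the column name is double-quoted.
    rows = conn.execute(
        'SELECT symbol FROM branch WHERE malware_id = ? ORDER BY "order"',
        (malware_id,),
    ).fetchall()
    # Each row is a 1-tuple; unpack it into a flat list of symbols.
    return [symbol for (symbol,) in rows]
```

With a generated database, a call might look like `symbols_for_malware(sqlite3.connect("./data/malware.db"), 4)`.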
DB schema.
Table malware
column | type | content |
---|---|---|
id | Integer | identifier, primary key |
name | String | name of the malware |
Table branch
column | type | content |
---|---|---|
id | Integer | identifier, primary key |
order | Integer | order of branch data |
src_addr | Integer | source address |
dst_addr | Integer | destination address |
dll | String | name of the dll |
symbol | String | symbol of the destination address |
malware_id | Integer | foreign key of malware.id |
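
For experimenting with the schema without generating the full database, the two tables can be approximated in an in-memory SQLite database. This is a sketch under stated assumptions: the DDL is reconstructed from the tables above rather than taken from `utils.malwaredb`, `String` is mapped to `TEXT`, and the names `build_demo_db` and `trace`, along with the inserted rows, are hypothetical illustration data.

```python
import sqlite3

# Assumed DDL, reconstructed from the schema tables in this README.
SCHEMA = """
CREATE TABLE malware (
    id   INTEGER PRIMARY KEY,
    name TEXT
);
CREATE TABLE branch (
    id         INTEGER PRIMARY KEY,
    "order"    INTEGER,  -- quoted because ORDER is a reserved SQL keyword
    src_addr   INTEGER,
    dst_addr   INTEGER,
    dll        TEXT,
    symbol     TEXT,
    malware_id INTEGER REFERENCES malware(id)
);
"""


def build_demo_db():
    """Create an in-memory database with one hypothetical sample and two branches."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Hypothetical rows, for illustration only.
    conn.execute("INSERT INTO malware (id, name) VALUES (1, 'sample.exe')")
    conn.executemany(
        'INSERT INTO branch ("order", src_addr, dst_addr, dll, symbol, malware_id) '
        "VALUES (?, ?, ?, ?, ?, ?)",
        [
            (0, 0x401000, 0x7FF81000, "kernel32.dll", "CreateFileW", 1),
            (1, 0x401020, 0x7FF82000, "kernel32.dll", "WriteFile", 1),
        ],
    )
    return conn


def trace(conn, name):
    """Return (dll, symbol) pairs for the named sample, in branch order."""
    return conn.execute(
        "SELECT b.dll, b.symbol FROM branch AS b "
        "JOIN malware AS m ON b.malware_id = m.id "
        'WHERE m.name = ? ORDER BY b."order"',
        (name,),
    ).fetchall()
```

Joining on `malware_id` lets a sample be looked up by name instead of by numeric id, which is often more convenient than the two separate queries shown earlier.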