MalwareDetector

Malware Detection Model trained to detect various types of malware

Primary Language: Python

Dynamic Analysis

Feature Selection and Extraction

Part of our code extracts several features from the files, such as API calls and Descriptions, together with their counts across all files taken. We used this to get a general insight into the data. After some analysis, we decided to tokenise all our files (Malware and Benign), with the Description strings forming the columns, to generate a CSV file (Description.csv). The columns are the set of all Description strings obtained from Dynamic Analysis in sets 2 and 3 of the Benign and Malware files. Numbers are excluded from the strings first; for example, "58 Antiviruses" becomes "Antiviruses".
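The tokenisation step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names (`normalise`, `build_matrix`, `write_csv`), the `file` and `label` columns, and the choice to store per-file counts in each cell are all assumptions made for the example.

```python
import csv
import re
from collections import Counter


def normalise(description):
    """Remove digits and collapse whitespace: '58 Antiviruses' -> 'Antiviruses'."""
    without_digits = re.sub(r"\d+", "", description)
    return " ".join(without_digits.split())


def build_matrix(reports, labels):
    """Turn per-file Description strings into a table of counts.

    reports: dict mapping file name -> list of Description strings (assumed input shape)
    labels:  dict mapping file name -> 1 for Malware, 0 for Benign (assumed encoding)
    Returns (vocabulary, rows), where each row is [file, label, count, count, ...].
    """
    counts = {
        name: Counter(n for n in map(normalise, descriptions) if n)
        for name, descriptions in reports.items()
    }
    # The columns are the set of all normalised Description strings seen in any file.
    vocabulary = sorted({d for counter in counts.values() for d in counter})
    rows = [
        [name, labels[name]] + [counts[name][d] for d in vocabulary]
        for name in reports
    ]
    return vocabulary, rows


def write_csv(path, vocabulary, rows):
    """Write the table with a header row of file, label, then one column per Description."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file", "label"] + vocabulary)
        writer.writerows(rows)
```

Calling `write_csv("Description.csv", *build_matrix(reports, labels))` would then produce a file in the layout assumed here. Stripping digits before building the vocabulary keeps strings such as "58 Antiviruses" and "3 Antiviruses" in a single column.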

Training and Output

The model is an XGBClassifier that processes Description.csv and produces the final output CSV file, Predictions.csv.
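A rough sketch of this training and output step is given below. It assumes Description.csv has a `file` column, a `label` column (1 for Malware, 0 for Benign), and one numeric column per Description string; the README does not specify this layout, the hyperparameters, or which files are predicted on, so those details and the helper names are hypothetical. For illustration only, `main` predicts on the same rows it trained on.

```python
import csv


def parse_description_rows(lines):
    """Parse CSV lines laid out as file,label,<feature columns...> (assumed layout)."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    names, labels, features = [], [], []
    for row in reader:
        names.append(row[0])
        labels.append(int(row[1]))
        features.append([float(value) for value in row[2:]])
    return names, labels, features


def train_and_predict(features, labels, test_features):
    """Fit an XGBClassifier and return predicted labels as a plain list."""
    # Imported here so the CSV helpers can be used without xgboost installed.
    import numpy as np
    from xgboost import XGBClassifier

    # Hyperparameters are illustrative, not taken from the repository.
    model = XGBClassifier(n_estimators=100, max_depth=4, learning_rate=0.1)
    model.fit(np.asarray(features), np.asarray(labels))
    return model.predict(np.asarray(test_features)).tolist()


def prediction_rows(names, predictions):
    """Build the rows of Predictions.csv, header first."""
    return [["file", "prediction"]] + [list(pair) for pair in zip(names, predictions)]


def main(train_path="Description.csv", out_path="Predictions.csv"):
    with open(train_path, newline="", encoding="utf-8") as fh:
        names, labels, features = parse_description_rows(fh)
    # Predicting on the training rows only demonstrates the data flow.
    predictions = train_and_predict(features, labels, features)
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows(prediction_rows(names, predictions))
```

Running `main()` in a directory containing Description.csv would write Predictions.csv next to it. In practice the predictions would be made on files held out from training.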