An Android malware detection system built in Python that classifies a given Android app as malware or benign.
The system performs static analysis on the app's Manifest XML file and reports the result. The analysis relies on a pre-generated data file containing frequent patterns from 35000 malware applications and 70000 benign applications. This data file was generated by running the Apriori algorithm on the XML components of these applications.
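The pattern-mining step can be illustrated with a minimal sketch. It assumes the manifest has already been decoded to plain-text XML (manifests inside an APK are stored as binary XML), and that the "XML components" are permissions, features, and intent actions. The function names `manifest_items` and `apriori`, the choice of tags, and the support threshold are all hypothetical and are not taken from this project's code.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

# Namespace prefix that ElementTree uses for android:* attributes.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Assumption: these tags are the "XML components" of interest.
TAGS = ("uses-permission", "uses-feature", "action")


def manifest_items(xml_text):
    """Return the set of 'tag:name' items declared in a decoded manifest."""
    root = ET.fromstring(xml_text)
    items = set()
    for elem in root.iter():
        name = elem.get(ANDROID_NS + "name")
        if name and elem.tag in TAGS:
            items.add(f"{elem.tag}:{name}")
    return items


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    transactions is a list of sets; support is the fraction of transactions
    that contain the itemset.
    """
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    singles = {frozenset([item]) for t in transactions for item in t}
    current = {s for s in singles if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent
```

Running `apriori` once over the malware manifests and once over the benign manifests would give two pattern sets of the kind the data file is described as holding. At the scale of 35000 and 70000 applications, a dedicated library implementation would likely be a better fit than this brute-force support counting.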
This analysis generates the features of the input application. The features are similarity scores, calculated by comparing the input application's Manifest XML file with the frequent patterns in the data file.
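The exact similarity measure is not specified here, so the following is only one plausible sketch: it assumes a score is the fraction of frequent patterns that are fully contained in the app's set of manifest items, computed separately against the malware patterns and the benign patterns. The names `similarity_score` and `feature_vector` are hypothetical.

```python
def similarity_score(app_items, patterns):
    """Fraction of frequent patterns fully contained in the app's item set.

    Assumption: a pattern "matches" when all of its items appear in the app.
    """
    if not patterns:
        return 0.0
    matched = sum(1 for pattern in patterns if pattern <= app_items)
    return matched / len(patterns)


def feature_vector(app_items, malware_patterns, benign_patterns):
    """Build a two-element feature vector: [malware score, benign score]."""
    return [
        similarity_score(app_items, malware_patterns),
        similarity_score(app_items, benign_patterns),
    ]
```

An app whose manifest matches many malware patterns and few benign patterns would get a vector close to `[1.0, 0.0]`. The real system may well use more features, for example one score per component type.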
The system then uses a trained machine learning model to classify the application. The model was trained with the Random Forest algorithm on 15000 benign applications and 8500 malware applications. Similarity scores were calculated for all 23500 applications, and the resulting dataset was used to train the classifier.
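As a rough illustration of the training and classification step, the sketch below fits a scikit-learn `RandomForestClassifier` on a tiny hand-made dataset of similarity scores. The use of scikit-learn, the two-feature layout, the toy values, the hyperparameters, and the label encoding (1 for malware, 0 for benign) are all assumptions made for this example and are not the project's actual data or settings.

```python
from sklearn.ensemble import RandomForestClassifier

# Toy feature vectors: [malware similarity score, benign similarity score].
# These values are invented for illustration only.
X_train = [
    [0.90, 0.10], [0.80, 0.20], [0.95, 0.05], [0.85, 0.15],  # malware
    [0.10, 0.80], [0.20, 0.90], [0.05, 0.70], [0.15, 0.85],  # benign
]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]  # assumed encoding: 1 = malware, 0 = benign

# Fixed random_state so the sketch behaves the same on every run.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)


def classify(features):
    """Return 'malware' or 'benign' for one feature vector."""
    label = int(clf.predict([features])[0])
    return "malware" if label == 1 else "benign"
```

In practice the trained model would be saved after training and loaded at classification time, so that a new app only needs its similarity scores computed before calling something like `classify`.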