This repository explores feature selection methods for optimizing machine learning models that detect malicious activity in IoT datasets. The project supports work that is currently under review at a high-quality journal; more details will follow once the review process is complete.
This project aims to stimulate research in machine learning, particularly in feature selection, feature extraction, and feature engineering. The dataset explored is IoT-23: https://www.stratosphereips.org/datasets-iot23. The first step is to apply a feature selection method such as the chi-square test to score how important each feature is for predicting the target. Once the scores have been computed, only the best-scoring features are selected for training a machine learning model.
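The scoring and selection step described above can be sketched in plain Python. This is a minimal illustration, not the code in this repository: the feature names and the toy rows are hypothetical, and the features are assumed to be non-negative (counts or binary flags), which is what the chi-square score requires. In practice, scikit-learn's `SelectKBest` with `chi2` is a common way to do the same thing.

```python
def chi2_scores(X, y):
    """Chi-square score of each feature column in X against the labels y.

    X is a list of rows with non-negative values; y is a list of class labels.
    """
    n_samples = len(X)
    n_features = len(X[0])
    classes = sorted(set(y))

    # Share of samples in each class and total "mass" of each feature.
    class_prob = {c: sum(1 for label in y if label == c) / n_samples for c in classes}
    feature_total = [sum(row[j] for row in X) for j in range(n_features)]

    # Observed sum of each feature within each class.
    observed = {c: [0.0] * n_features for c in classes}
    for row, label in zip(X, y):
        for j, value in enumerate(row):
            observed[label][j] += value

    scores = []
    for j in range(n_features):
        score = 0.0
        for c in classes:
            # Expected sum if the feature were independent of the class.
            expected = class_prob[c] * feature_total[j]
            if expected > 0:
                score += (observed[c][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_k_best(scores, k):
    """Return the column indices of the k highest-scoring features."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy data: three binary features, two classes.
feature_names = ["flag_a", "flag_b", "flag_c"]  # illustrative names only
X = [
    [1, 1, 1],
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
]
y = ["malicious", "malicious", "benign", "benign"]

scores = chi2_scores(X, y)
selected = select_k_best(scores, k=1)
print(dict(zip(feature_names, scores)))
print([feature_names[j] for j in selected])
```

Here only the first column varies with the label, so it receives the only non-zero score and is the one kept.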
Several machine learning algorithms can be trained on the selected features, such as Decision Tree, Random Forest, Naive Bayes, and SVM. The challenge is working out the criteria for selecting the best set of features for training. This project comes with two Python scripts: Dataset-Filter.py and K-Means-Clustering.py.
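One possible criterion, shown here only as a sketch, is to try the top-k ranked features for increasing k and keep the k that gives the best accuracy on held-out data. The example below does not reproduce Dataset-Filter.py or K-Means-Clustering.py. It uses a simple nearest-centroid classifier as a stand-in for the algorithms listed above, and the ranking, data, and function names are all hypothetical.

```python
def centroid_fit(X, y, cols):
    """Compute the mean of the chosen columns for each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for i, j in enumerate(cols):
            acc[i] += row[j]
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def centroid_predict(centroids, row, cols):
    """Predict the class whose centroid is closest (squared Euclidean distance)."""
    def dist(center):
        return sum((row[j] - center[i]) ** 2 for i, j in enumerate(cols))
    # Sorting the labels makes tie-breaking deterministic.
    return min(sorted(centroids), key=lambda label: dist(centroids[label]))


def accuracy(centroids, X, y, cols):
    """Fraction of rows in X whose predicted label matches y."""
    hits = sum(centroid_predict(centroids, row, cols) == label for row, label in zip(X, y))
    return hits / len(y)


def best_k(ranked, X_train, y_train, X_val, y_val):
    """Return (k, accuracy) for the smallest k with the highest validation accuracy."""
    best = (0, -1.0)
    for k in range(1, len(ranked) + 1):
        cols = ranked[:k]
        model = centroid_fit(X_train, y_train, cols)
        acc = accuracy(model, X_val, y_val, cols)
        if acc > best[1]:  # strict '>' keeps the smaller k on ties
            best = (k, acc)
    return best


# Hypothetical data: column 0 is informative, column 1 misleads on the validation split.
ranked = [0, 1]  # assumed output of a feature-scoring step, best feature first
X_train = [[1, 9], [1, 7], [0, 1], [0, 3]]
y_train = ["malicious", "malicious", "benign", "benign"]
X_val = [[1, 0], [0, 10]]
y_val = ["malicious", "benign"]

k, acc = best_k(ranked, X_train, y_train, X_val, y_val)
print(k, acc)
```

On this toy split, adding the second column lowers validation accuracy, so the search settles on a single feature. With real data, cross-validation would give a more reliable estimate than a single split.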
A link to SPMF, an open-source data mining library, is provided for cases where data mining approaches are required: https://www.philippe-fournier-viger.com/spmf/