Primary language: C#. License: Apache License 2.0 (Apache-2.0).

Malware Prediction

This project competed in the following Kaggle competition: https://www.kaggle.com/c/microsoft-malware-prediction

The project extracts features from the user's computer environment and then uses machine learning algorithms to estimate the probability that malware is present.


Running the C# program extracts the data from a Windows 10 environment. It works only on Windows 10, not on Windows 7, Windows 8, or other versions.

Of course, the dataset also contains Windows 7 and Windows 8 machines, but I have not found an extraction method for them yet.
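
The Windows 10 restriction above could be guarded with a check like the following. This is only a minimal Python sketch, not code from this repository: the function names are hypothetical, and it assumes the strings reported by Python's standard platform module are a good enough signal.

```python
import platform


def is_supported_windows(system: str, release: str) -> bool:
    """Return True only for Windows 10, the one environment the extractor targets."""
    return system == "Windows" and release == "10"


def current_platform_supported() -> bool:
    """Apply the check to the machine this script is running on."""
    return is_supported_windows(platform.system(), platform.release())


if __name__ == "__main__":
    if not current_platform_supported():
        print("Feature extraction is only expected to work on Windows 10.")
```

Keeping the comparison in a function that takes plain strings makes it easy to try on any machine, not only on Windows.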

Because the dataset is old, some values extracted from the latest Windows versions are often not suitable for machine learning.

Please keep in mind that such values are forcibly modified when they are written to the CSV file.
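
This README does not describe how the values are modified. One possible approach, shown here only as a hedged Python sketch, is to replace any value that the training data never contained with a fallback value before writing the row. The column name, the known values, the fallback, and the function names are all illustrative assumptions, not the repository's actual rules.

```python
import csv
import io

# Hypothetical: values assumed to occur in the training data, per column.
KNOWN_VALUES = {"OsBuild": {"16299", "17134"}}
# Hypothetical: fallback written when an extracted value is not in KNOWN_VALUES.
FALLBACK = {"OsBuild": "17134"}


def normalize_row(row):
    """Return a copy of row with unknown values replaced by the fallback."""
    fixed = dict(row)
    for column, known in KNOWN_VALUES.items():
        if column in fixed and fixed[column] not in known:
            fixed[column] = FALLBACK[column]
    return fixed


def write_rows(rows, out):
    """Normalize every row, then write all rows as CSV to a file-like object."""
    normalized = [normalize_row(r) for r in rows]
    writer = csv.DictWriter(out, fieldnames=list(normalized[0].keys()))
    writer.writeheader()
    writer.writerows(normalized)


# Example: a build number from a newer Windows version is replaced on write.
buffer = io.StringIO()
write_rows([{"OsBuild": "22631"}], buffer)
```

Doing the replacement in one place, just before the write, keeps the raw extracted value available to the rest of the program.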


After execution, a directory is created at C:\MalwarePrediction.

Before running lightgbm.py, you must download train.csv from https://www.kaggle.com/c/microsoft-malware-prediction/data

and place it in the same directory as lightgbm.py.

lightgbm.py is based on https://www.kaggle.com/itamargr/lightgbm-full-data


The dataset features and the extraction methods used are summarized in Feature_List.xlsx.


You will often see Korean in the Excel file and in the code. I will fix this later.

ReadMe.txt will also be revised later for better readability.


In fact, most of the census data can be extracted through 'Diagnostic Data Viewer' and 'DeviceCensus'.


////////////////////////////
comment
////////////////////////////

There is a separate prototype product, which is shown in the video. The code in this repository is sparse at the moment because we are refactoring it.

I am currently studying XAML, so I will pull out the GUI part and its code and fix them after learning a bit more.



+ I took the video down because my personal information was in it.