Most cybersecurity solutions don't distinguish between everyday malware and advanced targeted attacks. Important alerts get lost in the noise of unimportant ones, which lets real attacks slip through.
The solution is divided into the following steps:
- Convert the given dataset from JSON to CSV and wrangle the data.
- Preprocess the data: engineer features from the date-time fields, handle categorical features, encode the data, and scale it.
- Use K-Means to cluster similar records and patterns, and to identify outliers in the dataset.
- Use Principal Component Analysis (PCA) to visualise the clusters.
- Use Long Short-Term Memory (LSTM) networks to identify malicious patterns in the time-series logs.
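
The preprocessing and clustering steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the record fields (`timestamp`, `protocol`, `bytes`), the sample values, and all function names are assumptions, and it uses only the Python standard library in place of whatever tooling the project relies on. It parses JSON records, derives an hour-of-day feature, one-hot encodes a categorical field, min-max scales the result, runs a small K-Means, and scores each record by its distance to the nearest centroid. The PCA and LSTM steps are not covered here.

```python
import json
import math
from datetime import datetime

# Hypothetical log records; field names and values are made up for illustration.
RAW_JSON = """[
  {"timestamp": "2021-03-01T09:00:00", "protocol": "tcp", "bytes": 500},
  {"timestamp": "2021-03-01T21:00:00", "protocol": "udp", "bytes": 200},
  {"timestamp": "2021-03-01T09:00:00", "protocol": "tcp", "bytes": 520},
  {"timestamp": "2021-03-01T10:00:00", "protocol": "tcp", "bytes": 480},
  {"timestamp": "2021-03-01T22:00:00", "protocol": "udp", "bytes": 220},
  {"timestamp": "2021-03-01T21:00:00", "protocol": "udp", "bytes": 180},
  {"timestamp": "2021-03-01T03:00:00", "protocol": "tcp", "bytes": 9000}
]"""


def to_features(record, protocols):
    """Turn one record into [hour, bytes, one-hot protocol...]."""
    ts = datetime.fromisoformat(record["timestamp"])
    one_hot = [1.0 if record["protocol"] == p else 0.0 for p in protocols]
    return [float(ts.hour), float(record["bytes"])] + one_hot


def min_max_scale(rows):
    """Scale every column to [0, 1]; constant columns become 0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def kmeans(points, k, iterations=20):
    """Plain K-Means, seeded with the first k points so runs are repeatable."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def outlier_scores(points, centroids):
    """Distance from each point to its nearest centroid; larger is more unusual."""
    return [min(math.dist(p, c) for c in centroids) for p in points]


records = json.loads(RAW_JSON)
protocols = sorted({r["protocol"] for r in records})
features = [to_features(r, protocols) for r in records]
scaled = min_max_scale(features)
centroids = kmeans(scaled, k=2)
scores = outlier_scores(scaled, centroids)

# Index of the record that sits farthest from every cluster centre.
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print(records[most_anomalous])
```

In this toy data the 3 a.m. transfer of 9000 bytes ends up farthest from both cluster centres, which is the kind of record the clustering step is meant to surface. On a real dataset a library implementation with better centroid initialisation would likely be preferable to this hand-rolled loop, and the cut-off for calling a score an outlier would need to be chosen from the data.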