/Anomaly_Detection

At Infosys Springboard, I worked on a project focused on unsupervised anomaly detection on healthcare provider data. I implemented three machine learning algorithms (Isolation Forest, Elliptic Envelope, and One-Class SVM) as well as a deep learning approach using autoencoders. Additionally, I conducted individual SHAP analysis.

Primary language: Jupyter Notebook

Anomaly_Detection

This is an anomaly detection project that I completed during my internship at Infosys Springboard.

Milestone 1: EDA & Pre-processing

  • Performed basic pre-processing of the dataset, such as handling missing values by replacing them using different imputation techniques.
  • Created visualizations, including univariate and bivariate analysis, to gain better insight into the data.
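
As an illustration of the missing-value handling described above, here is a minimal sketch with pandas. It is not the notebook's actual code: the column names, the toy values, and the choice of median (numeric) and mode (categorical) imputation are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def impute_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Fill numeric gaps with the column median and other gaps with the mode."""
    out = df.copy()
    for col in out.columns:
        if out[col].isna().sum() == 0:
            continue  # nothing to fill in this column
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            # mode() can return several values; take the first one
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


# Hypothetical toy data, only to show the call
toy = pd.DataFrame({
    "claim_amount": [120.0, np.nan, 95.0, 4000.0],
    "provider_type": ["clinic", "hospital", None, "clinic"],
})
clean = impute_missing(toy)
print(clean)
```

The median is used here because it is less sensitive than the mean to extreme values, which matter in an anomaly detection setting. A column that is entirely empty would need separate handling.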

Milestone 2: Clustering and analysis

  • Used K-means and DBSCAN clustering to identify distinct groups (clusters) in the data.
  • Used several techniques, such as the elbow plot and variance calculation, to find the optimal number of clusters.
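
The clustering step above can be sketched roughly as follows with scikit-learn. The data is synthetic (`make_blobs`), and the range of `k`, `eps`, and `min_samples` are illustrative assumptions, not the values used in the notebook.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (assumption)
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)
X = StandardScaler().fit_transform(X)

# Elbow method: record the within-cluster sum of squares (inertia) for each k
inertias = {}
for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_

for k, value in inertias.items():
    print(f"k={k}: inertia={value:.1f}")

# DBSCAN needs no cluster count; points labelled -1 are treated as noise
db = DBSCAN(eps=0.3, min_samples=5).fit(X)
n_clusters = len(set(db.labels_) - {-1})
n_noise = int(np.sum(db.labels_ == -1))
print(f"DBSCAN found {n_clusters} clusters and {n_noise} noise points")
```

Plotting `inertias` against `k` gives the elbow plot: the point where the curve stops dropping sharply is a reasonable candidate for the number of clusters.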

Milestone 3: Machine Learning Models

  • Performed feature engineering before applying the ML models.
  • Encoded categorical columns with only 2 distinct values using binary encoding, and high-cardinality columns using frequency encoding, to avoid the sparse data that other techniques (such as one-hot encoding) can produce.
  • Used StandardScaler to scale all values.
  • Machine learning models:
    • 1. Isolation Forest
    • 2. Elliptic Envelope
    • 3. One-Class SVM
  • Created visualizations for each machine learning model: scatter plots of numerical columns to show anomalous versus normal points, and bar plots and pie charts for categorical columns.
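
To make the encoding, scaling, and model steps above concrete, here is a minimal sketch with scikit-learn. Everything about the data is hypothetical (column names, the five injected extreme rows), and the `contamination`, `nu`, and `support_fraction` values are assumptions chosen for the toy example rather than the settings used in the project.

```python
import numpy as np
import pandas as pd
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
n = 200
# Hypothetical provider table
df = pd.DataFrame({
    "gender": rng.choice(["M", "F"], size=n),  # two distinct values
    "specialty": rng.choice(["cardio", "ortho", "derm", "gp"],
                            size=n, p=[0.1, 0.2, 0.2, 0.5]),
    "claim_amount": rng.normal(100.0, 15.0, size=n),
})
df.loc[:4, "claim_amount"] = 1000.0  # rows 0..4: injected extreme values

features = pd.DataFrame(index=df.index)
# Binary encoding for the two-valued column
features["gender"] = (df["gender"] == "M").astype(int)
# Frequency encoding: replace each category with its relative frequency
freq = df["specialty"].value_counts(normalize=True)
features["specialty"] = df["specialty"].map(freq)
features["claim_amount"] = df["claim_amount"]

# Standardize every column to zero mean and unit variance
X = StandardScaler().fit_transform(features)

models = {
    "isolation_forest": IsolationForest(contamination=0.05, random_state=0),
    # support_fraction=0.9 keeps the robust covariance fit away from
    # subsets in which a discrete column is constant (toy-data assumption)
    "elliptic_envelope": EllipticEnvelope(contamination=0.05,
                                          support_fraction=0.9,
                                          random_state=0),
    "one_class_svm": OneClassSVM(nu=0.05, kernel="rbf", gamma="scale"),
}

flags = {}
for name, model in models.items():
    labels = model.fit_predict(X)  # +1 = normal, -1 = anomaly
    flags[name] = labels == -1
    print(f"{name}: flagged {int(flags[name].sum())} of {n} rows")
```

Frequency encoding keeps one column per categorical feature, so the feature matrix stays dense even when a column has many distinct values. The boolean arrays in `flags` can then be used to colour scatter plots or to group bar plots by normal versus anomalous rows.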

Milestone 4: Deep Learning Autoencoders

  • Used a simple feed-forward neural network (autoencoder) to detect anomalies.
  • The network consists of 5 layers.
  • Used reconstruction error as the metric to separate normal from anomalous entries based on a threshold.
  • Created visualizations for the deep learning model: scatter plots of numerical columns to show anomalous versus normal points, and bar plots and pie charts for categorical columns.
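
The reconstruction-error idea above can be sketched as follows. The deep learning framework used in the notebook is not stated here, so this example swaps in scikit-learn's `MLPRegressor`, trained to reproduce its own input, as a stand-in autoencoder. The layer widths (8, 2, 8), the synthetic data, and the 95th-percentile threshold are all assumptions for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic data (assumption): 6 features driven by 2 latent factors
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
X_normal = latent @ mixing + 0.05 * rng.normal(size=(500, 6))
# Ten rows that ignore the correlation structure of the normal data
X_anom = rng.normal(scale=4.0, size=(10, 6))
X = StandardScaler().fit_transform(np.vstack([X_normal, X_anom]))

# Input + three hidden layers + output = 5 layers, with a 2-unit bottleneck
autoencoder = MLPRegressor(hidden_layer_sizes=(8, 2, 8), activation="tanh",
                           max_iter=2000, random_state=0)
autoencoder.fit(X, X)  # the target is the input itself

# Per-row mean squared reconstruction error
recon = autoencoder.predict(X)
errors = np.mean((X - recon) ** 2, axis=1)

# Hypothetical threshold: rows above the 95th percentile count as anomalies
threshold = np.percentile(errors, 95)
is_anomaly = errors > threshold
print(f"threshold={threshold:.4f}, flagged={int(is_anomaly.sum())}")
```

Rows that do not follow the patterns the network has learned are reconstructed poorly, so their error ends up above the threshold. The percentile is a tuning choice: a lower value flags more rows, a higher one fewer.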