/Anomaly-Detection---IF-LOF

Anomaly detection with unsupervised methods is challenging. Isolation Forest and Local Outlier Factor are two of the most promising approaches. The goal here is to detect outliers with the highest recall possible.

Primary Language: Jupyter Notebook

ANOMALY DETECTION USING ISOLATION FOREST & LOCAL OUTLIER FACTOR

According to an industry report, the fraud detection and prevention market was approximately $20 billion in 2018 and is projected to reach $63.5 billion by 2023, at a compound annual growth rate (CAGR) of 26.7%.

Major challenges in building a fraud model:

  1. Imbalanced class data: On average, only 1-2% of all transactions are identified as fraud, which is one of the biggest challenges for a data scientist trying to build a robust model that identifies fraudulent transactions.

  2. Labelled data: Data labelled as legitimate or fraud is often unavailable. Most of the time, data at the organization level is either unlabelled or only a small amount of it is labelled, and labelling it is tedious and costly.

  3. Fraudulent activities are sometimes mixed in with normal activities, which makes them hard to identify using general algorithms.

  4. Fraudulent activities (where only a few incidents have been reported) tend to change their mode of transaction and process.
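
Because labels are scarce and fraud is rare, both detectors can be fit without using labels at all. The following is a minimal sketch, not the notebook's actual code: it assumes scikit-learn is installed, uses synthetic two-dimensional data as a stand-in for real transactions, and picks a 2% anomaly rate, `n_estimators=100`, and `n_neighbors=20` purely for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(42)

# Synthetic stand-in for transaction features (assumption, not the real dataset):
# 980 "normal" points around the origin and 20 scattered "fraud" points (~2%).
n_normal, n_fraud = 980, 20
X_normal = rng.normal(loc=0.0, scale=1.0, size=(n_normal, 2))
X_fraud = rng.uniform(low=-6.0, high=6.0, size=(n_fraud, 2))
X = np.vstack([X_normal, X_fraud])

# Labels are kept only for later evaluation; neither model sees them.
y_true = np.concatenate([np.zeros(n_normal, dtype=int), np.ones(n_fraud, dtype=int)])

# Expected share of anomalies; in practice this is an estimate.
contamination = n_fraud / len(X)

# Isolation Forest: anomalies are isolated with fewer random splits.
iso = IsolationForest(n_estimators=100, contamination=contamination, random_state=42)
iso_pred = (iso.fit_predict(X) == -1).astype(int)  # -1 marks an outlier

# Local Outlier Factor: anomalies have lower local density than their neighbours.
lof = LocalOutlierFactor(n_neighbors=20, contamination=contamination)
lof_pred = (lof.fit_predict(X) == -1).astype(int)  # -1 marks an outlier

print("Isolation Forest flagged:", int(iso_pred.sum()))
print("Local Outlier Factor flagged:", int(lof_pred.sum()))
```

Both `fit_predict` calls return `-1` for points judged to be outliers and `1` for inliers, so the sketch maps them to `1` (anomaly) and `0` (normal) to line up with a typical fraud label.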


DATA :

https://www.kaggle.com/mlg-ulb/creditcardfraud

REFERENCE :

https://towardsdatascience.com/anomaly-detection-with-isolation-forest-visualization-23cd75c281e2

TARGET METRIC : RECALL %
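
Recall is the share of actual fraud cases that the model flags: TP / (TP + FN). As a small illustration with made-up labels and predictions (the values below are hypothetical, not results from this project), it can be computed like this:

```python
def recall(y_true, y_pred):
    """Return TP / (TP + FN), treating label 1 as fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # With no actual fraud cases, recall is undefined; return 0.0 by convention here.
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical example: 4 actual fraud cases, 3 of them flagged.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0, 1, 0]

recall_pct = 100 * recall(y_true, y_pred)
print(f"Recall: {recall_pct:.1f}%")
```

Recall ignores false positives (the second transaction above is flagged wrongly but does not change the score), so a model tuned only for recall may still raise many false alarms.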