Identify fraudulent credit card transactions.
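Because the classes are highly imbalanced, we recommend evaluating models with the Area Under the Precision-Recall Curve (AUPRC). Accuracy computed from the confusion matrix is not meaningful for imbalanced classification: a model that labels every transaction as legitimate would still score very high accuracy while catching no fraud.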
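To make the point about accuracy concrete, here is a minimal sketch in plain Python. The labels and scores are made-up toy values, not the Kaggle data, and `average_precision` is a hypothetical helper that uses the common "average precision" summary of the precision-recall curve. It assumes no tied scores between fraud and legitimate examples, and it is not necessarily the exact metric implementation used in the notebook.

```python
def average_precision(y_true, scores):
    """Average precision: mean of the precision measured at each positive's rank."""
    # Rank examples from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` examples
    return total / n_pos


# Toy data (assumption): 998 legitimate transactions and 2 frauds.
y_true = [0] * 998 + [1] * 2

# Baseline that predicts "legitimate" for everything.
baseline_pred = [0] * len(y_true)
baseline_accuracy = sum(p == t for p, t in zip(baseline_pred, y_true)) / len(y_true)

# Hypothetical model scores: one legitimate transaction outranks one fraud.
scores = [0.1] * len(y_true)
scores[0] = 0.6     # a legitimate transaction scored fairly high
scores[998] = 0.9   # fraud, ranked first
scores[999] = 0.3   # fraud, ranked third

ap = average_precision(y_true, scores)
print(f"baseline accuracy: {baseline_accuracy:.3f}")  # 0.998
print(f"average precision: {ap:.3f}")                 # 0.833
```

The do-nothing baseline reaches 99.8% accuracy without detecting a single fraud, whereas average precision depends on how highly the two frauds are ranked.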
Dataset:
The dataset for this project is taken from Kaggle: [https://www.kaggle.com/mlg-ulb/creditcardfraud]
A brief overview of the notebook:
- Exploratory Data Analysis
- Mistakes to avoid when dealing with imbalanced data
- Random Undersampling and Oversampling (SMOTE)
- Logistic Regression and XGBoost Classifier
- Tips to improve the results
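
As an illustration of the resampling step, the sketch below shows random undersampling in plain Python. It also reflects one commonly cited pitfall with imbalanced data, which is an assumption here because the notebook's own list of mistakes may differ: resampling before splitting, so that the evaluation set no longer has the real class ratio. The toy labels, the 80/20 split, and the helper name `random_undersample` are all hypothetical.

```python
import random


def random_undersample(indices, labels, rng):
    """Downsample the majority class so both classes have the same count."""
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep a random subset of the majority class, as large as the minority class.
    kept = rng.sample(majority, len(minority))
    balanced = minority + kept
    rng.shuffle(balanced)
    return balanced


rng = random.Random(0)                 # fixed seed for reproducibility
labels = [1] * 20 + [0] * 980          # toy labels (assumption), not the Kaggle data
indices = list(range(len(labels)))
rng.shuffle(indices)

# Split first, so the test set keeps the original class imbalance.
cut = int(0.8 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]

# Rebalance the training portion only; the test portion is left untouched.
balanced_train_idx = random_undersample(train_idx, labels, rng)

n_fraud = sum(labels[i] for i in balanced_train_idx)
print(len(balanced_train_idx), n_fraud)  # the fraud count is half of the balanced size
```

An oversampling method such as SMOTE would be applied at the same point, to the training indices only. In practice a stratified split is preferable to the plain shuffle used here, so that both portions are guaranteed to contain fraud cases.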