/fraudDetection

Fraud detection project using R, trees and unbalanced data


Fraud Detection Project

Mini abstract

This project presents some ideas on how to deal with unbalanced data, use metrics other than the usual accuracy measure, and explore tree-based models (CART, random forest and XGBoost) on both balanced and unbalanced data. The goal is to get good predictions for a very small number of fraud cases.
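To illustrate why accuracy alone can mislead on unbalanced data, here is a minimal base R sketch. The labels and predictions are made-up toy values, not results from this project, and `class_metrics` is a hypothetical helper name introduced only for this example.

```r
# Toy data (hypothetical): 5 fraud cases (1) among 1000 transactions (0 = legitimate)
actual <- c(rep(1, 5), rep(0, 995))

# Classifier A: always predicts "legitimate"
always_legit <- rep(0, 1000)

# Classifier B: catches 4 of the 5 frauds and raises 6 false alarms
pred <- c(1, 1, 1, 1, 0, rep(1, 6), rep(0, 989))

# Hypothetical helper: accuracy, precision, recall and F1 for the positive (fraud) class
class_metrics <- function(actual, predicted) {
  tp <- sum(predicted == 1 & actual == 1)  # frauds correctly flagged
  fp <- sum(predicted == 1 & actual == 0)  # false alarms
  fn <- sum(predicted == 0 & actual == 1)  # missed frauds
  tn <- sum(predicted == 0 & actual == 0)  # legitimate, correctly passed

  precision <- if (tp + fp > 0) tp / (tp + fp) else NA_real_
  recall    <- if (tp + fn > 0) tp / (tp + fn) else NA_real_
  f1 <- if (is.na(precision) || is.na(recall) || precision + recall == 0) {
    NA_real_
  } else {
    2 * precision * recall / (precision + recall)
  }

  c(accuracy = (tp + tn) / length(actual),
    precision = precision, recall = recall, f1 = f1)
}

metrics_a <- class_metrics(actual, always_legit)
metrics_b <- class_metrics(actual, pred)
print(round(rbind(always_legit = metrics_a, model = metrics_b), 3))
```

In this toy case the classifier that never flags fraud has the higher accuracy even though its recall is zero, which is why precision, recall and F1 are more informative when fraud cases are rare.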

Note: Everything is done in R.

Dataset

The dataset used is in ./data/creditCard.zip.

The code provided in the .R file will unzip and prepare the data.
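As a rough idea of what the unzip step could look like, here is a minimal base R sketch. It is not the project's actual script: `load_zipped_csv` is a hypothetical helper, and it assumes the archive contains a single CSV file, whose name is read from the archive rather than guessed.

```r
# Hypothetical helper: extract the first CSV found in a zip archive and read it
load_zipped_csv <- function(zip_path, exdir = tempdir()) {
  if (!file.exists(zip_path)) {
    stop("Archive not found: ", zip_path)
  }

  # List the archive contents instead of assuming a file name
  entries  <- unzip(zip_path, list = TRUE)$Name
  csv_name <- entries[grepl("\\.csv$", entries, ignore.case = TRUE)][1]
  if (is.na(csv_name)) {
    stop("No CSV file found in ", zip_path)
  }

  # Extract only that file and load it as a data frame
  unzip(zip_path, files = csv_name, exdir = exdir)
  read.csv(file.path(exdir, csv_name))
}

# Usage, with the archive path given above:
# transactions <- load_zipped_csv("./data/creditCard.zip")
# table(transactions$Class)  # assumes the label column is named `Class`, as in the Kaggle version
```

Checking the class counts right after loading is a quick way to see how unbalanced the data is before fitting any model.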

If you run into any problems, you can download the dataset from Kaggle: Credit Card Fraud Detection.