CreditCardFraud_Detection

This project addresses an extremely imbalanced dataset, in which fraudulent transactions are far rarer than legitimate ones. It applies resampling methods to compensate for that imbalance.

This project focuses on developing a credit card fraud detection system using machine learning techniques. The goal is to identify fraudulent transactions within a credit card dataset. The project utilizes Python and the Jupyter Notebook environment.

Dataset

The project uses the "creditcard.csv" dataset, which contains credit card transactions made in September 2013. It was obtained from Kaggle and is provided by the Machine Learning Group at ULB (Université Libre de Bruxelles). According to the dataset's Kaggle description, 492 of its 284,807 transactions (about 0.17%) are fraudulent.

Link to the dataset: https://www.kaggle.com/mlg-ulb/creditcardfraud
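A quick way to see the imbalance is to count the rows per class after loading the file. The sketch below is illustrative only: the helper name `class_balance` is hypothetical, and it assumes the label column is called `Class` (1 for fraud, 0 for legitimate), as in the Kaggle description. A tiny stand-in frame is used so the snippet runs without the download.

```python
import pandas as pd


def class_balance(df: pd.DataFrame, label_col: str = "Class") -> pd.DataFrame:
    """Return the count and percentage share of each label value."""
    counts = df[label_col].value_counts().sort_index()
    share = (counts / len(df) * 100).round(3)
    return pd.DataFrame({"count": counts, "percent": share})


# With the real file this would be: df = pd.read_csv("creditcard.csv")
# A tiny stand-in frame keeps the sketch runnable without the download.
df = pd.DataFrame({
    "Amount": [10.0, 25.5, 3.2, 99.9, 12.0],
    "Class": [0, 0, 0, 0, 1],  # assumed label column: 1 = fraud, 0 = legitimate
})
balance = class_balance(df)
print(balance)
```

On the real dataset, the same call should show the fraud class as a very small share of all rows.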

Files Included

creditcardfraud_detection.ipynb: This Jupyter Notebook contains the source code for the credit card fraud detection system. It provides step-by-step instructions on data preprocessing, model training, and evaluation.

creditcard.csv: The dataset file containing credit card transaction data.

README.md: This file.

Instructions

To run the credit card fraud detection system, follow these steps:

Download the files from the repository to your local machine.

Launch Jupyter Notebook, JupyterLab, or Google Colab.

Open the creditcardfraud_detection.ipynb notebook.

Ensure that the required Python libraries are installed, such as Pandas, NumPy, Scikit-Learn, and Matplotlib. If one is missing, you can install it from a notebook cell with the command !pip install [library name].

Run each cell in the notebook sequentially to execute the code and perform the steps of data preprocessing, model training, and evaluation.

The notebook will generate visualizations and provide insights into the fraud detection results.
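The preprocessing, training, and evaluation steps mentioned above might look roughly like the following. This is a minimal sketch, not the notebook's actual code: the synthetic data from `make_classification`, the logistic regression model, and the choice of precision and recall as metrics are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic imbalanced data standing in for creditcard.csv (about 2% positives).
X, y = make_classification(n_samples=5000, n_features=10,
                           weights=[0.98, 0.02], random_state=42)

# Stratify so the train and test sets keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Preprocessing (feature scaling) and model training in one pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Evaluation: precision and recall on the minority (positive) class.
y_pred = model.predict(X_test)
precision = precision_score(y_test, y_pred, zero_division=0)
recall = recall_score(y_test, y_pred, zero_division=0)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Precision and recall are reported here because, with so few positive cases, a model can score high accuracy while missing most of them.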

Notes

The credit card fraud detection system detects fraudulent transactions based on patterns and anomalies in the provided dataset. It does not guarantee 100% accuracy in real-world scenarios.

This project incorporates resampling methods to address the issue of imbalanced data, where the class of fraudulent transactions is significantly smaller than the class of legitimate transactions.
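This README does not say which resampling methods the notebook uses, so the following is only a sketch of two common ones, random oversampling and random undersampling, built on `sklearn.utils.resample`. The synthetic arrays and all variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.utils import resample

rng = np.random.default_rng(0)
# Synthetic stand-in: 1,000 legitimate rows (label 0) and 20 fraudulent rows (label 1).
X = rng.normal(size=(1020, 3))
y = np.array([0] * 1000 + [1] * 20)

X_major, X_minor = X[y == 0], X[y == 1]

# Random oversampling: draw minority rows with replacement until the classes match.
X_minor_up = resample(X_minor, replace=True, n_samples=len(X_major), random_state=0)
X_over = np.vstack([X_major, X_minor_up])
y_over = np.array([0] * len(X_major) + [1] * len(X_minor_up))

# Random undersampling: keep a random subset of majority rows, without replacement.
X_major_down = resample(X_major, replace=False, n_samples=len(X_minor), random_state=0)
X_under = np.vstack([X_major_down, X_minor])
y_under = np.array([0] * len(X_major_down) + [1] * len(X_minor))

print(np.bincount(y_over))   # class counts after oversampling
print(np.bincount(y_under))  # class counts after undersampling
```

In practice, resampling is usually applied to the training split only, so that the evaluation data keeps the original class ratio.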

Credits

The credit card dataset used in this project was obtained from Kaggle and provided by the Machine Learning Group at ULB.

Resampling techniques reference: https://www.analyticsvidhya.com/blog/2020/07/10-techniques-to-deal-with-class-imbalance-in-machine-learning/

Authors: Xiang Liu, Mabel Mires, Natalia Benitez