This project builds a credit card default detection model using machine learning techniques. The model predicts whether a client will default on the next month's payment from demographic and credit-related features.
The dataset used in this project contains information on default payments, demographic factors, credit data, payment history, and bill statements of credit card clients in Taiwan. It includes 25 variables: ID, LIMIT_BAL, SEX, EDUCATION, MARRIAGE, AGE, the repayment-status columns PAY_0 and PAY_2 to PAY_6 (the published dataset has no PAY_1), BILL_AMT1 to BILL_AMT6, PAY_AMT1 to PAY_AMT6, and the target variable default.payment.next.month.
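As a quick sanity check on the schema described above, the column names can be written out and counted. This is a minimal sketch: it assumes the column names of the publicly distributed CSV for this dataset, in which the repayment-status columns skip PAY_1, and the helper `missing_columns` is a hypothetical name introduced here for illustration.

```python
# Column groups as described in this README (assumption: the public CSV
# names the repayment-status columns PAY_0, PAY_2, ..., PAY_6, with no PAY_1).
CLIENT_FIELDS = ["ID", "LIMIT_BAL", "SEX", "EDUCATION", "MARRIAGE", "AGE"]
REPAYMENT_STATUS = ["PAY_0"] + [f"PAY_{i}" for i in range(2, 7)]
BILL_AMOUNTS = [f"BILL_AMT{i}" for i in range(1, 7)]
PAYMENT_AMOUNTS = [f"PAY_AMT{i}" for i in range(1, 7)]
TARGET = "default.payment.next.month"

EXPECTED_COLUMNS = (
    CLIENT_FIELDS + REPAYMENT_STATUS + BILL_AMOUNTS + PAYMENT_AMOUNTS + [TARGET]
)


def missing_columns(columns):
    """Return the expected column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in EXPECTED_COLUMNS if name not in present]


print(len(EXPECTED_COLUMNS))  # 25
```

After loading the CSV with pandas, passing the DataFrame's `columns` to `missing_columns` should return an empty list if the file matches this layout.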
- Python 3.8
- Dependencies listed in requirements.txt
Install the dependencies with `pip install -r requirements.txt`.
- Data Exploration: Explore the dataset to understand its structure and characteristics.
- Data Preprocessing: Balance the imbalanced classes and scale the numerical features.
- Model Selection: Train and evaluate various machine learning models (SVM, KNN, Decision Tree, Gradient Boosting, Logistic Regression, AdaBoost, Naive Bayes).
- Model Training: Choose the best-performing model and train it on the dataset.
- Results: Evaluate and analyze the model's performance, strengths, weaknesses, and any challenges encountered.
- Deployment on AWS: Deploy the model on AWS for real-world applications.
- Video Demonstration: Check out this link for a demonstration of the HTML page.
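The preprocessing and model-selection steps above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the project's actual code: the data is a synthetic stand-in generated with `make_classification`, balancing is done by random oversampling of the minority class (the README does not say which technique was used), all models run with default hyperparameters, and `oversample_minority` is a hypothetical helper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

# Synthetic, imbalanced stand-in for the real data (assumption).
X, y = make_classification(
    n_samples=2000, n_features=23, n_informative=8,
    weights=[0.78, 0.22], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)


def oversample_minority(X, y, random_state=42):
    """Duplicate random minority-class rows until both classes have equal size."""
    classes, counts = np.unique(y, return_counts=True)
    n_extra = counts.max() - counts.min()
    if n_extra == 0:
        return X, y
    minority = classes[np.argmin(counts)]
    X_extra = resample(
        X[y == minority], replace=True, n_samples=n_extra,
        random_state=random_state,
    )
    return np.vstack([X, X_extra]), np.concatenate([y, np.full(n_extra, minority)])


# Balance only the training split so the test set keeps its original class mix.
X_bal, y_bal = oversample_minority(X_train, y_train)

# Fit the scaler on training data only, then apply it to both splits.
scaler = StandardScaler().fit(X_bal)
X_bal_s, X_test_s = scaler.transform(X_bal), scaler.transform(X_test)

models = {
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "AdaBoost": AdaBoostClassifier(random_state=42),
    "Naive Bayes": GaussianNB(),
}

# Train each candidate and record its test-set accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_bal_s, y_bal)
    scores[name] = accuracy_score(y_test, model.predict(X_test_s))

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.4f}")
```

Balancing and scaling are fitted on the training split only, so the held-out test set is not influenced by either step. The scores printed here come from synthetic data and say nothing about the project's reported results.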
The Gradient Boosting model emerged as the most effective, achieving an accuracy of 79.08% on the test set.
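For reference, accuracy is the fraction of test samples whose predicted label matches the true label. The sketch below shows how it can be computed with scikit-learn, next to the confusion matrix, precision, and recall, which are often reported alongside it on imbalanced data. The labels are made-up illustrative values, not the project's predictions.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Toy labels for illustration only: 1 = default, 0 = no default.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]

accuracy = accuracy_score(y_true, y_pred)    # 7 of 10 correct
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
precision = precision_score(y_true, y_pred)  # tp / (tp + fp)
recall = recall_score(y_true, y_pred)        # tp / (tp + fn)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
print(f"tn={tn} fp={fp} fn={fn} tp={tp}")
```

In this toy case the accuracy is 0.70 even though only one of the three defaulters is caught, which is why recall on the default class is worth checking as well.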
Explore additional features, fine-tune hyperparameters, and consider more advanced techniques for handling imbalanced data.
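One possible way to approach the hyperparameter tuning mentioned above is a cross-validated grid search over the Gradient Boosting model. The sketch below is only a starting point: the data is synthetic, the grid values are arbitrary assumptions rather than settings used in this project, and F1 is chosen as the scoring metric on the assumption that performance on the default class matters more than raw accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic, imbalanced stand-in data (assumption, not the project's dataset).
X, y = make_classification(
    n_samples=600, n_features=23, n_informative=8,
    weights=[0.78, 0.22], random_state=0,
)

# Illustrative grid; the values are assumptions, not tuned project settings.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

search = GridSearchCV(
    estimator=GradientBoostingClassifier(random_state=0),
    param_grid=param_grid,
    scoring="f1",  # F1 on the positive (default) class
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Stratified folds keep the class ratio roughly constant across splits, which matters when the positive class is a minority.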
- Gouthami K
- Abhijith Paul
This project is licensed under the MIT License.