Credit Score Prediction

Creators

Utku ALTINKAYA

İlteriş SAMUR

Kaggle Link of Project

About the Project

The primary objective of this project is to build a machine learning model that predicts individuals' credit scores from their financial attributes. Using feature selection and ensemble learning techniques, we aimed to develop a model that generalizes the patterns in the data and provides accurate credit score predictions.

Features

ID: Unique identification of a sample
Customer_ID: Unique identification of a person
Month: Month of the year
Name: Name of a person
Age: Age of a person
SSN: Social security number of a person
Occupation: Job of a person
Annual Income: Yearly income of a person
Monthly Inhand Salary: Monthly salary of a person
Num Bank Accounts: Number of bank accounts held by a person
Num Credit Card: Number of credit cards of a person
Interest_Rate: Interest rate on credit card of a person
Num of Loan: Number of loans a person has taken from the bank
Type of Loan: Types of credit taken by a person
Delay from due date: Average number of days delayed from the payment date
Num of Delayed Payment: Average number of payments delayed by a person
Changed Credit Limit: Percentage change in credit card limit
Num Credit Inquiries: Number of credit card inquiries
Credit Mix: Classification of the types of credits
Outstanding Debt: Remaining debt to be paid (in USD)
Credit Utilization Ratio: Percentage of a person's available revolving credit card limit that is in use
Credit History Age: Age of credit history of the person
Payment of Min Amount: Whether only the minimum amount was paid by the person
Total EMI per month: The monthly installment payments of a person (in USD)
Amount invested monthly: Monthly amount invested by the person (in USD)
Payment Behaviour: The payment behavior of a person (in USD)
Monthly_Balance: Represents the monthly balance amount of a person (in USD)
Credit Score: Represents the bracket of credit score (Poor, Standard, Good)

Methodology

Data Preprocessing: We cleaned the dataset by handling missing values, encoding categorical features, and balancing the imbalanced classes.
Feature Engineering: We extracted relevant features and transformed the data for better model performance using Lasso, chi2, MIC, Ridge, RFE, and PCA.
Model Training: We trained models on the selected features using Random Forest, Decision Tree, and Gradient Boosting classifiers. We also combined these models with Stacking and Max Voting ensembles.
Model Evaluation: We assessed the model's performance using metrics like accuracy, precision, recall, F1-score, and ROC-AUC.
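
The Max Voting and evaluation steps above can be illustrated with a minimal, dependency-free sketch. It is not the project's actual code: the toy labels, the per-model prediction lists (rf_pred, dt_pred, gb_pred), the helper names, and the tie-breaking rule are all assumptions made for illustration. ROC-AUC is left out because it needs predicted probabilities rather than hard labels.

```python
from collections import Counter

# Credit score brackets, as listed in the Features section.
LABELS = ["Poor", "Standard", "Good"]


def max_vote(predictions_per_model):
    """Combine per-model label lists by majority (max) voting, sample by sample.

    Assumption: ties go to the label predicted by the earliest model in the list.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        # The first vote (in model order) that reaches the top count wins.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) with `label` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    return sum(per_class_metrics(y_true, y_pred, lb)[2] for lb in labels) / len(labels)


# Hypothetical toy data: true brackets and the predictions of three base models.
y_true = ["Poor", "Standard", "Good", "Standard", "Poor", "Good"]
rf_pred = ["Poor", "Standard", "Good", "Poor", "Poor", "Standard"]
dt_pred = ["Poor", "Good", "Good", "Poor", "Standard", "Good"]
gb_pred = ["Standard", "Standard", "Good", "Standard", "Poor", "Good"]

ensemble_pred = max_vote([rf_pred, dt_pred, gb_pred])

print("ensemble accuracy:", round(accuracy(y_true, ensemble_pred), 3))
print("ensemble macro F1:", round(macro_f1(y_true, ensemble_pred), 3))
for lb in LABELS:
    p, r, f = per_class_metrics(y_true, ensemble_pred, lb)
    print(lb, "precision", round(p, 3), "recall", round(r, 3), "f1", round(f, 3))
```

In this toy data the voted prediction is wrong only where two of the three models agree on the wrong bracket, which is the basic trade-off of max voting: it corrects isolated errors but not errors shared by a majority of the base models.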

References