Utku ALTINKAYA
İlteriş SAMUR
The primary objective of this project is to build a machine learning model capable of predicting individuals' credit scores based on their financial attributes. Using advanced algorithms and techniques, we aimed to develop a model that generalizes the patterns in the data and provides accurate credit score predictions.
- ID: Unique identification of a sample
- Customer_ID: Unique identification of a person
- Month: Month of the year
- Name: Name of a person
- Age: Age of a person
- SSN: Social security number of a person
- Occupation: Job of a person
- Annual Income: Yearly income of a person
- Monthly Inhand Salary: Monthly salary of a person
- Num Bank Accounts: Number of bank accounts held by a person
- Num Credit Card: Number of credit cards of a person
- Interest_Rate: Interest rate on credit card of a person
- Num of Loan: Number of loans a person has taken from the bank
- Type of Loan: Types of credit taken by a person
- Delay from due date: Average number of days delayed from the payment date
- Num of Delayed Payment: Average number of payments delayed by a person
- Changed Credit Limit: Percentage change in credit card limit
- Num Credit Inquiries: Number of credit card inquiries
- Credit Mix: Classification of the types of credits
- Outstanding Debt: Remaining debt to be paid (in USD)
- Credit Utilization Ratio: Percentage of a person's available revolving (credit card) credit that is in use
- Credit History Age: Age of credit history of the person
- Payment of Min Amount: Whether only the minimum amount was paid by the person
- Total EMI per month: The monthly installment payments of a person (in USD)
- Amount invested monthly: Monthly amount invested by the person (in USD)
- Payment Behaviour: The payment behavior of a person (in USD)
- Monthly_Balance: Represents the monthly balance amount of a person (in USD)
- Credit Score: Represents the bracket of credit score (Poor, Standard, Good)
- Data Preprocessing: We cleaned the dataset by handling missing values, encoding categorical features, and balancing the imbalanced classes.
- Feature Engineering: We extracted relevant features and transformed the data for better model performance using Lasso, chi2, MIC, Ridge, RFE, and PCA.
- Model Training: We trained models on the selected features using Random Forest, Decision Tree, and Gradient Boosting classifiers. We also trained Stacking and Max Voting ensembles.
- Model Evaluation: We assessed the model's performance using metrics like accuracy, precision, recall, F1-score, and ROC-AUC.
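
The four steps above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the project's actual code. It runs on synthetic data, and the column names, hyperparameters, and the choice of chi2 as the only selector are assumptions made for the example. Class imbalance is handled here with `class_weight="balanced"` on the tree models, which stands in for whatever balancing method the project used.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier, VotingClassifier)
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption): 3 imbalanced classes
# playing the role of Poor / Standard / Good.
num_cols = ["Annual_Income", "Outstanding_Debt", "Num_Credit_Card",
            "Interest_Rate", "Num_of_Loan", "Monthly_Balance"]
X_num, y = make_classification(n_samples=600, n_features=6, n_informative=4,
                               n_redundant=0, n_classes=3,
                               weights=[0.2, 0.5, 0.3], random_state=0)
df = pd.DataFrame(X_num, columns=num_cols)
rng = np.random.default_rng(0)
df["Credit_Mix"] = rng.choice(["Bad", "Standard", "Good"], size=len(df))
df.loc[rng.choice(len(df), size=30, replace=False), "Annual_Income"] = np.nan

# Data preprocessing: impute missing numeric values, scale to [0, 1]
# (chi2 needs non-negative inputs), one-hot encode the categorical column.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", MinMaxScaler())]), num_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Credit_Mix"]),
])

# Model training: the three base classifiers, combined by stacking and by
# max (hard) voting.
base = [
    ("rf", RandomForestClassifier(n_estimators=50, class_weight="balanced",
                                  random_state=0)),
    ("dt", DecisionTreeClassifier(max_depth=6, class_weight="balanced",
                                  random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]
ensembles = {
    "stacking": StackingClassifier(
        estimators=base, final_estimator=LogisticRegression(max_iter=1000)),
    "max_voting": VotingClassifier(estimators=base, voting="hard"),
}

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, stratify=y, random_state=0)

results = {}
for name, ensemble in ensembles.items():
    # Feature selection: keep the 6 columns with the highest chi2 score.
    model = Pipeline([("prep", clone(preprocess)),
                      ("select", SelectKBest(chi2, k=6)),
                      ("clf", ensemble)])
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    # Model evaluation: macro-averaged metrics treat the classes equally.
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, average="macro",
                                     zero_division=0),
        "recall": recall_score(y_test, pred, average="macro"),
        "f1": f1_score(y_test, pred, average="macro"),
    }
    # ROC-AUC needs class probabilities, which hard voting does not provide.
    if name == "stacking":
        proba = model.predict_proba(X_test)
        results[name]["roc_auc"] = roc_auc_score(y_test, proba,
                                                 multi_class="ovr")

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Keeping preprocessing and feature selection inside the `Pipeline` means they are fitted on the training split only, so no information from the test split leaks into the selected features. Swapping `SelectKBest(chi2, ...)` for `RFE` or `PCA` would follow the same pattern.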