Key findings: 1) High default risk is associated with borrowers who list "small business" as the loan purpose, do not meet the lender's credit policy, pay a higher interest rate, and have a low FICO score. 2) The Logistic Regression model performed best at predicting loan default.
Online lending platforms have grown rapidly in recent years thanks to their convenience and feasibility. However, they face various difficulties related to loan default, since their clients are individuals, small business owners, and low-income borrowers who have been rejected by traditional banks.
- Identify factors associated with repayment failures based on financial information provided by customers.
- Train a machine learning model capable of distinguishing defaulters from non-defaulters based on clients’ financial information, in order to support loan approval decisions.
https://www.kaggle.com/datasets/itssuru/loan-data
The data analysis showed that borrowers who do not meet the lender's credit policy, pay a higher interest rate, and have a low FICO score are associated with a high default risk. High risk is also observed among customers whose borrowing purpose is listed as “small business”.
### Evidence is shown in the figures and table below:
Counts of clients according to credit criteria (left) and percentage of fully paid/not fully paid clients in each type of credit policy (right).
Scatterplot of interest rate and FICO score, grouped by repayment status.
Among customers who do not meet the credit policy, have an interest rate greater than 0.1, and have a FICO score below 737, 28.59% did not fully repay their loan.
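A segment rate like the one above can be computed by filtering the rows on the three conditions and taking the share that were not fully paid. The following is a minimal, dependency-free sketch rather than the project's actual code: the column names `credit.policy`, `int.rate`, `fico`, and `not.fully.paid` are assumptions about the dataset's header, the helper names are hypothetical, and the sample rows are made up purely for illustration.

```python
# Column names below are ASSUMED; adjust them to match the CSV header.
POLICY, RATE, FICO, UNPAID = "credit.policy", "int.rate", "fico", "not.fully.paid"


def high_risk_segment(rows, min_rate=0.1, max_fico=737):
    """Keep rows that fail the credit policy, have int. rate > min_rate and FICO < max_fico."""
    return [
        r for r in rows
        if int(r[POLICY]) == 0
        and float(r[RATE]) > min_rate
        and float(r[FICO]) < max_fico
    ]


def default_rate(rows):
    """Percentage of rows flagged as not fully paid; None for an empty subset."""
    rows = list(rows)
    if not rows:
        return None
    unpaid = sum(1 for r in rows if int(r[UNPAID]) == 1)
    return 100.0 * unpaid / len(rows)


# Made-up sample rows (NOT taken from the real dataset).
sample = [
    {POLICY: 0, RATE: 0.15, FICO: 650, UNPAID: 1},
    {POLICY: 0, RATE: 0.12, FICO: 700, UNPAID: 0},
    {POLICY: 0, RATE: 0.13, FICO: 680, UNPAID: 0},
    {POLICY: 0, RATE: 0.14, FICO: 690, UNPAID: 0},
    {POLICY: 1, RATE: 0.08, FICO: 760, UNPAID: 0},  # meets policy: excluded
    {POLICY: 0, RATE: 0.09, FICO: 720, UNPAID: 1},  # rate too low: excluded
]

segment = high_risk_segment(sample)
segment_rate = default_rate(segment)
print(f"{len(segment)} rows in segment, {segment_rate:.2f}% not fully paid")
```

With the real data, the rows could come from `csv.DictReader` applied to the downloaded file; the conversions to `int` and `float` are there because that reader yields strings.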
Percentage of not_fully_paid/fully_paid by purpose of borrowing.
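The per-purpose breakdown in this figure amounts to a grouped share. Here is a small sketch of one way to compute it with the standard library only. The keys `purpose` and `not.fully.paid` are assumed column names, `unpaid_share_by_purpose` is a hypothetical helper, and the loans listed are invented for illustration.

```python
from collections import defaultdict


def unpaid_share_by_purpose(rows, purpose_key="purpose", unpaid_key="not.fully.paid"):
    """Return {purpose: percentage of loans not fully paid}. Key names are assumptions."""
    counts = defaultdict(lambda: [0, 0])  # purpose -> [unpaid count, total count]
    for r in rows:
        entry = counts[r[purpose_key]]
        entry[0] += int(r[unpaid_key])
        entry[1] += 1
    return {purpose: 100.0 * unpaid / total for purpose, (unpaid, total) in counts.items()}


# Invented example loans (NOT real data).
loans = [
    {"purpose": "small_business", "not.fully.paid": 1},
    {"purpose": "small_business", "not.fully.paid": 0},
    {"purpose": "debt_consolidation", "not.fully.paid": 1},
    {"purpose": "debt_consolidation", "not.fully.paid": 0},
    {"purpose": "debt_consolidation", "not.fully.paid": 0},
    {"purpose": "debt_consolidation", "not.fully.paid": 0},
    {"purpose": "credit_card", "not.fully.paid": 0},
]

shares = unpaid_share_by_purpose(loans)
# Rank purposes from highest to lowest share of loans not fully paid.
ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
for purpose, pct in ranked:
    print(f"{purpose}: {pct:.1f}% not fully paid")
```

Sorting by the share makes it easy to see which purposes stand out, which is the comparison the figure is meant to support.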