This company is the largest online loan marketplace, facilitating personal loans, business loans, and financing of medical procedures. Borrowers can easily access lower interest rate loans through a fast online interface.
As with most other lending companies, lending to ‘risky’ applicants is the largest source of financial loss (called credit loss). Credit loss is the amount of money lost by the lender when the borrower refuses to pay or runs away with the money owed. In other words, borrowers who default cause the largest losses to lenders. In this case, the customers labelled as 'charged-off' are the 'defaulters'.
If these risky loan applicants can be identified, such loans can be reduced, thereby cutting down the amount of credit loss. Identifying such applicants using EDA is the aim of this case study.
In other words, the company wants to understand the driving factors (or driver variables) behind loan default, i.e. the variables which are strong indicators of default. The company can utilise this knowledge for its portfolio and risk assessment.
Here we will be using Exploratory Data Analysis to find key drivers which can help the business perform better.
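As a minimal sketch of what "finding a driver" can look like in practice, the snippet below computes the share of charged-off loans for each value of a candidate variable. The column names `loan_status` and `purpose`, the label `Charged Off`, and the helper `default_rate_by` are assumptions made for illustration, and the rows are toy data, not figures from the dataset.

```python
import pandas as pd

# Toy rows for illustration only; NOT taken from the Lending Club dataset.
# Column names `purpose` and `loan_status` are assumed, not confirmed by this README.
loans = pd.DataFrame({
    "purpose": ["car", "car", "wedding", "small_business", "small_business", "wedding"],
    "loan_status": ["Fully Paid", "Fully Paid", "Fully Paid",
                    "Charged Off", "Fully Paid", "Charged Off"],
})


def default_rate_by(df, column, status_col="loan_status", default_label="Charged Off"):
    """Return the fraction of charged-off loans for each value of `column` (hypothetical helper)."""
    # True where the loan was charged off; the mean of booleans is the default rate.
    is_default = df[status_col].eq(default_label)
    return (
        is_default.groupby(df[column]).mean()
        .sort_values(ascending=False)
        .rename("default_rate")
    )


rates = default_rate_by(loans, "purpose")
print(rates)
```

A variable whose categories show clearly different default rates is a candidate driver; one whose categories all sit near the overall rate is probably not.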
What is the background of your project?
- The project aims to give the user a window into how important EDA is in the lifecycle of data analysis and prediction.
What is the business problem that your project is trying to solve?
- We are trying to identify key drivers that can be used to screen applications so that the chance of a loan turning bad (defaulting) is greatly reduced.
What is the dataset that is being used?
- The Lending Club loan dataset, with data from 2007 to 2011.
- Around 85% of applicants have fully paid back their loans, while the remaining 15% are charged off.
- More loans should be given for weddings, since the default rate is very low and the number of loans in this category is small; the same is true for car loans.
- Renewable energy loan applications are already few and should be avoided completely.
- Higher-interest and higher-amount loans should be given to applicants with higher annual income, since they are better able to pay off loans: annual income is negatively correlated with dti (debt-to-income ratio) because monthly income is greater.
- Employer reputation is also a driver, so applicants from reputed organisations such as the US Army should be given preference.
- Applications approved in December should be scrutinized more closely, since they show higher default rates.
- pandas - 1.3.4
- numpy - 1.20.3
- seaborn - 0.11.2
- matplotlib - 3.4.3
- This project is the outcome of a case study on Lending Club.
- US Census data was used for region binning: https://www2.census.gov/geo/pdfs/maps-data/maps/reference/us_regdiv.pdf
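
The region binning mentioned above could be sketched roughly as follows. This assumes the four Census regions (rather than the nine divisions) were used and that the state code lives in a column named `addr_state`; both are assumptions, and `add_region` is a hypothetical helper, not necessarily the code used in the case study.

```python
import pandas as pd

# US Census Bureau four-region scheme, keyed by two-letter state code (DC included).
CENSUS_REGIONS = {
    "Northeast": ["CT", "ME", "MA", "NH", "RI", "VT", "NJ", "NY", "PA"],
    "Midwest": ["IL", "IN", "MI", "OH", "WI", "IA", "KS", "MN", "MO", "NE", "ND", "SD"],
    "South": ["DE", "DC", "FL", "GA", "MD", "NC", "SC", "VA", "WV",
              "AL", "KY", "MS", "TN", "AR", "LA", "OK", "TX"],
    "West": ["AZ", "CO", "ID", "MT", "NV", "NM", "UT", "WY",
             "AK", "CA", "HI", "OR", "WA"],
}

# Invert to a flat state -> region lookup.
STATE_TO_REGION = {
    state: region for region, states in CENSUS_REGIONS.items() for state in states
}


def add_region(df, state_col="addr_state"):
    """Return a copy of `df` with a `region` column derived from `state_col` (assumed name)."""
    out = df.copy()
    # Codes not in the lookup are labelled "Unknown" instead of being dropped.
    out["region"] = out[state_col].map(STATE_TO_REGION).fillna("Unknown")
    return out


# Toy input for illustration only.
sample = pd.DataFrame({"addr_state": ["CA", "NY", "TX", "IL"]})
binned = add_region(sample)
print(binned)
```

Binning roughly fifty states into four regions keeps each group large enough for default rates to be compared meaningfully.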
Created by [@jaskirat8] and [@kapiljain2825] - feel free to contact us!