Credit-Risk-Data-Project

This repository contains a loan data analysis project that uses SQL and Python for data visualization, exploratory data analysis (EDA), and machine learning.


Loan-Default-Data-Project

This repository contains a data analysis project that uses data visualization, EDA, and machine learning.

DISCLAIMER: All of the work is my own, but I will be using a hypothetical lending and financing company to keep things interesting.

SPRINT 1 - EXPLORATION

Gringotts Financial, a hypothetical lending and financing company, would like to better understand customers' tendency to default on loans in its loan financing program. The Analytics team will examine historical data to identify which factors most affect loan defaults. Stakeholders want to better understand the data they have so that it can inform future decisions about granting loans. They would also like to know which customers are MOST likely to default on loan repayments.

SPRINT 2 - VARIABLE IMPORTANCE ANALYSIS

The Analytics team of Gringotts Financial will analyze the variables in the dataset for strength of association with each other and with the target variable loan_status. They will employ the Chi-Square Test of Independence, Cramér's V, Pearson's correlation coefficient, and the point-biserial correlation.
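As a rough illustration of the four measures named above, here is a minimal pure-Python sketch. It is not the notebook's code. The toy values and the variable names `home_ownership` and `interest_rate` are made up for the example; only `loan_status` comes from the project description. The usual pairing is assumed: chi-square and Cramér's V for two categorical variables, Pearson for two numeric variables, and point-biserial for a numeric variable against a binary one.

```python
import math
from collections import Counter


def chi_square_statistic(xs, ys):
    """Pearson chi-square statistic for two categorical sequences (no continuity correction)."""
    n = len(xs)
    observed = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    return chi2


def cramers_v(xs, ys):
    """Cramér's V: the chi-square statistic rescaled to the range [0, 1]."""
    n = len(xs)
    k = min(len(set(xs)), len(set(ys))) - 1
    if k == 0:
        return 0.0
    return math.sqrt(chi_square_statistic(xs, ys) / (n * k))


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def point_biserial(binary, values):
    """Point-biserial correlation: Pearson's r with one variable coded 0/1."""
    return pearson_r([float(b) for b in binary], values)


# Toy data, invented for illustration only (not taken from the Kaggle dataset).
home_ownership = ["RENT", "RENT", "OWN", "OWN", "RENT", "OWN", "RENT", "OWN"]
interest_rate = [15.2, 14.1, 7.9, 8.4, 13.3, 9.0, 10.5, 12.8]
loan_status = [1, 1, 0, 0, 1, 0, 0, 1]  # 1 = default, 0 = repaid

print("chi-square:", chi_square_statistic(home_ownership, loan_status))  # 2.0 for this toy table
print("Cramér's V:", cramers_v(home_ownership, loan_status))  # 0.5 for this toy table
print("point-biserial:", round(point_biserial(loan_status, interest_rate), 3))
```

The sketch computes only the statistics, not p-values. In practice, `scipy.stats.chi2_contingency`, `scipy.stats.pearsonr`, and `scipy.stats.pointbiserialr` return both; whether the notebooks use them is an assumption.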

SPRINT 3 - LOAN DEFAULT PREDICTION

The Analytics team of Gringotts Financial will now look to build an engine to predict loan default for future customers. The earlier analysis with visualizations was only part of the first sprint toward the stated solution. Stakeholders are now looking forward to seeing what the Analytics team comes up with (and so are you, I assume!).
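To give a feel for what a default-prediction engine might look like at its simplest, here is a baseline sketch: a logistic regression fitted by batch gradient descent in pure Python. This is an assumption for illustration, not necessarily the model used in this project. The two pre-scaled features and all helper names (`fit_logistic`, `predict_proba`) are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_proba(x, weights, bias):
    """Estimated probability of default for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit_logistic(rows, labels, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy, already-standardized features invented for illustration:
# [loan amount relative to income, interest rate]
rows = [
    [-1.0, -0.9], [-0.8, -1.1], [-0.9, -0.7],
    [1.0, 0.8], [0.9, 1.1], [0.8, 0.9],
]
loan_status = [0, 0, 0, 1, 1, 1]  # 1 = default, 0 = repaid

weights, bias = fit_logistic(rows, loan_status)
print("P(default | high-risk profile):", round(predict_proba([1.0, 1.0], weights, bias), 3))
print("P(default | low-risk profile): ", round(predict_proba([-1.0, -1.0], weights, bias), 3))
```

A real pipeline would also need a train/test split, encoding of categorical columns, and an evaluation metric suited to imbalanced classes; those are left out here to keep the sketch short.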

Data Source: https://www.kaggle.com/datasets/laotse/credit-risk-dataset