Credit Score Analysis in Python

A credit scoring model is a tool that is typically used in the decision-making process of accepting or rejecting a loan. A credit scoring model is the result of a statistical model which, based on information about the borrower (e.g. age, number of previous loans, etc.), allows one to distinguish between "good" and "bad" loans and give an estimate of the probability of default.

Loan Defalut: In finance, default is failure to meet the legal obligations of a loan, for example when a home buyer fails to make a mortgage payment, or when a corporation or government fails to pay a bond which has reached maturity.

Steps Involved

Part 1 - Data Processing: Cleaning and Transforming Raw Data into the Understandable Format
Part 2 - Profiling: Data profiling is the process of examining the data available from an existing information source (e.g. a database or a file) and collecting statistics or informative summaries about that data.
Part 3 - Logistic Regression Model: In statistics, logistic regression, or logit regression, is a regression model where the dependent variable is categorical. This article covers the case of a binary dependent variable—that is, where the output can take only two values, "0" and "1", which represent outcomes such as pass/fail, win/lose etc.
Part 4 - Decision Tree Model
Part 5 - Clustering Analysis
Part 6 - Principal Components Analysis

The raw dataset is in the file "CreditScoring.csv" which contains 4455 rows and 14 columns:

1 Status	credit status
2 Seniority	job seniority (years)
3 Home	type of home ownership
4 Time	time of requested loan
5 Age	client's age
6 Marital	marital status
7 Records	existance of records
8 Job	type of job
9 Expenses	amount of expenses
10 Income	amount of income
11 Assets	amount of assets
12 Debt	amount of debt
13 Amount	amount requested of loan
14 Price	price of good

Purusharth07/Credit_Score_Analysis

Credit Score Analysis in Python

Steps Involved