Credit_Risk_Analysis

In 2019, more than 19 million Americans had at least one unsecured personal loan. Personal lending is growing rapidly, and FinTech firms need to go through and organize large amounts of data in order to optimize lending. This project uses Python to evaluate several machine learning models that predict credit risk. Algorithms such as RandomOverSampler, SMOTE, and RandomForest are applied to credit card datasets from a company (LendingClub), using linear regression to both sample and predict the data. The results can be used to determine how many people are predicted to be at high or low credit risk.

Primary language: Jupyter Notebook


The purpose of this analysis was to use credit card datasets from an actual company (LendingClub) and apply linear regression to sample and predict the data. The data was oversampled with the RandomOverSampler and SMOTE algorithms and undersampled with the ClusterCentroids algorithm. After obtaining those results, the BalancedRandomForestClassifier and EasyEnsembleClassifier models were used to predict credit risk.

Undersampling

[Image: Undersampling]
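To make the idea behind ClusterCentroids concrete, here is a minimal standard-library sketch, not the notebook's actual code. It assumes a binary problem, replaces the majority class with k-means centroids (one centroid per minority row), and uses made-up toy data. The function names `kmeans_centroids` and `cluster_centroid_undersample` are hypothetical. In imbalanced-learn, the corresponding class is `imblearn.under_sampling.ClusterCentroids`.

```python
import math
import random
from collections import Counter

def kmeans_centroids(points, k, iters=20, seed=0):
    """Plain k-means: return k centroids summarising `points`."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def cluster_centroid_undersample(X, y, majority_label):
    """Replace the majority class with as many centroids as there are other rows."""
    majority = [x for x, lbl in zip(X, y) if lbl == majority_label]
    rest = [(x, lbl) for x, lbl in zip(X, y) if lbl != majority_label]
    centroids = kmeans_centroids(majority, k=len(rest))
    X_out = centroids + [x for x, _ in rest]
    y_out = [majority_label] * len(centroids) + [lbl for _, lbl in rest]
    return X_out, y_out

# Toy data (hypothetical): 6 low-risk rows, 2 high-risk rows.
X = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.2], [5.0, 5.1], [5.1, 5.0], [5.2, 5.2],
     [9.0, 9.0], [9.5, 9.5]]
y = ["low_risk"] * 6 + ["high_risk"] * 2

X_res, y_res = cluster_centroid_undersample(X, y, majority_label="low_risk")
print(Counter(y_res))
```

Because the majority rows are replaced by synthetic centroids rather than a random subset, the resampled set is smaller but still reflects where the majority class sits in feature space.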

Oversampling

[Images: Oversampling (pt1), Oversampling (pt2)]
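Random oversampling can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not the notebook's code: the data is a made-up toy set and `random_oversample` is a hypothetical helper. It shows the same idea that imbalanced-learn's `RandomOverSampler` applies, duplicating minority rows until the classes are the same size.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        rows = [x for x, lbl in zip(X, y) if lbl == label]
        for _ in range(target - count):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out

# Toy data (hypothetical): 6 low-risk loans, 2 high-risk loans.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
y = ["low_risk"] * 6 + ["high_risk"] * 2

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))
```

Only the training split should be resampled this way; resampling the test split would distort the evaluation.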

SMOTE Oversampling

[Image: SMOTE Oversampling]
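Unlike random oversampling, SMOTE creates new minority rows by interpolating between an existing minority row and one of its nearest minority neighbours. The sketch below shows only that interpolation step, assuming toy data and a hypothetical helper named `smote`; it is not the notebook's code, and imbalanced-learn's `SMOTE` class handles many details omitted here.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points on line segments between a minority
    point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Toy minority class (hypothetical): three high-risk rows with two features.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
synthetic = smote(minority, n_new=4)
for point in synthetic:
    print(point)
```

Every synthetic row lies between two real minority rows, so the new points stay inside the region the minority class already occupies.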

Combination Sampling

[Image: Combination Sampling]
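The text does not name the combination method that was used. One common choice, assumed here purely for illustration, is to oversample with SMOTE and then clean the result with Edited Nearest Neighbours (available together in imbalanced-learn as `imblearn.combine.SMOTEENN`). The sketch below shows only the cleaning step, with a hypothetical helper `edited_nearest_neighbours` and made-up data: a row is dropped when most of its nearest neighbours carry a different label.

```python
import math
from collections import Counter

def edited_nearest_neighbours(X, y, k=3):
    """Keep only rows whose label matches the majority label of their
    k nearest neighbours."""
    keep_X, keep_y = [], []
    for i, (x, label) in enumerate(zip(X, y)):
        others = [(math.dist(x, X[j]), y[j]) for j in range(len(X)) if j != i]
        nearest = sorted(others, key=lambda t: t[0])[:k]
        majority = Counter(lbl for _, lbl in nearest).most_common(1)[0][0]
        if majority == label:
            keep_X.append(x)
            keep_y.append(label)
    return keep_X, keep_y

# Toy data (hypothetical): two clean clusters plus one low-risk row
# that sits inside the high-risk cluster.
X = [[0.0], [0.1], [0.2], [0.3], [10.0], [10.2], [10.4], [10.6], [10.3]]
y = ["low_risk"] * 4 + ["high_risk"] * 4 + ["low_risk"]

X_clean, y_clean = edited_nearest_neighbours(X, y)
print(Counter(y_clean))
```

In a combined pipeline, this cleaning pass would run after oversampling so that synthetic rows landing in the overlap between classes are removed.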

RandomForestClassifier

[Image: Random Forest]
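What separates a balanced random forest from a plain one is the training sample each tree sees: every tree gets a bootstrap sample in which the classes are equally represented. The sketch below illustrates only that sampling step under simplifying assumptions, with toy data and a hypothetical helper `balanced_bootstrap`. It does not build trees and is not the implementation of imbalanced-learn's `BalancedRandomForestClassifier`.

```python
import random
from collections import Counter

def balanced_bootstrap(X, y, seed=0):
    """Draw, with replacement, the same number of rows from every class
    (the size of the smallest class)."""
    rng = random.Random(seed)
    by_class = {}
    for x, label in zip(X, y):
        by_class.setdefault(label, []).append(x)
    n = min(len(rows) for rows in by_class.values())
    X_boot, y_boot = [], []
    for label, rows in by_class.items():
        X_boot.extend(rng.choices(rows, k=n))
        y_boot.extend([label] * n)
    return X_boot, y_boot

# Toy data (hypothetical): 6 low-risk loans, 2 high-risk loans.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
y = ["low_risk"] * 6 + ["high_risk"] * 2

# One balanced sample per (hypothetical) tree in the forest.
samples = [balanced_bootstrap(X, y, seed=t) for t in range(3)]
for _, y_boot in samples:
    print(Counter(y_boot))
```

Each tree sees a different balanced subset, so the majority rows discarded by one tree can still be used by another.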