The purpose of this analysis was to take credit card datasets from a real company (LendingClub), resample them, and use logistic regression to predict credit risk. The data was oversampled with the RandomOverSampler and SMOTE algorithms and undersampled with the ClusterCentroids algorithm. The BalancedRandomForestClassifier and EasyEnsembleClassifier models were then also used to predict credit risk.
michaelxie1/Credit_Risk_Analysis
In 2019, more than 19 million Americans had at least one unsecured personal loan. Personal lending is growing extremely quickly, and FinTech firms need to go through and organize large amounts of data in order to optimize lending. Python is used here to evaluate several machine learning models that predict credit risk. Algorithms such as RandomOverSampler, SMOTE, and RandomForest are applied to credit card datasets from a company (LendingClub): the data is first resampled, and logistic regression is then used to predict credit risk on the resampled data. The predictions can be used to count how many people are predicted to be high or low credit risk.
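To make the resampling step concrete, here is a minimal sketch of the idea behind random oversampling, written with only the Python standard library. It is not the code from this repository and not the imbalanced-learn implementation (there, `RandomOverSampler` is applied through its `fit_resample` method). The function name `random_oversample`, the toy rows, and the `low_risk` / `high_risk` labels are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (illustrative helper)."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement from the existing rows of this class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: (loan_amount, interest_rate) per applicant.
rows = [(10000, 0.07), (5000, 0.12), (20000, 0.09), (7500, 0.21), (12000, 0.08)]
labels = ["low_risk", "low_risk", "low_risk", "high_risk", "low_risk"]

resampled_rows, resampled_labels = random_oversample(rows, labels)
print(Counter(labels))            # class counts before resampling
print(Counter(resampled_labels))  # class counts after resampling
```

After resampling, both classes have the same number of rows, so a classifier trained on the result is not dominated by the majority class. SMOTE differs in that it synthesizes new minority rows by interpolating between neighbours rather than duplicating existing ones, and ClusterCentroids works in the opposite direction by shrinking the majority class.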
Jupyter Notebook