Monte_Carlo_Simulation_Loan_Status

Author: Benjamin O. Tayo

Date: 11/22/2018

Introduction: Predicting the status of a loan is an important problem in risk assessment. A bank or other financial organization must be able to estimate the risk involved before granting a loan to a customer. Data science and predictive analytics play an important role in building models that predict the probability of loan default. In this project, we are provided with a data set, loan_timing.csv, containing 50,000 data points. Each data point represents a loan, and two features are provided:

a) The column with header “days since origination” indicates the number of days that elapsed between origination and the date when the data was collected.

b) For loans that charged off before the data was collected, the column with header “days from origination to charge-off” indicates the number of days that elapsed between origination and charge-off. For all other loans, this column is blank.
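To make the two-column layout concrete, here is a minimal Python sketch (not the repository's loan_timing.R) that reads rows in this shape and treats a blank charge-off entry as "not charged off yet". The header strings are taken from the description above and may differ in spelling from the actual file; the three sample rows are invented for illustration, and the helper name load_loans is hypothetical.

```python
import csv
import io

# Header names as described above; the exact spelling in the real file is an assumption.
SINCE = "days since origination"
CHARGEOFF = "days from origination to charge-off"


def load_loans(handle):
    """Return a list of (days_since_origination, chargeoff_day_or_None) tuples."""
    loans = []
    for row in csv.DictReader(handle):
        raw = (row[CHARGEOFF] or "").strip()
        # A blank entry means the loan had not charged off when the data was collected.
        chargeoff_day = int(raw) if raw else None
        loans.append((int(row[SINCE]), chargeoff_day))
    return loans


# Invented sample rows, only to illustrate the layout (not real data).
sample = io.StringIO(
    "days since origination,days from origination to charge-off\n"
    "109,\n"
    "679,\n"
    "723,327\n"
)
loans = load_loans(sample)
n_charged_off = sum(1 for _, chargeoff in loans if chargeoff is not None)
print(n_charged_off, "of", len(loans), "sample loans charged off")
```

With the real file, the same function could be called as load_loans(open("loan_timing.csv", newline="")), assuming the headers match and the values are whole numbers of days.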

Project Objective: The goal of this project is to estimate, from these partially observed loan histories, what fraction of the loans will have charged off by the time all of their 3-year terms are finished.
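One way to approach this objective is sketched below in Python. It is an illustration under strong assumptions and not the method used in loan_timing.R or the report: it assumes a constant daily charge-off hazard (exponential time to charge-off), a 3-year term of 3 * 365 days, and a handful of invented loans. The function names are hypothetical. The hazard is estimated as observed charge-offs divided by total days at risk, and each still-active loan is then simulated forward to the end of its term.

```python
import random

TERM_DAYS = 3 * 365  # 3-year term, assuming 365-day years


def estimate_hazard(loans):
    """Constant daily hazard estimate: charge-offs / total days at risk."""
    events = sum(1 for _, chargeoff in loans if chargeoff is not None)
    exposure = sum(chargeoff if chargeoff is not None else age
                   for age, chargeoff in loans)
    return events / exposure


def simulate_final_fraction(loans, n_runs=200, seed=0):
    """Average, over n_runs simulations, the fraction charged off by end of term."""
    rng = random.Random(seed)
    rate = estimate_hazard(loans)
    fractions = []
    for _ in range(n_runs):
        charged_off = 0
        for age, chargeoff in loans:
            if chargeoff is not None:
                charged_off += 1  # already charged off in the data
            elif rate > 0 and age + rng.expovariate(rate) <= TERM_DAYS:
                # Memoryless assumption: remaining time to charge-off is
                # exponential regardless of the loan's current age.
                charged_off += 1
        fractions.append(charged_off / len(loans))
    return sum(fractions) / len(fractions)


# Invented (days since origination, charge-off day or None) pairs, not real data.
example_loans = [(100, None), (400, 50), (700, None), (900, 300), (1000, None)]
print(round(simulate_final_fraction(example_loans), 3))
```

The estimate can never fall below the fraction already charged off in the data, since simulated charge-offs are only added on top of observed ones. A constant hazard is the simplest possible choice; if charge-off risk varies with loan age, an age-dependent hazard would be needed instead.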

loan_timing.csv: the dataset

loan_timing.R: the R code

loan_timing_report.pdf: project report and summary