Predictive analytics is a branch of advanced analytics that uses techniques such as data mining, predictive modelling, statistics, machine learning and artificial intelligence to analyse current data and predict future outcomes.
Loan defaults cause huge losses for banks, so banks pay close attention to this issue and apply various methods to detect and predict the default behaviour of their customers.
The loan default dataset has 8 variables and 850 records, each record giving the loan default status of one customer. Each applicant was rated as “Defaulted” or “Not-Defaulted”. New loan applicants can also be evaluated on these 8 predictor variables and classified as default or non-default.
# What is Classification?
In machine learning and statistics, classification is a supervised learning approach in which a program learns from labelled input data and then uses what it has learned to classify new observations.
A classification problem may be binary (for example, identifying whether a person is male or female, or whether an email is spam or not spam) or multi-class. Examples of classification problems include speech recognition, handwriting recognition, biometric identification and document classification.
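To make the idea of "learning from labelled data, then classifying a new observation" concrete, here is a minimal sketch of a binary classifier. It uses a nearest-centroid rule, chosen only because it is short; it is not the method used on the loan dataset. The two feature names (`debt_to_income`, `years_employed`) and all values are made up for illustration.

```python
from statistics import mean

def fit_centroids(rows, labels):
    """Learn one centroid (per-feature mean) for each class label."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids

def predict(centroids, row):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))

# Made-up training data: [debt_to_income, years_employed]
train_rows = [
    [0.9, 1.0], [0.8, 2.0], [0.7, 0.5],    # labelled "Defaulted"
    [0.2, 8.0], [0.1, 10.0], [0.3, 6.0],   # labelled "Not-Defaulted"
]
train_labels = ["Defaulted"] * 3 + ["Not-Defaulted"] * 3

model = fit_centroids(train_rows, train_labels)   # learning step
new_label = predict(model, [0.85, 1.5])           # classify a new observation
print(new_label)
```

The same two-step shape (fit on labelled records, then predict for an unseen record) applies to any supervised classifier, whatever algorithm sits inside `fit` and `predict`.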
# Data
Our task is to build a classification model that predicts whether a new loan applicant should be classified as default or non-default, based on the 8 predictor variables.
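As a rough sketch of what such a model could look like, the example below trains a logistic regression classifier on a dataset of the same shape (850 records, 8 predictors). Several things here are assumptions rather than facts from the article: logistic regression is just one reasonable choice of classifier, the data is synthetic and generated from hypothetical weights (`true_w`), the label encoding (1 = default, 0 = non-default) is arbitrary, and the 80/20 train/test split is a common convention, not a requirement.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N_RECORDS, N_FEATURES = 850, 8

# Synthetic stand-in for the real data: hypothetical weights generate the labels.
true_w = [1.5, -1.0, 0.8, 0.0, 0.5, -0.7, 0.3, 1.2]
X = [[random.gauss(0, 1) for _ in range(N_FEATURES)] for _ in range(N_RECORDS)]
y = [1 if sum(w * v for w, v in zip(true_w, row)) > 0 else 0 for row in X]

# Hold out 20% of the records to estimate performance on unseen applicants.
split = int(0.8 * N_RECORDS)
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, row):
    """Estimated probability that the applicant in `row` defaults."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, row)))

def fit_logistic(rows, labels, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for row, label in zip(rows, labels):
            err = predict_proba(weights, bias, row) - label
            grad_b += err
            for j, v in enumerate(row):
                grad_w[j] += err * v
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias

weights, bias = fit_logistic(X_train, y_train)

# Classify held-out applicants: 1 = default, 0 = non-default (threshold 0.5).
predictions = [1 if predict_proba(weights, bias, row) >= 0.5 else 0 for row in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"Held-out accuracy on synthetic data: {accuracy:.2f}")
```

Because the synthetic labels follow a clean linear rule, the accuracy printed here says nothing about how well a model would do on the real loan data; the point is only the workflow of splitting, fitting and scoring new applicants.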
In loan risk prediction, a loan financing company is interested in metrics such as how long customers with certain attributes take to repay their loans and how likely they are to default.
Generally, the company faces a higher risk of default from customers who have a bad credit rating or poor spending habits, so it is keen to find out whether a given customer will default. To that end, the past observations gathered by the company are used to group customers into categories such as “Defaulter” or “Non-defaulter”.