FDM-Mini-Project

About the project

A. Loan Prediction Based on Customer Behavior

a. Introduction to the dataset

The second dataset selected for the project contains details of a bank's customers. It has about 2 520 000 records across 13 columns of common data types such as integer and string. The columns hold information about each customer's income, age, marital status, house ownership, car ownership, city, and state.
Dataset - https://www.kaggle.com/subhamjain/loan-prediction-based-on-customer-behavior

b. Problem Identification

An organization needs to know its customers well, and this is crucial when granting loan products. The difficulty is that it is not easy to decide whether lending to a given customer would be risky. Analyzing data about past customers makes this task less complicated. The dataset described above contains data about historic customer behavior. Our goal is to apply classification techniques to that historic data to predict which customers are risky and which are not.

c. Introduction to the project

This classification project assumes that the data in the dataset are accurate and suitable for prediction, and that the given columns are related to one another. First, the dataset was made usable by preprocessing. After that, classification techniques were used to build the model. The model was developed in Python, mainly in Jupyter Notebook. The user interface is implemented mainly with HTML and CSS.
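The preprocess-then-classify flow described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the README does not say which classifier the project uses, so a k-nearest-neighbours vote is swapped in here for brevity. The field names (`income`, `age`, `owns_house`, `owns_car`), the toy records, and the helper functions are all hypothetical and are not taken from the Kaggle dataset or the project notebook.

```python
import math
from collections import Counter

# Toy records invented for illustration; NOT rows from the Kaggle dataset.
# Field names are assumptions loosely modelled on the columns listed above.
# Label 1 = risky customer, label 0 = not risky.
TRAIN = [
    ({"income": 30_000, "age": 23, "owns_house": "no", "owns_car": "no"}, 1),
    ({"income": 35_000, "age": 25, "owns_house": "no", "owns_car": "no"}, 1),
    ({"income": 40_000, "age": 30, "owns_house": "no", "owns_car": "yes"}, 1),
    ({"income": 90_000, "age": 45, "owns_house": "yes", "owns_car": "yes"}, 0),
    ({"income": 85_000, "age": 50, "owns_house": "yes", "owns_car": "no"}, 0),
    ({"income": 95_000, "age": 41, "owns_house": "yes", "owns_car": "yes"}, 0),
]


def encode(record):
    """Preprocessing: turn yes/no strings into 0/1 and build a numeric vector."""
    return [
        float(record["income"]),
        float(record["age"]),
        1.0 if record["owns_house"] == "yes" else 0.0,
        1.0 if record["owns_car"] == "yes" else 0.0,
    ]


def fit_scaler(vectors):
    """Record each feature's (min, max) so features can be scaled to [0, 1]."""
    return [(min(col), max(col)) for col in zip(*vectors)]


def scale(vector, bounds):
    """Min-max scale one vector; constant features map to 0.0."""
    return [
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, (lo, hi) in zip(vector, bounds)
    ]


def predict(record, train_vectors, train_labels, bounds, k=3):
    """Classify a record by majority vote among its k nearest training rows."""
    query = scale(encode(record), bounds)
    neighbours = sorted(
        (math.dist(query, vec), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


raw_vectors = [encode(record) for record, _ in TRAIN]
BOUNDS = fit_scaler(raw_vectors)
TRAIN_VECTORS = [scale(vec, BOUNDS) for vec in raw_vectors]
TRAIN_LABELS = [label for _, label in TRAIN]

new_customer = {"income": 32_000, "age": 24, "owns_house": "no", "owns_car": "no"}
print(predict(new_customer, TRAIN_VECTORS, TRAIN_LABELS, BOUNDS))
```

Scaling matters in this sketch because income would otherwise dominate the distance calculation. In a real notebook the same steps would more likely be done with a data-frame library and a ready-made classifier.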


B. Customer Segmentation to Introduce Different Automobile Categories

a. Introduction to the dataset

The dataset selected for customer segmentation contains about 8068 records across 12 columns of common data types such as integer, string, and float. It holds information about customers who bought vehicles in past years, such as spending score, age, profession, marital status, and gender.
Dataset - https://www.kaggle.com/vetrirah/customer

b. Problem Identification

Every customer is different, and an organization's marketing efforts are better served when they target specific, smaller groups with messages that those consumers find relevant and that lead them to buy. It is therefore important to gain a deeper understanding of customers' preferences and needs, and to discover what each segment finds most valuable, so that marketing materials can be tailored more accurately to that segment. We use clustering techniques to derive those segments (Luxury, Mid-range, Family, and Budget vehicles).

c. Introduction to the project

This clustering project assumes that the data in the dataset are accurate and suitable for clustering, and that the given columns are related to one another. First, the dataset was made usable by preprocessing. After that, clustering techniques were used to build the model. The model was developed in Python, mainly in Jupyter Notebook. The user interface is implemented mainly with HTML and CSS.
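The clustering step described above can be sketched with a small k-means routine in plain Python. This is only an illustration under stated assumptions: the README does not name the clustering algorithm the project uses, so k-means is chosen here as a common example. The toy points stand for hypothetical, already scaled (age, spending score) pairs and are not rows from the Kaggle dataset. The sketch uses two clusters to stay short, whereas the project describes four segments.

```python
import math


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: alternate assignment and centroid-update steps."""
    centroids = [list(c) for c in initial_centroids]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Hypothetical, pre-scaled (age, spending score) pairs invented for illustration.
POINTS = [
    (0.10, 0.10),
    (0.15, 0.20),
    (0.80, 0.90),
    (0.90, 0.85),
    (0.20, 0.15),
    (0.85, 0.95),
]

# Fixed starting centroids keep the example deterministic.
ASSIGNMENTS, CENTROIDS = kmeans(POINTS, [POINTS[0], POINTS[2]])
print(ASSIGNMENTS)
```

The cluster ids returned here are arbitrary integers. Naming a cluster as Luxury, Mid-range, Family, or Budget would be a separate interpretive step, done by inspecting the typical customer in each cluster.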

Tools and Technologies used

  • Bootstrap

  • Python 3

  • Heroku

Acknowledgment

This project was done for Fundamentals of Data Mining (IT3051), part of the BSc (Hons) degree in Information Technology at the Sri Lanka Institute of Information Technology.