The second dataset selected for the project contains details of bank customers: about 2 520 000 entries across 13 columns, with common data types such as integer and string.
The columns describe each customer's income, age, marital status, house ownership, car ownership, city, and state.
Dataset - https://www.kaggle.com/subhamjain/loan-prediction-based-on-customer-behavior
An organization needs to know its customers well, and this is crucial when granting loans. It is hard to judge up front whether lending to a given customer would be risky, but the task becomes easier when data about past customers is analyzed. The dataset above records historic customer behavior. Our goal is to use classification techniques on that historic data to predict which customers are risky and which are not.
This classification project assumes that the data in the dataset are accurate and suitable for prediction, and that the given columns are related to one another. First, the dataset was preprocessed to make it usable. Then data mining techniques, specifically classification, were used to build the model. The model was developed in Python, mainly in Jupyter Notebook. The user interface was implemented mainly in HTML and CSS.
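The preprocess-then-classify flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the rows are made up, the feature subset is a guess based on the column description, and k-nearest neighbours is swapped in because the text does not name the classifier that was actually used. It relies on the Python standard library only; in a notebook a library such as scikit-learn would more typically handle these steps.

```python
import math
from collections import Counter

# Made-up rows for illustration only:
# (income, age, marital status, owns house, owns car, risk flag)
ROWS = [
    (900_000, 52, "married", "yes", "yes", 0),
    (850_000, 47, "married", "yes", "no", 0),
    (780_000, 39, "single", "yes", "yes", 0),
    (120_000, 23, "single", "no", "no", 1),
    (150_000, 26, "single", "no", "no", 1),
    (200_000, 31, "married", "no", "no", 1),
]


def encode(row):
    """Preprocessing step 1: turn string categories into 0/1 numbers."""
    income, age, married, house, car = row
    return [
        float(income),
        float(age),
        1.0 if married == "married" else 0.0,
        1.0 if house == "yes" else 0.0,
        1.0 if car == "yes" else 0.0,
    ]


def fit_scaler(vectors):
    """Preprocessing step 2: learn each column's min and max."""
    columns = list(zip(*vectors))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale(vector, mins, maxs):
    """Min-max scale a vector; constant columns map to 0."""
    return [
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(vector, mins, maxs)
    ]


def predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean)."""
    nearest = sorted(
        zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Fit the preprocessing on the training rows, then reuse it for new rows.
features = [encode(r[:5]) for r in ROWS]
labels = [r[5] for r in ROWS]
mins, maxs = fit_scaler(features)
train_x = [scale(f, mins, maxs) for f in features]


def classify(raw_row, k=3):
    """Hypothetical helper: preprocess one raw row and predict its risk flag."""
    return predict(train_x, labels, scale(encode(raw_row), mins, maxs), k)
```

Calling `classify((880_000, 50, "married", "yes", "yes"))` runs a new customer through the same encoding and scaling before the vote. Reusing the scaler fitted on the training rows, rather than refitting it on the new row, keeps the features comparable.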
The dataset selected for customer segmentation contains about 8068 records in 12 columns, with common data types such as integer, string, and float.
It describes customers who bought vehicles in past years, with attributes such as spending score, age, profession, marital status, and gender.
Dataset - https://www.kaggle.com/vetrirah/customer
Every customer is different, and an organization's marketing is more effective when it targets specific, smaller groups with messages those consumers find relevant and that lead them to buy. It is therefore important to understand customers' preferences and needs and to discover what each segment values most, so that marketing materials can be tailored to that segment. We use clustering techniques to derive those segments (Luxury, Mid-range, Family, and Budget vehicles).
This clustering project assumes that the data in the dataset are accurate and suitable for analysis, and that the given columns are related to one another. First, the dataset was preprocessed to make it usable. Then data mining techniques, specifically clustering, were used to build the model. The model was developed in Python, mainly in Jupyter Notebook. The user interface was implemented mainly in HTML and CSS.
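As a rough illustration of the preprocess-then-cluster flow, the sketch below runs a plain k-means on two features. Several things here are assumptions rather than the project's actual method: the text does not say which clustering algorithm was used, the customers are made up, the Low/Average/High spending labels and their numeric encoding are illustrative, and the seeds are taken at evenly spaced positions purely to make the result reproducible.

```python
import math

# Assumed encoding of a categorical spending score (illustrative only).
SPENDING = {"Low": 0.0, "Average": 0.5, "High": 1.0}

# Made-up customers: (age, spending score)
CUSTOMERS = [
    (22, "Low"), (25, "Low"),
    (24, "High"), (27, "High"),
    (60, "Low"), (63, "Low"),
    (58, "High"), (65, "High"),
]


def preprocess(customers):
    """Min-max scale age to [0, 1] and encode the spending score."""
    ages = [age for age, _ in customers]
    lo, hi = min(ages), max(ages)
    span = (hi - lo) or 1
    return [((age - lo) / span, SPENDING[score]) for age, score in customers]


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def kmeans(points, k, max_iter=100):
    """Plain k-means with deterministic, evenly spaced seeds."""
    step = max(1, len(points) // k)
    centroids = [points[(i * step) % len(points)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids


points = preprocess(CUSTOMERS)
labels, centroids = kmeans(points, k=4)
```

The cluster numbers carry no meaning by themselves. Mapping them to segment names such as Luxury or Budget is a separate interpretation step that depends on inspecting each cluster's members.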
This project was done for Fundamentals of Data Mining (IT3051), part of the BSc (Hons) Degree in Information Technology at the Sri Lanka Institute of Information Technology.