Data-Mining-Case-Study-Project

In this project, for supervised learning, I used regression and decision tree techniques to build predictive models and assessed their accuracy using mean squared error (MSE) and misclassification cost. For unsupervised learning, I performed cluster analysis on the Iris dataset to identify subgroups and used association rules to analyze transaction details in the Groceries dataset.

Primary language: R

[Note: you can preview files that are in R and PDF format by clicking on the file]

Software I used:

RStudio

Description:

Supervised learning: I wrote the code to build linear regression, logistic regression, and decision tree models (both regression and classification trees) on the Boston Housing and Credit Card datasets, and tested model accuracy by evaluating MSE (for the numerical response variable) and misclassification cost (for the binary response variable). I then concluded the analysis with several written reports, prepared with my partner's help by delegating some of the tasks. All reports include the output of my code along with my description of each output.
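A minimal sketch of this kind of workflow is shown here; it is not the exact code in the repository. It assumes the `Boston` data shipped with the `MASS` package, a 75/25 train/test split, and a fixed seed. The helper names `mse` and `misclass_cost`, the cost weights (5 for a false negative, 1 for a false positive), and the 0.5 cutoff are hypothetical choices for illustration. Because the Credit Card dataset is not bundled with R, the binary response below is a stand-in derived from `medv`.

```r
# Sketch only: split ratio, seed, cutoff, and cost weights are assumptions.
library(MASS)   # provides the Boston housing data
library(rpart)  # recursive partitioning (regression and classification trees)

set.seed(1)
train_idx <- sample(nrow(Boston), size = floor(0.75 * nrow(Boston)))
train <- Boston[train_idx, ]
test  <- Boston[-train_idx, ]

# Linear regression and a regression tree for the numeric response medv
lm_fit   <- lm(medv ~ ., data = train)
tree_fit <- rpart(medv ~ ., data = train, method = "anova")

# Out-of-sample mean squared error
mse <- function(actual, predicted) mean((actual - predicted)^2)
lm_mse   <- mse(test$medv, predict(lm_fit, newdata = test))
tree_mse <- mse(test$medv, predict(tree_fit, newdata = test))

# Asymmetric misclassification cost for a 0/1 response:
# a false negative is weighted w_fn, a false positive w_fp (hypothetical weights)
misclass_cost <- function(actual, pred_prob, cutoff = 0.5, w_fn = 5, w_fp = 1) {
  pred <- as.integer(pred_prob > cutoff)
  fn <- actual == 1 & pred == 0
  fp <- actual == 0 & pred == 1
  mean(w_fn * fn + w_fp * fp)
}

# Stand-in binary response (NOT the Credit Card data): is medv above 25?
train$high <- as.integer(train$medv > 25)
test$high  <- as.integer(test$medv > 25)

# Logistic regression on two predictors, scored by misclassification cost
glm_fit <- glm(high ~ lstat + rm, data = train, family = binomial)
cost <- misclass_cost(test$high,
                      predict(glm_fit, newdata = test, type = "response"))

print(c(lm_mse = lm_mse, tree_mse = tree_mse, cost = cost))
```

Weighting false negatives more heavily than false positives is one common way to define misclassification cost; the right weights depend on the application.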

Unsupervised learning: I wrote the code to perform cluster analysis (K-means and hierarchical) on the Iris dataset to better understand which observations share similar or dissimilar characteristics and to find unknown subgroups. I also wrote code to explore the Groceries dataset and analyze transaction details using association rules, and I concluded the analysis with a written report prepared with my partner's help. The report includes the output of my code and a description of each output.
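The clustering and association-rule steps described above can be sketched roughly as follows; this is an illustration under stated assumptions, not the code from the reports. The choice of three clusters, the seed, complete linkage, and the support and confidence thresholds are all assumptions. The association-rule part needs the `arules` package, which is not part of base R, so it is skipped if the package is not installed.

```r
# Sketch only: k = 3, the seed, linkage method, and rule thresholds are assumptions.
features <- scale(iris[, 1:4])  # standardize the four numeric measurements

# K-means with several random starts to avoid a poor local optimum
set.seed(1)
km <- kmeans(features, centers = 3, nstart = 25)

# Hierarchical clustering on Euclidean distances, cut into three groups
hc <- hclust(dist(features), method = "complete")
hc_groups <- cutree(hc, k = 3)

# Cross-tabulate cluster assignments against the known species
km_table <- table(cluster = km$cluster, species = iris$Species)
hc_table <- table(cluster = hc_groups, species = iris$Species)
print(km_table)
print(hc_table)

# Association rules on the Groceries transactions (requires arules)
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  data("Groceries")
  rules <- apriori(Groceries,
                   parameter = list(support = 0.001, confidence = 0.5),
                   control = list(verbose = FALSE))
  # Show the five rules with the highest lift
  inspect(head(sort(rules, by = "lift"), 5))
}
```

Comparing the cluster tables with the species labels is only a sanity check; in a truly unsupervised setting no such labels would be available.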