
A repository containing an implementation of principal component analysis (PCA), applied to a well-known business problem: the Wine dataset.

Primary Language: Python

Principal-Component-Analysis

Business Problem

The business problem behind this well-known dataset is that it has many independent variables and a single dependent variable. Assume the business owner gathered the independent variables and applied a clustering technique, which identified three groups of customers. Each segment has a preference for a specific wine.

This repository builds a logistic regression classification model that uses PCA as a dimensionality reduction technique. For m independent variables in the dataset, PCA extracts p < m new independent variables that explain most of the variance in the dataset, regardless of the dependent variable.
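To make the idea of extracting p < m new variables concrete, here is a minimal sketch of PCA written with NumPy. It is not the repository's code: the helper name `pca_fit_transform` and the synthetic data are assumptions made for illustration. Note that no dependent variable appears anywhere in the computation.

```python
import numpy as np


def pca_fit_transform(X, p):
    """Project X (n_samples x m) onto its top-p principal components.

    Hypothetical helper for illustration; not part of the repository.
    """
    X = np.asarray(X, dtype=float)
    # Center each column: PCA works on deviations from the mean.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data. Rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Fraction of the total variance captured by each component.
    explained_variance_ratio = S**2 / np.sum(S**2)
    # The p new variables are the projections onto the first p directions.
    Z = X_centered @ Vt[:p].T
    return Z, explained_variance_ratio[:p]


# Synthetic data (an assumption): m = 5 observed variables that are noisy
# mixtures of 2 underlying factors, so 2 components should carry most variance.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

Z, ratio = pca_fit_transform(X, p=2)
print(Z.shape)      # (200, 2): five variables reduced to two
print(ratio.sum())  # share of total variance kept by the two components
```

The components come out ordered by explained variance, so keeping the first p of them keeps the p directions along which the data varies most.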

This makes PCA an unsupervised technique, since the dependent variable is not considered. In this model we reduce the independent variables to 2 features and visualize the results.
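The workflow described here can be sketched with scikit-learn roughly as follows. This is an illustration under stated assumptions, not the repository's actual script: it uses scikit-learn's bundled `load_wine` data as a stand-in for the repository's dataset file (which may differ), and the feature scaling step, the 80/20 split, and the `random_state` values are choices made for the example.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): 13 independent variables, 3 classes.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Scale, reduce to 2 principal components, then classify.
# PCA is fit inside the pipeline but ignores y: it is unsupervised.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    LogisticRegression(),
)
model.fit(X_train, y_train)

# The 2-D representation of the test set (all steps except the classifier).
X_test_2d = model[:-1].transform(X_test)
accuracy = model.score(X_test, y_test)

print(X_test_2d.shape)
print(accuracy)
```

The two columns of `X_test_2d` can be used as the x and y axes of a scatter plot, colored by class, to visualize how well the segments separate after the reduction.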