Wine_Quality_Prediction 🍷

The aim of this project is to compare different machine learning algorithms on a popular wine quality dataset from Kaggle.

Libraries Used

Numpy

Importing Numpy Library
```
import numpy as np
```
About Numpy

Numpy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
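
As a minimal sketch of the array operations described above, the snippet below standardizes two columns without explicit loops. The values and the column meaning (alcohol, pH) are made up for illustration and are not taken from the project's dataset.

```python
import numpy as np

# Hypothetical measurements for three wines: columns are alcohol (%) and pH.
samples = np.array([[9.4, 3.51],
                    [9.8, 3.20],
                    [10.5, 3.26]])

# Column-wise mean and standard deviation, computed on the whole array at once.
col_mean = samples.mean(axis=0)
col_std = samples.std(axis=0)

# Broadcasting subtracts and divides every row by the per-column statistics.
standardized = (samples - col_mean) / col_std
print(standardized)
```
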
Pandas

Importing Pandas Library
```
import pandas as pd
```
About Pandas

Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python.
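
A small sketch of working with labeled data follows. The rows and the column names `alcohol` and `quality` are assumptions chosen for illustration; the real dataset would normally be loaded with `pd.read_csv`.

```python
import pandas as pd

# Hypothetical rows standing in for the wine dataset.
wine = pd.DataFrame({
    "alcohol": [9.4, 9.8, 10.5, 11.2],
    "quality": [5, 5, 6, 7],
})

# Group rows by the quality label and average the alcohol content per group.
mean_alcohol = wine.groupby("quality")["alcohol"].mean()
print(mean_alcohol)
```
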
Seaborn

Importing Seaborn
```
import seaborn as sns
```
About Seaborn

Seaborn is a library for making statistical graphics in Python. It builds on top of matplotlib and integrates closely with pandas data structures. It helps you explore and understand your data. Its plotting functions operate on dataframes and arrays containing whole datasets and internally perform the necessary semantic mapping and statistical aggregation to produce informative plots.
Matplotlib

Importing Matplotlib
```
import matplotlib.pyplot as plt
```
About Matplotlib

Matplotlib is an easy-to-use visualization library for Python. It is built on NumPy arrays, is designed to work with the broader SciPy stack, and supports many plot types such as line, bar, scatter and histogram.
Sklearn

Importing Sklearn
```
import sklearn
```
About Sklearn

Scikit-learn (sklearn) is a widely used and robust library for machine learning in Python. It provides a selection of efficient tools for machine learning and statistical modeling, including classification, regression, clustering and dimensionality reduction, via a consistent interface. This library, which is largely written in Python, is built upon NumPy, SciPy and Matplotlib.

Algorithms Used

Logistic Regression

Importing Logistic Regression Classifier
```
from sklearn.linear_model import LogisticRegression
```
About
Logistic Regression is an easily interpretable classification technique that gives the probability of an event occurring, not just the predicted classification. It also provides a measure of the significance of the effect of each individual input variable, together with a measure of certainty of the variable's effect.
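
The probability output mentioned above can be sketched as follows. The single feature and the binary labels are synthetic assumptions (1 standing for a "good" wine), not the project's data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic single feature (think: alcohol %); label 1 for the larger values.
X = np.array([[8.5], [9.0], [9.5], [10.0], [11.0], [11.5], [12.0], [12.5]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression().fit(X, y)

# predict_proba returns one row per sample and one column per class.
proba = clf.predict_proba(np.array([[9.0], [12.0]]))
print(proba)
```

Each row of `proba` sums to 1, and the second column is the estimated probability of class 1.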
Decision Tree Classifier

Importing Decision Tree Classifier
```
from sklearn.tree import DecisionTreeClassifier
```
About
A decision tree is a non-parametric supervised learning algorithm that is used for both classification and regression tasks. It has a hierarchical tree structure, which consists of a root node, branches, internal nodes and leaf nodes.
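
To make the root/leaf structure concrete, here is a minimal sketch on a tiny synthetic dataset. The feature name `alcohol` and the values are assumptions for illustration only.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic, perfectly separable data: one feature, two classes.
X = [[8.5], [9.0], [9.5], [11.0], [11.5], [12.0]]
y = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# export_text prints the learned rules: one split at the root, two leaves.
rules = export_text(tree, feature_names=["alcohol"])
print(rules)

predictions = tree.predict([[9.2], [11.8]])
```
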
Random Forest Classifier

Importing Random Forest Classifier
```
from sklearn.ensemble import RandomForestClassifier
```
About
Random forests, or random decision forests, are an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of the random forest is the class selected by most trees. For regression tasks, the mean prediction of the individual trees is returned. Random decision forests correct for decision trees' habit of overfitting to their training set.
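
The way individual trees are combined can be sketched as below. Note that scikit-learn's `RandomForestClassifier` averages the per-tree class probabilities rather than counting hard votes; with fully grown trees the two usually agree. The data is synthetic (`make_classification`), used here only as a stand-in.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data, not the wine dataset.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)

# Collect each tree's class probabilities and average them by hand.
per_tree = np.stack([t.predict_proba(X) for t in forest.estimators_])
averaged = per_tree.mean(axis=0)

# The class with the highest averaged probability is the forest's prediction.
manual_pred = forest.classes_[averaged.argmax(axis=1)]
```
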
Support Vector Machine

Importing Support Vector Machine Classifier
```
from sklearn import svm
```
About
Support Vector Machine (SVM) is one of the most popular supervised learning algorithms and is used for both classification and regression problems, although it is primarily used for classification. The goal of the SVM algorithm is to find the best decision boundary that separates n-dimensional space into classes, so that new data points can easily be placed in the correct category. This best decision boundary is called a hyperplane. SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, which is how the algorithm gets its name.
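
A minimal sketch of the hyperplane and support vectors, assuming a linear kernel and a toy two-feature dataset invented for illustration:

```python
import numpy as np
from sklearn import svm

# Two well-separated synthetic clusters in a 2-D feature space.
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
              [4.0, 4.0], [4.5, 5.0], [5.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = svm.SVC(kernel="linear").fit(X, y)

# The training points that define the margin around the hyperplane.
support = clf.support_vectors_
print(support)

# For a linear kernel, coef_ and intercept_ describe the hyperplane w.x + b = 0.
print(clf.coef_, clf.intercept_)

predictions = clf.predict([[1.2, 1.5], [4.8, 4.2]])
```
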
KNeighbors Classifier

Importing KNeighbors Classifier
```
from sklearn.neighbors import KNeighborsClassifier
```
About
The k-nearest neighbors algorithm, also known as KNN or k-NN, is a non-parametric, supervised learning classifier that uses proximity to make classifications or predictions about the grouping of an individual data point. While it can be used for either regression or classification problems, it is typically used as a classification algorithm, working off the assumption that similar points can be found near one another.
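
The proximity idea can be sketched on a toy one-dimensional dataset. The values and the choice of `n_neighbors=3` are assumptions for illustration.

```python
from sklearn.neighbors import KNeighborsClassifier

# Two synthetic groups of points on a line.
X = [[1.0], [1.2], [1.4], [3.0], [3.2], [3.4]]
y = [0, 0, 0, 1, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# kneighbors returns the distances to, and indices of, the 3 closest training points.
dist, idx = knn.kneighbors([[1.3]])

# The predicted class is the majority label among those neighbors.
pred = knn.predict([[1.3]])
print(idx, pred)
```
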
Gradient Boosting Classifier

Importing Gradient Boosting Classifier
```
from sklearn.ensemble import GradientBoostingClassifier
```
About
Gradient boosting is one of the most powerful algorithms in machine learning. Errors in machine learning models are broadly classified into two categories: bias error and variance error. As a boosting algorithm, gradient boosting is used to minimize the bias error of the model.
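
One way to see the error being reduced stage by stage is to inspect the training loss after each boosting iteration. This is a sketch on synthetic data; the hyperparameters below are illustrative assumptions, not the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in data, not the wine dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

gb = GradientBoostingClassifier(
    n_estimators=50, learning_rate=0.1, max_depth=2, random_state=0
).fit(X, y)

# train_score_ holds the training loss after each of the 50 boosting stages.
losses = gb.train_score_
print(losses[0], losses[-1])
```

Each stage fits a small tree to the remaining errors of the previous stages, so the training loss at the last stage is lower than at the first.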

Dataset Analysis

Quality_Count Analysis
Alcohol v/s Quality Plot
- Using a barplot to visualize how the quality of wine changes with the amount of alcohol present in it
HeatMap Analysis of Features
- Determining the correlation of different features with each other
Features Pairplot Analysis
- A pairplot visualizes all features against each other at the same time
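
The numbers behind these plots can be sketched with pandas alone. The rows and the column names `alcohol`, `pH` and `quality` are assumptions for illustration; the plotting calls appear only as comments.

```python
import pandas as pd

# Hypothetical rows standing in for the wine dataset.
wine = pd.DataFrame({
    "alcohol": [9.0, 9.5, 10.0, 10.5, 11.0, 12.0],
    "pH": [3.5, 3.3, 3.4, 3.2, 3.3, 3.1],
    "quality": [5, 5, 5, 6, 6, 7],
})

# Quality count: how many wines fall into each quality score.
quality_counts = wine["quality"].value_counts().sort_index()

# Pairwise Pearson correlation between all numeric features.
corr = wine.corr()
print(quality_counts)
print(corr)

# With seaborn imported as sns, the plots could then be drawn with, for example:
#   sns.barplot(x="quality", y="alcohol", data=wine)
#   sns.heatmap(corr, annot=True)
#   sns.pairplot(wine)
```
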

Model Analysis

Plotting the accuracy of different models used
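
A hedged sketch of how such a comparison might be computed is shown below. It uses synthetic data and default hyperparameters (apart from `max_iter` and `random_state`), so the resulting accuracies say nothing about the wine dataset itself.

```python
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data, split into training and held-out test sets.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVM": svm.SVC(),
    "KNeighbors": KNeighborsClassifier(),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# Fit every model on the training split and score it on the test split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

print(scores)
```

With matplotlib imported as `plt`, the scores could be drawn as a bar chart with `plt.bar(list(scores), list(scores.values()))`.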

Enjoy your wine