wine_variety_predictor: A Jupyter Notebook repository from Ayush-Parhi

This wine variety predictor predicts the variety of a wine from the review a customer wrote about it. The main aim of this repository is to practice NLP techniques and their implementation.

Summary of techniques used in the notebook:

Data visualization
Data Analysis
Data preprocessing
- Feature Extraction
- Feature Engineering
  - Handling missing values
  - Feature selection

Tokenization of the reviews
Encoding the reviews
Model architecture
- Train/Dev split
- RNN model building
- Hyperparameter tuning
Training the model
Model performance Diagnosis
Prediction of the variety of wine
- Decoding the reviews
- Predictions
Predicting the variety for test dataset
Saving a new CSV file as variety_predictions
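
The tokenization, encoding, and decoding steps listed above can be sketched in plain Python. This is only an illustration and not the notebook's actual code: the helper names (`tokenize`, `build_vocab`, `encode`, `decode`), the reserved ids, the `max_len` value, and the two sample reviews are all assumptions made for the example. The notebook itself may well use a library tokenizer instead.

```python
import re
from collections import Counter

PAD, UNK = 0, 1  # assumed reserved ids: padding and out-of-vocabulary words


def tokenize(text):
    """Lowercase a review and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(reviews, max_words=10000):
    """Map the most frequent words to integer ids, starting at 2."""
    counts = Counter(tok for review in reviews for tok in tokenize(review))
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(review, vocab, max_len=8):
    """Turn a review into a fixed-length id list (truncate, then right-pad)."""
    ids = [vocab.get(tok, UNK) for tok in tokenize(review)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


def decode(ids, vocab):
    """Map ids back to words, dropping padding."""
    inverse = {i: word for word, i in vocab.items()}
    return " ".join(inverse.get(i, "<unk>") for i in ids if i != PAD)


# Made-up reviews, used only to exercise the helpers.
reviews = [
    "Ripe cherry and soft tannins.",
    "Crisp citrus, bright acidity and citrus zest.",
]
vocab = build_vocab(reviews)
encoded = [encode(r, vocab) for r in reviews]
print(encoded[0])
print(decode(encoded[0], vocab))
```

Fixed-length sequences like these are the usual input to an RNN, which is why every review is truncated or padded to the same `max_len`.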

The Data Description is as follows:

user_name - The username of the reviewer.
country - The country that the wine is from.
review_title - The title of the wine review, which often contains the vintage.
review_description - A verbose review of the wine.
designation - The vineyard within the winery where the grapes that made the wine are from.
points - The rating given by the user, between 0 and 100.
price - The cost for a bottle of the wine.
province - The province or state that the wine is from.
region_1 - The wine-growing area in a province or state (e.g. Napa).
region_2 - A more specific region within a wine-growing area (e.g. Rutherford inside the Napa Valley). This value can sometimes be blank.
winery - The winery that made the wine.
variety - The type of grapes used to make the wine.
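
As a rough illustration of how these columns might be read and cleaned before modelling, the sketch below parses a tiny in-memory CSV with the standard library. The two rows are made up, only a subset of the columns is shown, and the choices of filling a missing price with the median and a blank region_2 with "Unknown" are assumptions for the example, not necessarily what the notebook does.

```python
import csv
import io
from statistics import median

# Two made-up rows using a subset of the column names described above.
SAMPLE = """review_description,points,price,region_2,variety
"Ripe cherry and soft tannins.",88,25,,Pinot Noir
"Crisp citrus and bright acidity.",90,,Napa,Chardonnay
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Handling missing values (assumed strategy): median price, placeholder region.
known_prices = [float(r["price"]) for r in rows if r["price"]]
fill_price = median(known_prices)

for r in rows:
    r["price"] = float(r["price"]) if r["price"] else fill_price
    r["region_2"] = r["region_2"] or "Unknown"
    r["points"] = int(r["points"])

# Feature selection: the review text is the model input, variety is the target.
texts = [r["review_description"] for r in rows]
labels = [r["variety"] for r in rows]
print(texts, labels)
```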
