zomato-bangalore

Analysis and predictive modelling of Zomato Bangalore data, including customer reviews. The folder Python_scripts_and_notebooks contains the .py and .ipynb files for each part of my 3-part analysis, outlined below.

This project was executed on Kaggle. The Zomato Bangalore dataset is publicly available at https://www.kaggle.com/himanshupoddar/zomato-bangalore-restaurants. Word2Vec embeddings used in Part 3 are available at https://www.kaggle.com/sandreds/googlenewsvectorsnegative300.

Part 1 - EDA and Regression

  • Data cleaning (identifying and dropping duplicates, reformatting features)
  • Exploratory Data Analysis and observations
  • Data visualizations
  • Preprocessing and prediction with regression models
  • Model evaluation (MSE, MAPE, R^2)
  • Results summary
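
As a rough illustration of the evaluation step, the three metrics named above can be written out in plain Python. This is a minimal sketch, not the code from the notebooks: the function names and the sample values are made up for the example, and MAPE is assumed to be computed only on non-zero true values.

```python
from math import fsum


def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return fsum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction.

    Assumes no true value is zero; the metric is undefined otherwise.
    """
    return fsum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fsum(y_true) / len(y_true)
    ss_res = fsum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = fsum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical ratings, for illustration only (not taken from the dataset).
y_true = [4.0, 3.5, 4.5, 3.0]
y_pred = [3.8, 3.6, 4.1, 3.3]

print("MSE :", mse(y_true, y_pred))
print("MAPE:", mape(y_true, y_pred))
print("R^2 :", r2(y_true, y_pred))
```

In practice a library such as scikit-learn provides equivalent functions; the hand-written versions are shown only to make the formulas explicit.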

Part 2 - Ratings Classification

  • Target transformation from numeric to categorical
  • Preprocessing and prediction with Decision Tree, Random Forest and XGBoost
  • Model evaluation (Accuracy, Cohen Kappa, F1 score, Precision, Recall)
  • Feature Importance visualization
  • Results summary
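
The target transformation and one of the listed metrics (Cohen Kappa) can be sketched as follows. The bin thresholds (3.5 and 4.0), the class labels, and the sample data are assumptions chosen for the example; the notebooks may use different cut-offs and classes.

```python
from collections import Counter


def rating_to_class(rating, low=3.5, high=4.0):
    """Map a numeric rating to a category.

    The thresholds are hypothetical and only illustrate the idea of
    turning a numeric target into a categorical one.
    """
    if rating < low:
        return "low"
    if rating < high:
        return "medium"
    return "high"


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of the product of marginal frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings and predictions, for illustration only.
ratings = [3.1, 3.7, 4.2, 4.0, 3.4, 3.9]
y_true = [rating_to_class(r) for r in ratings]
y_pred = ["low", "medium", "high", "medium", "low", "medium"]

print(y_true)
print("kappa:", cohen_kappa(y_true, y_pred))
```

Kappa is a useful complement to accuracy when the rating classes are imbalanced, since it discounts the agreement a classifier would reach by chance.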

Part 3 - Text Mining and Neural Networks

  • Text mining and insights (unigrams, bigrams, trigrams and FreqDist plots)
  • Text processing (regex, tokenizing, stopword removal, lemmatizing, vectorizing with Word2Vec)
  • Building an LSTM Neural Network
  • Model evaluation
  • Results summary
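
The first text-processing steps (regex cleaning, tokenizing, stopword removal) and the n-gram counts can be sketched with the standard library alone. This is a simplified stand-in, under stated assumptions: the stopword set is a tiny illustrative one, `collections.Counter` replaces the FreqDist used for the plots, and lemmatizing and Word2Vec vectorizing are left out. The sample reviews are invented.

```python
import re
from collections import Counter

# Tiny illustrative stopword set; a real pipeline would use a full list.
STOPWORDS = {"the", "was", "and", "a", "is", "but", "very"}


def tokenize(text):
    """Lowercase, replace non-letters with spaces, split, drop stopwords."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOPWORDS]


def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return list(zip(*(tokens[i:] for i in range(n))))


# Invented reviews, for illustration only.
reviews = [
    "The food was great, and the service was great!",
    "Great food but slow service.",
]

unigram_counts = Counter()
bigram_counts = Counter()
for review in reviews:
    tokens = tokenize(review)
    unigram_counts.update(tokens)
    bigram_counts.update(ngrams(tokens, 2))

print(unigram_counts.most_common(3))
print(bigram_counts.most_common(3))
```

Trigrams follow the same pattern with `ngrams(tokens, 3)`.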