yelp_review

This is a class assignment for Text Management that analyzes Yelp reviews.

I. Requirements

# spacy
pip install -U spacy

# language model
python -m spacy download en_core_web_sm

# pandas seaborn tqdm (pickle ships with the Python standard library, so it needs no install)
pip install pandas seaborn tqdm

# nltk
# https://www.nltk.org/install.html
pip install nltk
python -c "import nltk; nltk.download()"

# torch
## we recommend installing it in a conda environment
conda install -c pytorch pytorch

# transformers
conda install -c conda-forge transformers

# lime
conda install -c conda-forge lime

# scikit-learn
## https://scikit-learn.org/stable/install.html
pip install -U scikit-learn

# readability
## https://pypi.org/project/py-readability-metrics/
pip install py-readability-metrics


# enchant
## https://pypi.org/project/pyenchant/
pip install pyenchant
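
After installing, a quick check of which packages are importable can save a failed run later. The following is only a minimal sketch using the standard library; the mapping from install name to import name (for example `scikit-learn` to `sklearn`, `py-readability-metrics` to `readability`) and the helper name `missing_packages` are assumptions for illustration and are not part of this repository.

```python
import importlib.util

# Assumed mapping: install name -> top-level import name.
REQUIRED = {
    "spacy": "spacy",
    "pandas": "pandas",
    "seaborn": "seaborn",
    "tqdm": "tqdm",
    "nltk": "nltk",
    "pytorch": "torch",
    "transformers": "transformers",
    "lime": "lime",
    "scikit-learn": "sklearn",
    "py-readability-metrics": "readability",
    "pyenchant": "enchant",
}


def missing_packages(required=REQUIRED):
    """Return the install names whose import module cannot be found."""
    return [
        pkg
        for pkg, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only confirms that each module can be located. It does not check versions, and it does not check that the `en_core_web_sm` model or the NLTK data have been downloaded.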

II. Tasks

Part1

  • Subset Selection: see ./part1/randrom_collect.py. The output is ./data/sampled_data.csv

  • Data Analysis

    See ../part1/data_analysis.py. It plots the length distribution over all the data.

  • POS Tagging

    See ./part1/pos_tagging.py. It assigns part-of-speech tags to the reviews.

  • Indicative Adjectives: see ./part1/get_top10_adj.py
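
As a rough illustration of the first two Part1 steps, the sketch below draws a reproducible random subset and bins reviews by length. It is not the code in the scripts above: the function names, the whitespace tokenization, the bin width, and the toy reviews are all assumptions made for the example, and it uses only the standard library rather than pandas or seaborn.

```python
import random
from collections import Counter


def sample_reviews(reviews, k, seed=0):
    """Draw k reviews without replacement; the same seed gives the same subset."""
    rng = random.Random(seed)
    return rng.sample(reviews, k)


def length_histogram(reviews, bin_width=5):
    """Count reviews per length bin, keyed by the lower bound of each bin.

    Length is measured in whitespace-separated tokens (an assumption).
    """
    hist = Counter()
    for text in reviews:
        n_tokens = len(text.split())
        hist[(n_tokens // bin_width) * bin_width] += 1
    return dict(sorted(hist.items()))


# Toy stand-in texts, not taken from the Yelp data.
reviews = [
    "Great food and friendly staff",
    "Terrible",
    "The wait was long but the pasta was worth it",
    "Okay",
    "Would come back again for the desserts alone",
]

subset = sample_reviews(reviews, k=3, seed=42)
hist = length_histogram(reviews)
print(hist)  # {0: 2, 5: 2, 10: 1}
```

Fixing the seed keeps the sampled subset stable across runs, which matters when later steps such as tagging and adjective ranking are compared against the same data.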

Part2

  • Top 5 Review Selection Model

    See ./part2/review_selection_part2.py

Part3

  • Sentiment Analysis

    See ./part3/sentiment_q3.ipynb