This is a movie sentiment analysis project in which we train machine learning models on the IMDB dataset. We then use these models to classify whether a movie review is positive or negative.
- Lim Chia Chung @Jacky0111
- Lim Ming Jun @mingjun1120
- Leong Yit Wee @leongyitwee
Python 3.8
Google Colab
- Model Training (DistilBERT): ktrain
- Crawling Data: requests-html
Open the NLP Assignment folder in Google Colab. Then, follow these steps:
- Run `2_preprocess_data.ipynb`
- Run `DistilBERT.ipynb` inside the Lim Ming Jun folder
- Run `Long_Short-Term_Memory(LSTM).ipynb` inside the Lim Chia Chung folder
- Run `Convolutional Neural Network Tutorial (CNN).ipynb` inside the Leong Yit Wee folder
- Run `Local Sentiment Analysis App.ipynb`
For this part, go to the MovieScraper folder first and run `main.py` (the scraper). The output is stored as `scrap_movie_reviews.csv`.
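As a quick sanity check on the scraper output, the sketch below reads a CSV shaped like `scrap_movie_reviews.csv` and counts the reviews per label. The column names `review` and `sentiment` are assumptions made for illustration; the header that `main.py` actually writes may differ, so adjust `label_column` accordingly.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file, label_column="sentiment"):
    """Count how many rows carry each label in an open CSV file.

    `label_column` is a hypothetical column name; change it to match
    the header that the scraper actually writes.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column].strip().upper() for row in reader)


# In-memory stand-in for scrap_movie_reviews.csv so the sketch runs on its own.
sample = io.StringIO(
    "review,sentiment\n"
    "A moving and beautifully shot film.,POSITIVE\n"
    "Two hours I will never get back.,NEGATIVE\n"
    "Great cast and a clever script.,POSITIVE\n"
)

counts = count_labels(sample)
print(counts)
```

To run this against the real file, pass an open file handle instead of `sample`, for example `open("scrap_movie_reviews.csv", newline="", encoding="utf-8")` (the encoding is also an assumption).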
We train the DistilBERT, LSTM and CNN models on the IMDB dataset, then compare the performance of these three models in `Local Sentiment Analysis App.ipynb`. This notebook can analyze a single review entered by the user, as well as multiple reviews at once by feeding in the data scraped by `main.py` in the MovieScraper folder.
The scraped data are labeled with `POSITIVE` and `NEGATIVE` sentiment.
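The comparison on labeled reviews can be sketched roughly as follows. This is a minimal illustration, not the code from the notebook: the three predictors are hypothetical stand-ins for the trained DistilBERT, LSTM and CNN models, the sample reviews are made up, and accuracy is assumed as the comparison metric.

```python
from typing import Callable, Dict, List, Tuple

# A predictor maps review text to "POSITIVE" or "NEGATIVE".
Predictor = Callable[[str], str]


def accuracy(predict: Predictor, data: List[Tuple[str, str]]) -> float:
    """Fraction of reviews whose predicted label matches the true label."""
    correct = sum(1 for text, label in data if predict(text) == label)
    return correct / len(data)


def compare_models(
    models: Dict[str, Predictor], data: List[Tuple[str, str]]
) -> Dict[str, float]:
    """Score every model on the same labeled reviews."""
    return {name: accuracy(predict, data) for name, predict in models.items()}


# Hypothetical stand-ins; in the project these would wrap the trained models.
def keyword_model(text: str) -> str:
    return "POSITIVE" if "great" in text.lower() else "NEGATIVE"


def always_positive(text: str) -> str:
    return "POSITIVE"


def always_negative(text: str) -> str:
    return "NEGATIVE"


# Made-up reviews in the (text, label) shape assumed for the scraped data.
labeled_reviews = [
    ("Great story and great acting.", "POSITIVE"),
    ("Boring from start to finish.", "NEGATIVE"),
    ("A great surprise.", "POSITIVE"),
    ("Weak plot and flat characters.", "NEGATIVE"),
]

models = {
    "model_a": keyword_model,
    "model_b": always_positive,
    "model_c": always_negative,
}
scores = compare_models(models, labeled_reviews)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Single-review analysis is the same call on one string, for example `keyword_model("A great film")`; the multiple-review case only adds the loop over labeled data.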