This Jupyter Notebook aims to perform sentiment analysis on movie reviews by building a binary classification model using the Bag of Words technique. The model will predict whether a given movie review expresses positive or negative sentiment.
The dataset used for this project consists of a collection of movie reviews labeled with sentiment (positive or negative). Each review is preprocessed and represented as a bag of words, where each word represents a feature.
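To make the representation concrete, here is a minimal sketch of a bag-of-words encoding that uses only the standard library. The helper names (`tokenize`, `build_vocabulary`, `bag_of_words`) and the two sample reviews are hypothetical and not taken from the notebook, which may rely on a library vectorizer instead.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Map each distinct word across all documents to a column index."""
    words = sorted({word for doc in documents for word in tokenize(doc)})
    return {word: index for index, word in enumerate(words)}


def bag_of_words(text, vocabulary):
    """Return a count vector with one entry per vocabulary word."""
    counts = Counter(tokenize(text))
    # Dicts preserve insertion order, so columns follow the sorted vocabulary.
    return [counts.get(word, 0) for word in vocabulary]


# Hypothetical sample reviews, for illustration only.
reviews = ["A great, great movie", "A dull movie"]
vocabulary = build_vocabulary(reviews)
vectors = [bag_of_words(review, vocabulary) for review in reviews]

print(vocabulary)  # {'a': 0, 'dull': 1, 'great': 2, 'movie': 3}
print(vectors)     # [[1, 0, 2, 1], [1, 1, 0, 1]]
```

Word order is discarded: each review becomes a fixed-length vector of word counts, and those counts are the features the classifier sees.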
The dataset is split into two subsets: a training set and a test set. The training set is used to train the model, while the test set is used to evaluate its performance.
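One common way to produce such a split is scikit-learn's `train_test_split`. The sketch below assumes a 75/25 split, stratification by label, and a fixed seed; none of these settings are confirmed by the notebook, and the toy reviews are made up for illustration.

```python
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 1 = positive, 0 = negative.
reviews = [
    "Loved every minute of it",
    "A complete waste of time",
    "Brilliant acting and a moving story",
    "Dull, predictable, and far too long",
    "One of the best films this year",
    "The plot made no sense at all",
    "Funny, warm, and beautifully shot",
    "I walked out halfway through",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

# Assumed settings: 25% held out, class balance preserved via stratify,
# and a fixed random_state so the split is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    reviews, labels, test_size=0.25, stratify=labels, random_state=42
)

print(len(X_train), "training reviews,", len(X_test), "test reviews")
```

Stratifying keeps the ratio of positive to negative reviews the same in both subsets, which matters most when the dataset is small or imbalanced.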
The following dependencies are required to run the Jupyter Notebook:
- Python 3.x
- NumPy
- pandas
- scikit-learn
- Jupyter Notebook
- nltk
- re (part of the Python standard library; no separate installation needed)
It is recommended to set up a virtual environment to keep the project dependencies isolated.
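After activating the environment, a quick check like the following can confirm that the listed packages are importable. This is a small sketch, not part of the notebook; the `missing_modules` helper is hypothetical, and the list uses import names (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the packages listed above. "re" ships with Python itself,
# so it should always be found.
REQUIRED_MODULES = ["numpy", "pandas", "sklearn", "nltk", "re"]


def missing_modules(module_names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in module_names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules are available.")
```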
The project consists of two files:
- movie_sentiment_analysis.ipynb: the Jupyter Notebook containing the code for data preprocessing, model training, evaluation, and prediction.
- README.md: information about the project and instructions for running it.
- Clone the repository:
  git clone https://github.com/your-username/movie_sentiment_analysis.git
- Navigate to the project directory:
  cd movie_sentiment_analysis
- Activate your virtual environment (if used) and launch Jupyter Notebook:
  jupyter notebook
- Open the movie_sentiment_analysis.ipynb notebook in your browser.
- Follow the instructions and execute the cells in the notebook step by step to preprocess the data, train the sentiment classification model, evaluate its performance, and make predictions on new movie reviews.
- Modify the notebook as needed to suit your specific requirements.
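For orientation, the workflow in the steps above (preprocess, train, evaluate, predict) can be sketched end to end as follows. This is an illustration under stated assumptions, not the notebook's actual code: the `clean_review` helper, the toy reviews, and the choice of `LogisticRegression` as the classifier are all hypothetical, since this README does not name the model used.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline


def clean_review(text):
    """Strip HTML tags and non-letter characters, then lowercase."""
    text = re.sub(r"<[^>]+>", " ", text)      # drop tags such as <br />
    text = re.sub(r"[^a-zA-Z]+", " ", text)   # keep letters only
    return text.lower().strip()


# Hypothetical toy data: 1 = positive, 0 = negative.
train_reviews = [
    "A wonderful, heartfelt film<br />",
    "Terrible pacing and awful dialogue",
    "Great cast and a wonderful score",
    "Awful. Simply terrible.",
    "I had a great time watching this",
    "Boring and terrible from start to finish",
]
train_labels = [1, 0, 1, 0, 1, 0]
test_reviews = ["What a great, wonderful movie", "An awful, boring mess"]
test_labels = [1, 0]

# CountVectorizer builds the bag-of-words features; the classifier is an
# assumed stand-in for whatever model the notebook trains.
model = make_pipeline(
    CountVectorizer(preprocessor=clean_review),
    LogisticRegression(),
)
model.fit(train_reviews, train_labels)

# Evaluate on the held-out reviews.
predictions = model.predict(test_reviews)
accuracy = accuracy_score(test_labels, predictions)
print("Test accuracy:", accuracy)

# Predict the sentiment of a new, unseen review.
new_prediction = model.predict(["A wonderful surprise"])
print("Predicted label:", new_prediction[0])
```

Wrapping the vectorizer and classifier in one pipeline means the vocabulary is learned from the training reviews only, and the same cleaning and encoding are applied automatically to test and new reviews.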
This Movie Sentiment Analysis project provides a Jupyter Notebook implementation of binary sentiment classification using the Bag of Words technique. Feel free to extend the notebook, for example with a different dataset or classifier, to suit your specific needs.