Primary language: Jupyter Notebook · License: GNU General Public License v3.0 (GPL-3.0)

DistilBERT Sentiment Analysis

This repository contains a DistilBERT model fine-tuned using the Hugging Face Transformers library on the IMDb movie review dataset. The model is designed for sentiment analysis, enabling the determination of sentiment polarity (positive or negative) within text reviews.

The model is based on the paper "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter" (arXiv).
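Concretely, "determining sentiment polarity" means taking the model's two class logits and picking the larger one. A dependency-free sketch of that final step (the 0 = negative / 1 = positive index order is an assumption about the fine-tuned head, not taken from this repository's code):

```python
import math

LABELS = ["negative", "positive"]  # assumed index order of the classification head

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def polarity(logits):
    """Map a two-class logit pair to (label, confidence)."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[idx], probs[idx]

print(polarity([-1.3, 2.7]))  # positive, with high confidence
```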

📝 Contents

  • dataset/: Code for dataset handling and preprocessing.
  • pretrained/: Directory for the model weights; download pytorch_model.bin and place it here (see the Pretrained Model section below).
  • predict.ipynb: Notebook demonstrating the prediction process using the fine-tuned DistilBERT model.

🤗 Pretrained Model

Please download the pre-trained model pytorch_model.bin from the following link and move it to the pretrained/ folder: Download Model
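After downloading, a quick sanity check that the file is in place and is a readable PyTorch checkpoint can save debugging later. A sketch (the pretrained/ location is this repository's convention; torch is imported lazily so the check can be read without it installed):

```python
import os

CHECKPOINT = os.path.join("pretrained", "pytorch_model.bin")

def checkpoint_ok(path=CHECKPOINT):
    """Return True if the checkpoint exists and loads as a non-empty state dict."""
    if not os.path.isfile(path):
        return False
    import torch  # lazy import: only needed once the file exists

    state_dict = torch.load(path, map_location="cpu")
    return isinstance(state_dict, dict) and len(state_dict) > 0

# checkpoint_ok() should return True once pytorch_model.bin is in pretrained/
```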

🔨 Preparation

To get started, clone the repository and navigate to the project directory:

git clone https://github.com/zyh040521/distilbert-base-uncased-finetuning
cd distilbert-base-uncased-finetuning

💡 Building the Environment

To set up the required environment, install the dependencies listed in the requirements.txt file using pip:

pip install -r requirements.txt

🌟 Usage

Run main.ipynb to fine-tune DistilBERT on the IMDb dataset.
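The sketch below shows the standard Transformers Trainer recipe for IMDb as an assumption about what the notebook does; the dataset identifier, hyperparameters, and output directory are illustrative, not taken from main.ipynb:

```python
def tokenize_batch(batch, tokenizer):
    """Tokenize a batch of IMDb reviews, truncating long ones."""
    return tokenizer(batch["text"], truncation=True, padding="max_length")

def finetune(output_dir="results"):
    """Fine-tune distilbert-base-uncased on IMDb (downloads data and weights)."""
    # Heavy imports live inside the function so the sketch is readable
    # without the libraries installed.
    from datasets import load_dataset
    from transformers import (
        DistilBertForSequenceClassification,
        DistilBertTokenizerFast,
        Trainer,
        TrainingArguments,
    )

    tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
    imdb = load_dataset("imdb")
    encoded = imdb.map(lambda b: tokenize_batch(b, tokenizer), batched=True)

    model = DistilBertForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2
    )
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=2,               # illustrative hyperparameters
        per_device_train_batch_size=16,
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=encoded["train"],
        eval_dataset=encoded["test"],
    )
    trainer.train()
    return trainer
```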

😊 Predict

Run predict.ipynb to classify new reviews with the fine-tuned model.
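A hedged sketch of the core prediction step (the 0 = negative / 1 = positive label order is an assumption about the fine-tuned head; the checkpoint path follows this repository's pretrained/ convention):

```python
LABELS = {0: "negative", 1: "positive"}  # assumed label order

def predict_sentiment(text, model, tokenizer):
    """Classify one review; returns (label, confidence in [0, 1])."""
    import torch  # lazy import so the sketch is readable without torch

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]
    idx = int(probs.argmax())
    return LABELS[idx], float(probs[idx])

# Example wiring inside the notebook (downloads the base model config):
#   import torch
#   from transformers import DistilBertForSequenceClassification, DistilBertTokenizerFast
#   tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
#   model = DistilBertForSequenceClassification.from_pretrained(
#       "distilbert-base-uncased", num_labels=2)
#   model.load_state_dict(torch.load("pretrained/pytorch_model.bin", map_location="cpu"))
#   model.eval()
#   predict_sentiment("A wonderful, moving film.", model, tokenizer)
```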

🔥 TODO

  • Use the evaluate library to assess model accuracy
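The accuracy this TODO refers to is the fraction of test predictions matching the reference labels; the evaluate library packages this as a reusable metric. A sketch of the computation (the prediction/reference values are placeholders, not real model outputs):

```python
def accuracy(predictions, references):
    """Fraction of predictions equal to references, i.e. what the
    evaluate library's "accuracy" metric computes."""
    if len(predictions) != len(references) or not references:
        raise ValueError("predictions and references must be equal-length and non-empty")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # → 0.75

# Equivalent with the evaluate library (fetches the metric script on first use):
#   import evaluate
#   metric = evaluate.load("accuracy")
#   metric.compute(predictions=[1, 0, 1, 1], references=[1, 0, 0, 1])
```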