I classified multi-label texts from a Kaggle competition using PyTorch Lightning. The BERT-base model from the HuggingFace Transformers library was fine-tuned on the competition dataset with Lightning.
The fine-tuned model reaches an average F1-score of 0.70 on this dataset; most of its mistakes occur on tags with few training samples.
- Python
- 🤗 Transformers
- PyTorch Lightning
- Pandas
- NumPy
- scikit-learn
- Tokenized the text with the BERT tokenizer and created a PyTorch dataset (see the dataset sketch below)
- Fine-tuned the BERT model with PyTorch Lightning (see the LightningModule sketch below)
- Made predictions with the fine-tuned BERT model (see the inference sketch below)
- Evaluated the model's performance for each class (see the evaluation sketch below)
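
The following is a minimal sketch of the tokenization and dataset step. It assumes a pandas DataFrame with a `text` column and one binary column per tag; those column names are illustrative, not the project's actual schema.

```python
import torch
from torch.utils.data import Dataset
from transformers import BertTokenizer


class MultiLabelDataset(Dataset):
    """Wraps a DataFrame so each item yields token IDs and a multi-hot label vector."""

    def __init__(self, df, label_columns, max_len=128):
        self.tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
        self.texts = df["text"].tolist()          # assumed text column
        self.labels = df[label_columns].values    # one 0/1 column per tag
        self.max_len = max_len

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        # Tokenize a single text, padding/truncating to a fixed length
        encoding = self.tokenizer(
            self.texts[idx],
            max_length=self.max_len,
            padding="max_length",
            truncation=True,
            return_tensors="pt",
        )
        return {
            "input_ids": encoding["input_ids"].squeeze(0),
            "attention_mask": encoding["attention_mask"].squeeze(0),
            "labels": torch.tensor(self.labels[idx], dtype=torch.float),
        }
```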
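A sketch of the fine-tuning setup with PyTorch Lightning. Setting `problem_type="multi_label_classification"` makes the model apply `BCEWithLogitsLoss`, the usual choice for multi-label heads; the learning rate here is illustrative, not the value used in the project.

```python
import pytorch_lightning as pl
import torch
from transformers import BertForSequenceClassification


class BertMultiLabelClassifier(pl.LightningModule):
    def __init__(self, num_labels, lr=2e-5):
        super().__init__()
        # problem_type selects BCEWithLogitsLoss internally when labels are passed
        self.model = BertForSequenceClassification.from_pretrained(
            "bert-base-uncased",
            num_labels=num_labels,
            problem_type="multi_label_classification",
        )
        self.lr = lr

    def forward(self, input_ids, attention_mask, labels=None):
        return self.model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)

    def training_step(self, batch, batch_idx):
        outputs = self(batch["input_ids"], batch["attention_mask"], batch["labels"])
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)
```

Training then follows the standard Lightning pattern, e.g. `pl.Trainer(max_epochs=3).fit(model, train_loader)`.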
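For inference, a sigmoid over the logits gives one probability per tag. The 0.5 cutoff below is an assumption (it can be tuned per class), and `predict` is a hypothetical helper, not part of the project code.

```python
import torch


@torch.no_grad()
def predict(model, dataloader, threshold=0.5):
    """Return thresholded multi-label predictions and the true labels."""
    model.eval()
    all_preds, all_labels = [], []
    for batch in dataloader:
        outputs = model(batch["input_ids"], batch["attention_mask"])
        probs = torch.sigmoid(outputs.logits)          # one probability per tag
        all_preds.append((probs > threshold).int())     # assumed 0.5 cutoff
        all_labels.append(batch["labels"].int())
    return torch.cat(all_preds), torch.cat(all_labels)
```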
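Per-class evaluation can be done with scikit-learn's `classification_report`, which prints precision, recall, and F1 for each label; this is how the low-sample tags show up as the weak spots behind the 0.70 average F1. `label_columns` and `test_loader` refer to the names assumed in the sketches above.

```python
from sklearn.metrics import classification_report

preds, labels = predict(model, test_loader)  # from the inference sketch above
print(
    classification_report(
        labels.numpy(),
        preds.numpy(),
        target_names=label_columns,  # tag names, as assumed earlier
        zero_division=0,
    )
)
```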