This project implements fine-tuning of a multilingual transformer ("xlm-roberta-base") for the named entity recognition task (NER) in various ways. First, a simple complete fine-tuning and then a PEFT variant, more specifically BitFit. The goal is to compare the effectiveness of the implemented tweak.
- Model implementation:
- "xlm-roberta-base" as encoder/body (via
AutoModel.from_pretrained
) - Preprocessing needed
- Own classification head using cross entropy loss
- Reference: Token classification guide by huggingface
- "xlm-roberta-base" as encoder/body (via
- Train with AdamW
- Eval
- Micro F1 on last checkpoint
- Set(s):
- Reference for using multiple sets: Torch Lightning - Managing Data
Just run the NERRun.py script. Might need a login to your wandb account before (e.g. via the terminal) and possibly might need a predefined cache structure at a given location (see NERDataModule.py) defined via an .env at root (i.e. CACHE_DIR=...).
The project evaluates on a micro f1 basis which heavily favors the outside tag (which is not really desirable). If anyone uses the repo, they should consider using macro f1 instead (or exclude the O tag from evaluation). In my case, it is/was not really necessary as the direct comparison is sufficient to get a first impression on the effectiveness.
Also, the plot creation is also done in a messy copy paste fashion, don't mind it :P