peft-ner-masakhaner


Multilingual NLP

Introduction

This project fine-tunes a multilingual transformer ("xlm-roberta-base") for named entity recognition (NER) in two ways: first with plain full fine-tuning, then with a PEFT variant, specifically BitFit, which updates only the model's bias terms. The goal is to compare the effectiveness of the parameter-efficient tweak against the full fine-tuning baseline.
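The core of the BitFit recipe is a single freezing pass over the model's parameters. The sketch below illustrates the idea on a tiny stand-in module (the `nn.Sequential` stack is just a placeholder, not the actual XLM-R architecture used in the repo); with the real model you would apply the same function to the loaded checkpoint.

```python
import torch.nn as nn

def apply_bitfit(model: nn.Module) -> None:
    """Freeze everything except bias terms (the BitFit recipe)."""
    for name, param in model.named_parameters():
        param.requires_grad = "bias" in name

# Tiny stand-in for a transformer block; illustrative only.
model = nn.Sequential(nn.Linear(16, 16), nn.LayerNorm(16), nn.Linear(16, 4))
apply_bitfit(model)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(trainable, total)  # only the bias vectors remain trainable
```

Even on this toy module the trainable-parameter count drops by an order of magnitude, which is the whole point of BitFit.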

Roadmap

Usage

Just run the NERRun.py script. You may need to log in to your wandb account first (e.g. via the terminal), and you may also need a predefined cache structure at a given location (see NERDataModule.py), configured via an .env file at the repository root (i.e. CACHE_DIR=...).
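A minimal setup might look like the following; the cache path is only an example placeholder, so point CACHE_DIR at any writable directory on your machine.

```shell
# First-time setup (sketch; adjust paths to your machine):
#   wandb login          # authenticate with Weights & Biases once
echo "CACHE_DIR=/path/to/cache" > .env   # read by NERDataModule.py
#   python NERRun.py     # then launch training
```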

Issues

The project evaluates with micro F1, which heavily favors the outside (O) tag; this is not really desirable. Anyone using this repo should consider macro F1 instead (or exclude the O tag from evaluation). In my case it was not strictly necessary, since the direct comparison between the two runs is enough to get a first impression of the tweak's effectiveness.
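The toy computation below (not code from the repo) shows why micro F1 is misleading here: a degenerate model that tags every token "O" still scores well on micro F1, while macro F1 exposes the missed entities.

```python
from collections import Counter

def f1_scores(gold, pred):
    """Per-class, micro, and macro F1 for token-level tags (toy implementation)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1
    per_class = {}
    for c in sorted(set(gold) | set(pred)):
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        per_class[c] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    # Micro: pool TP/FP/FN over all classes before computing F1.
    TP, FP, FN = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * TP / (2 * TP + FP + FN) if TP else 0.0
    # Macro: unweighted average of the per-class scores.
    macro = sum(per_class.values()) / len(per_class)
    return per_class, micro, macro

# A model that predicts "O" everywhere on a mostly-O sequence:
gold = ["O"] * 8 + ["B-PER", "B-LOC"]
pred = ["O"] * 10
per_class, micro, macro = f1_scores(gold, pred)
print(micro)             # 0.8   -- dominated by the frequent O tag
print(round(macro, 3))   # 0.296 -- both entity classes scored 0
```

Excluding the O tag from the aggregate (or using entity-level evaluation as in seqeval) gives an even clearer picture, since the entity classes are the ones that matter for NER.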

Also, the plot creation is done in a messy copy-paste fashion, don't mind it :P