Continual Learning with Structured Regularization in Named Entity Recognition
- PyTorch
- numpy
- tqdm
- seqeval
- gensim
- Jupyter Notebook
To replicate the experiments:
- Clone the repository.
- Under the `experiments` folder, locate the specific experiments to replicate (NA = New Addition, Seq = Sequential).
- Download a copy of the Google word2vec pre-trained embeddings and place the file at `NER/checkpoints/GoogleNews-vectors-negative300.bin`, or specify its path in the configuration of each training script (see the loading sketch after this list).
- Each experiment directory contains three training scripts (Baseline, EWC, and SI). Inside each script, the `multiple_allowed` flag controls whether sentences with multiple entity tags are purged. Run, for example, `python3 ewc_train.py` to train and evaluate.
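
The training scripts read the embedding file through their own configuration; the snippet below is only a minimal sanity-check sketch, assuming the default path from the list above, showing how the binary file can be opened with gensim:

```python
from gensim.models import KeyedVectors

# Default location assumed in this README; adjust if you pointed the
# train-script configuration at a different path.
EMBEDDING_PATH = "NER/checkpoints/GoogleNews-vectors-negative300.bin"

# The GoogleNews vectors are distributed in the binary word2vec format.
# Pass limit=... to load only the most frequent words and save memory.
vectors = KeyedVectors.load_word2vec_format(EMBEDDING_PATH, binary=True)

print(vectors.vector_size)                    # 300-dimensional embeddings
print(vectors.most_similar("city", topn=3))   # quick nearest-neighbour check
```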
The copy of the CoNLL-2003 dataset used here is taken from Hugging Face.
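
If you want to inspect the data independently of the repository, one option (an assumption, not necessarily how the training scripts load it) is the `datasets` library:

```python
from datasets import load_dataset

# Standalone check: pull CoNLL-2003 from the Hugging Face Hub.
conll = load_dataset("conll2003")

sample = conll["train"][0]
print(sample["tokens"])    # tokenized sentence
print(sample["ner_tags"])  # integer-encoded NER labels
```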