Source code for AAAI 2023 paper: A Neural Span-Based Continual Named Entity Recognition Model [arxiv]
[Appendix for AAAI paper]
# Environment
- python >= 3.6
# Dependencies
- torch >= 1.8
- transformers == 4.3.0
- rich
- prettytable
- paramiko # [Optional] only needed for remotely opening result files.
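A minimal install sketch, assuming a pip-based environment and only the packages listed above (the exact torch build depends on your CUDA version):

```bash
# Install the listed dependencies; pick the torch build matching your CUDA setup.
pip install "torch>=1.8" "transformers==4.3.0" rich prettytable
pip install paramiko  # optional, only for remotely opening result files
```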
# Dataset
We provide the pre-processed datasets in the `data/` directory. They keep the same sample-task allocation for the split setup used in the paper, for reproduction.
[Update 20230525🔥] We provide a zip file of `data/` as `data.zip` for convenience, since the original `data/` used LFS and consumed GitHub quota 😑. Please unzip it before running the code. 😊
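A minimal sketch of the unzip step, assuming `data.zip` unpacks into `data/` at the repository root:

```bash
# Extract the pre-processed datasets before running the training script.
unzip data.zip
```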
# Run
```bash
python train_clner.py \
    --gpu 0 \
    --m spankl \
    --corpus onto \
    --setup split \
    --perm perm0

# gpu: which GPU to use (-2: auto-allocate a GPU, -1: use CPU)
# m: type of model (spankl|add|ext)
# corpus: (onto|fewnerd)
# setup: (split|filter)
# perm: task learning order, default perm0 (perm0|perm1|...)
```
Note that this yields results for one task permutation (e.g., `--perm perm0`), while Tab. 1 and Tab. 2 in the paper report results averaged over all task permutations.
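To reproduce the averaged numbers, one hypothetical approach (not a provided script) is to run the same configuration under every permutation and average the logged final metrics afterwards; the permutation names below are placeholders:

```bash
# Run each task permutation with an otherwise identical configuration,
# then average the logged "Final MacroF1" values across permutations.
for p in perm0 perm1; do  # ...extend with the remaining permutations
    python train_clner.py --gpu 0 --m spankl --corpus onto --setup split --perm "$p"
done
```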
[Update 0305, see #1] For convenience, we log the final metric used in the paper at each incremental step (task) during training, as below:
```text
...
INFO Test All Final MacroF1 over Task-Level:{metric}
...
INFO Test Filter Final MacroF1 over Task-Level:{metric}
...
```
and also provide an overview of the performance learned so far:
```text
# Logging example when the run finishes (all tasks have been learned on OntoNotes).
======onto======
onto-0-2022-07-21_10-02-57-1824-spankl_split_perm0
***Test All***
[[87.9   0.    0.    0.    0.    0.   87.9  87.9 ]
 [87.18 93.33  0.    0.    0.    0.   90.43 90.25]
 [88.13 93.41 95.47  0.    0.    0.   92.61 92.34]
 [87.4  93.26 95.16 85.02  0.    0.   90.71 90.21]
 [87.13 92.65 95.05 84.71 81.37  0.   89.38 88.18]
 [87.88 93.02 95.08 85.93 83.04 93.01 90.3  89.66]
 [-0.26 -0.39 -0.39  0.    0.    0.   -0.17  0.  ]]
```
The table is organized as follows (rows are incremental learning steps, columns are per-task metrics followed by the overall MicroF1 and MacroF1):

```text
                 Task1_metric Task2_metric ... MicroF1 MacroF1
Learn to Task1
Learn to Task2
...
BI (backward interference / forgetting)
```

The last two columns (MicroF1, MacroF1) at each step are the metrics used in the paper.
During and after training, an `overview_metric.json` file will be generated in `model_ckpt/[MODEL_INFO]/` recording the required details, and you can:
- refer to `__main__` in `print_cl_metric.py` to use the `print_cl_metric()` func to remotely print results,
- or refer to the end of `train_clner.py` to use the wrapped `simply_print_cl_metric()` func to locally print results.
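For a quick look at the raw metrics file without the provided helpers, a minimal sketch using the standard `json.tool` module (`[MODEL_INFO]` is a placeholder for the actual run directory):

```bash
# Pretty-print the recorded metrics; replace [MODEL_INFO] with the real run directory.
python -m json.tool model_ckpt/[MODEL_INFO]/overview_metric.json
```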
For any questions, please refer to the comments in the code or contact me.
Feel free to star the repo or to raise issues and PRs! :)
This project is licensed under the Apache License - see the LICENSE file for details.
If you use this work or code, please kindly cite this paper:
```bibtex
@inproceedings{zhang2023spankl,
  title={A Neural Span-Based Continual Named Entity Recognition Model},
  author={Zhang, Yunan and Chen, Qingcai},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2023}
}
```
or
```bibtex
@article{Zhang_Chen_2023,
  title={A Neural Span-Based Continual Named Entity Recognition Model},
  volume={37},
  url={https://ojs.aaai.org/index.php/AAAI/article/view/26638},
  DOI={10.1609/aaai.v37i11.26638},
  number={11},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence},
  author={Zhang, Yunan and Chen, Qingcai},
  year={2023},
  month={Jun.},
  pages={13993-14001}
}
```