This repository is a TensorFlow implementation of the Multi-Grained Named Entity Recognition model proposed in the following paper:
Congying Xia, Chenwei Zhang, Tao Yang, Yaliang Li, Nan Du, Xian Wu, Wei Fan, Fenglong Ma, Philip Yu. Multi-Grained Named Entity Recognition. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), 2019.
https://arxiv.org/abs/1906.08449
Python
TensorFlow 1.13.1
AllenNLP
NumPy
scikit-learn
The datasets are subject to copyright restrictions and must be obtained from their original sources. Some useful pointers:
(ACE 2004) https://catalog.ldc.upenn.edu/LDC2005T09
(ACE 2005) https://catalog.ldc.upenn.edu/LDC2006T06
(CoNLL 2003) https://cogcomp.seas.upenn.edu/page/resource_view/81
Two file formats are needed. An example of the first format is at data/ace2004/train.txt. An example of the second format is at data/ace2004/elmo/sentences/train_sentences.
Generate formatted files for the full dataset and replace the example files with your generated files, including:
data/ace2004/train.txt
data/ace2004/test.txt
data/ace2004/dev.txt
data/ace2004/elmo/sentences/train_sentences
data/ace2004/elmo/sentences/test_sentences
data/ace2004/elmo/sentences/dev_sentences
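Before moving on, it can help to confirm that all six files listed above are in place. The following is a minimal sketch, assuming it is run from the repository root. The helper name `missing_files` is hypothetical and not part of this repository.

```python
import os

# Paths listed above, relative to the repository root.
EXPECTED_FILES = [
    "data/ace2004/train.txt",
    "data/ace2004/test.txt",
    "data/ace2004/dev.txt",
    "data/ace2004/elmo/sentences/train_sentences",
    "data/ace2004/elmo/sentences/test_sentences",
    "data/ace2004/elmo/sentences/dev_sentences",
]


def missing_files(paths, root="."):
    """Return the paths that do not exist as regular files under root.

    Hypothetical helper for illustration; not part of this repository.
    """
    return [p for p in paths if not os.path.isfile(os.path.join(root, p))]


if __name__ == "__main__":
    missing = missing_files(EXPECTED_FILES)
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All dataset files found.")
```

This only checks that the files exist. It does not validate their contents against the two formats.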
Download glove.6B.zip from https://nlp.stanford.edu/projects/glove/.
Unzip it and place glove.6B.300d.txt in data/glove.6B/.
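To check that the right GloVe file is in the right place, one option is to read its first few lines and confirm that each one carries 300 values. This sketch assumes the standard GloVe text layout (a token followed by space-separated floats on each line). The function name `read_glove_head` is hypothetical and not part of this repository.

```python
import itertools
import os


def read_glove_head(path, limit=5, dim=300):
    """Read the first `limit` lines of a GloVe text file as (word, vector) pairs.

    Hypothetical helper for illustration. Raises ValueError if a line
    does not carry exactly `dim` values.
    """
    entries = []
    with open(path, encoding="utf-8") as f:
        for line in itertools.islice(f, limit):
            parts = line.rstrip("\n").split(" ")
            word, values = parts[0], parts[1:]
            if len(values) != dim:
                raise ValueError(
                    "expected %d values for %r, got %d" % (dim, word, len(values))
                )
            entries.append((word, [float(v) for v in values]))
    return entries


if __name__ == "__main__":
    glove_path = "data/glove.6B/glove.6B.300d.txt"
    # Only run the check when the file is actually present.
    if os.path.isfile(glove_path):
        for word, vector in read_glove_head(glove_path):
            print(word, len(vector))
```

A `ValueError` here usually means a file with a different dimensionality (for example, one of the other files in glove.6B.zip) was copied by mistake.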
Run the script to download ELMo embeddings:
cd data/ace2004/elmo
sh run.sh
Three files should be generated:
elmo_dev.hdf5
elmo_test.hdf5
elmo_train.hdf5
cd detector
python build_data.py
It should generate files including:
data/glove.6B.300d.trimmed.npz
data/chars.txt
data/tags.txt
data/words.txt
python train.py
python dump.py
Dumped features are saved in data/saved_roi/.
To evaluate the performance of the Detector, run evaluate.py:
python evaluate.py
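The Detector commands above can also be chained from a single script that stops at the first failure. This is a sketch only, assuming it is launched from the repository root and that the scripts take no arguments, as in the commands above. The names `detector_commands` and `run_detector` are hypothetical and not part of this repository.

```python
import os
import subprocess
import sys

# Detector scripts in the order given above; evaluate.py is optional.
DETECTOR_STEPS = ["build_data.py", "train.py", "dump.py"]


def detector_commands(evaluate=False, python=sys.executable):
    """Build the command list for the Detector stage (hypothetical helper)."""
    steps = list(DETECTOR_STEPS)
    if evaluate:
        steps.append("evaluate.py")
    return [[python, step] for step in steps]


def run_detector(evaluate=False, cwd="detector"):
    """Run each command inside `cwd`, raising CalledProcessError on failure."""
    for cmd in detector_commands(evaluate=evaluate):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    # Only run when the detector directory is present in the working directory.
    if os.path.isdir("detector"):
        run_detector(evaluate=True)
```

Using `check=True` means a failed training run prevents `dump.py` from running on an incomplete model.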
python train.py
python evaluate.py
If you find our code useful, please cite our paper.
@article{xia2019multi,
title={Multi-Grained Named Entity Recognition},
author={Xia, Congying and Zhang, Chenwei and Yang, Tao and Li, Yaliang and Du, Nan and Wu, Xian and Fan, Wei and Ma, Fenglong and Yu, Philip},
journal={arXiv preprint arXiv:1906.08449},
year={2019}
}