Coner: A Collaborative Approach for Long-Tail Named Entity Recognition in Scientific Publications

Named Entity Recognition (NER) for rare long-tail entities, such as those often found in domain-specific scientific publications, is a challenging task, as the extensive training and test data needed for fine-tuning NER algorithms is typically lacking. Recent approaches have presented promising solutions that train NER algorithms in a distantly supervised fashion, limiting human interaction to providing a small set of seed terms. However, such approaches rely on heuristics that often fail, which limits their performance. In this paper, we therefore introduce a collaborative approach that incrementally incorporates human feedback on the relevance of extracted entities into the training cycle. This still allows new long-tail NER extractors to be trained cheaply, but with ever-increasing performance while the data is actively used. We show that with only user interaction, F-scores, precision, and recall can be increased compared to the automatic baseline approach.

vliegenthart/coner_collaborative_ner
