/pytorch-TFPNER

TFPNER: Exploration on the Named Entity Recognition of Token Fused with Part-of-Speech

Primary language: Jupyter Notebook

TFPNER

Named entity recognition (NER), which aims to identify real-world entity mentions in text, is a fundamental task in natural language processing with a wide range of applications. Previous approaches focus mainly on the raw sentence, but part-of-speech (POS) tags contain rich semantic information and contribute to the success of natural language processing tasks. Our baseline is a BERT model that takes only the original tokens as input. To further improve NER performance, we propose five BERT-based methods that fuse POS tags with the original tokens: concatenating tokens and POS tags as one sentence, concatenating them as two sentences, adding a POS embedding as one of the embedding elements, model ensembling, and multi-attention between the token representations and the POS representations. We evaluate on the CoNLL-2003 and Groningen Meaning Bank (GMB) datasets, both of which provide NER tags and POS tags. In our experiments on the two datasets, some of the proposed methods improve on the baseline.

Model

These are the models we built to improve on the baseline.

Token + POS

[Image: Token + POS model]
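One way to read "token and POS as one sentence" is sketched below: the POS tags are appended after the tokens inside a single BERT segment, and only the token positions carry NER labels. The layout, the function name `build_single_sentence`, and the use of `None` for unlabeled positions are assumptions for illustration. The repository may arrange the input differently (for example by interleaving tokens and tags), and WordPiece sub-token splitting is left out.

```python
def build_single_sentence(tokens, pos_tags, ner_tags):
    """Assumed layout: [CLS] tokens... POS tags... [SEP], all in one segment."""
    if not (len(tokens) == len(pos_tags) == len(ner_tags)):
        raise ValueError("tokens, pos_tags and ner_tags must have equal length")

    sequence = ["[CLS]"] + tokens + pos_tags + ["[SEP]"]
    # Only the original token positions are labeled; None marks positions
    # that a training loop would exclude from the loss.
    labels = [None] + ner_tags + [None] * len(pos_tags) + [None]
    # A single sentence means every position belongs to segment 0.
    segment_ids = [0] * len(sequence)
    return sequence, labels, segment_ids


if __name__ == "__main__":
    seq, labels, segs = build_single_sentence(
        ["EU", "rejects", "German", "call"],
        ["NNP", "VBZ", "JJ", "NN"],
        ["B-ORG", "O", "B-MISC", "O"],
    )
    for item in zip(seq, labels, segs):
        print(item)
```

Keeping the labels aligned with the full input sequence means the classifier head can run over every position while the loss only looks at the original tokens.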

Token + [SEP] + POS

[Image: Token + [SEP] + POS model]
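The two-sentence variant can be sketched with BERT's usual sentence-pair layout, where the tokens form the first segment and the POS tags the second. The function name `build_sentence_pair` and the `None` label marker are hypothetical, and WordPiece sub-token splitting is not handled here.

```python
def build_sentence_pair(tokens, pos_tags, ner_tags):
    """Assumed layout: [CLS] tokens... [SEP] POS tags... [SEP]."""
    if not (len(tokens) == len(pos_tags) == len(ner_tags)):
        raise ValueError("tokens, pos_tags and ner_tags must have equal length")

    sequence = ["[CLS]"] + tokens + ["[SEP]"] + pos_tags + ["[SEP]"]
    # Segment 0 covers [CLS], the tokens and the first [SEP];
    # segment 1 covers the POS tags and the final [SEP].
    segment_ids = [0] * (len(tokens) + 2) + [1] * (len(pos_tags) + 1)
    # NER labels apply to the token positions only; None marks the rest.
    labels = [None] + ner_tags + [None] * (len(pos_tags) + 2)
    return sequence, labels, segment_ids


if __name__ == "__main__":
    seq, labels, segs = build_sentence_pair(
        ["EU", "rejects", "German", "call"],
        ["NNP", "VBZ", "JJ", "NN"],
        ["B-ORG", "O", "B-MISC", "O"],
    )
    for item in zip(seq, labels, segs):
        print(item)
```

Compared with the single-sentence layout, the segment ids give the model an explicit signal for which positions are words and which are tags.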

POS Embedding Layer

[Image: POS Embedding Layer model]
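The idea of adding a POS embedding as one more embedding element can be illustrated with a dependency-free sketch: each position's input vector is the element-wise sum of a token embedding, a position embedding, and a POS-tag embedding. The table sizes, the helper names `make_table` and `embed_with_pos`, and the random initialization are assumptions; BERT's segment embeddings, layer normalization, and dropout are omitted, and in a PyTorch model each table would typically be an `nn.Embedding`.

```python
import random


def make_table(num_rows, dim, rng):
    """Create a small randomly initialized embedding table (illustrative only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


def embed_with_pos(token_ids, pos_ids, token_table, position_table, pos_table):
    """Sum token, position and POS-tag embeddings for every input position."""
    if len(token_ids) != len(pos_ids):
        raise ValueError("token_ids and pos_ids must have equal length")

    output = []
    for position, (token_id, pos_id) in enumerate(zip(token_ids, pos_ids)):
        vectors = (token_table[token_id], position_table[position], pos_table[pos_id])
        # Element-wise sum across the three embedding vectors.
        output.append([sum(values) for values in zip(*vectors)])
    return output


if __name__ == "__main__":
    rng = random.Random(0)
    dim = 4
    token_table = make_table(10, dim, rng)     # hypothetical vocabulary of 10 tokens
    position_table = make_table(8, dim, rng)   # up to 8 positions
    pos_table = make_table(5, dim, rng)        # hypothetical set of 5 POS tags
    embedded = embed_with_pos([3, 7, 1], [0, 2, 4], token_table, position_table, pos_table)
    print(len(embedded), len(embedded[0]))
```

Because the POS information enters through the embedding sum, the input sequence length stays the same as in the baseline.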

Token POS Attention

[Image: Token POS Attention model]
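Attention between token representations and POS representations can be sketched as cross-attention in which each token vector is a query over the POS vectors, which serve as keys and values. This is a simplified, single-head, scaled dot-product version without learned projections, written without PyTorch so it runs standalone. The function names and this particular arrangement of queries, keys, and values are assumptions, not a description of the repository's implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def cross_attention(token_reprs, pos_reprs):
    """Each token vector (query) attends over all POS vectors (keys and values)."""
    dim = len(token_reprs[0])
    attended = []
    for query in token_reprs:
        # Scaled dot-product score between this token and every POS vector.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in pos_reprs
        ]
        weights = softmax(scores)
        # Weighted average of the POS vectors.
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, pos_reprs)) for i in range(dim)]
        )
    return attended


if __name__ == "__main__":
    token_reprs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    pos_reprs = [[0.2, 0.8], [0.9, 0.1], [0.5, 0.5]]
    for row in cross_attention(token_reprs, pos_reprs):
        print(row)
```

The attended vectors have the same shape as the token representations, so they could be added to or concatenated with them before the tag classifier.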

Model Ensemble

[Image: Model Ensemble]
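The README does not state how the ensemble combines its member models. As one common possibility, the sketch below uses per-token majority voting over the tag sequences predicted by several models, with ties going to the model listed first. The function name `majority_vote` and the voting rule are assumptions for illustration.

```python
def majority_vote(predictions):
    """Combine per-model tag sequences by per-token majority vote.

    predictions: list of tag sequences, one per model, all of equal length.
    Ties are resolved in favor of the earliest model in the list.
    """
    if not predictions:
        raise ValueError("at least one model prediction is required")
    if len({len(sequence) for sequence in predictions}) != 1:
        raise ValueError("all prediction sequences must have equal length")

    ensembled = []
    for votes in zip(*predictions):
        # max() returns the first element with the highest count,
        # so the earliest model wins a tie.
        ensembled.append(max(votes, key=votes.count))
    return ensembled


if __name__ == "__main__":
    model_a = ["B-ORG", "O", "B-MISC", "O"]
    model_b = ["B-ORG", "O", "O", "O"]
    model_c = ["O", "O", "B-MISC", "O"]
    print(majority_vote([model_a, model_b, model_c]))
```

Voting per token is simple but can yield tag sequences that break the BIO scheme, so a real pipeline might vote on logits or repair the sequence afterwards.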

Result

Experimental results on the CoNLL-2003 dataset

[Image: experimental results on the CoNLL-2003 dataset]