awesome-chinese-ner

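Chinese Named Entity Recognition

Information Extraction with Large Language Models

Survey of information extraction with large language models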
Large Language Models for Generative Information Extraction: A Survey
https://arxiv.org/abs/2312.17617
https://github.com/quqxui/Awesome-LLM4IE-Papers

Further Reading

Survey of Chinese pre-trained models
https://www.jsjkx.com/CN/10.11896/jsjkx.211200018
Download links for Chinese pre-trained models
https://github.com/lonePatient/awesome-pretrained-chinese-nlp-models
Download links for Chinese word vectors
https://github.com/Embedding/Chinese-Word-Vectors
How to tune hyperparameters for BiLSTM-CRF?
https://arxiv.org/pdf/1707.06799.pdf
Information extraction (entities, relations, events) with ChatGPT
Zero-Shot Information Extraction via Chatting with ChatGPT
Demo: http://124.221.16.143:5000/
https://arxiv.org/pdf/2302.10205.pdf
https://github.com/cocacola-lab/ChatIE
GPT for Information Extraction
https://github.com/cocacola-lab/GPT4IE
Evaluation-of-ChatGPT-on-Information-Extraction
https://github.com/RidongHan/Evaluation-of-ChatGPT-on-Information-Extraction
This paper is also listed here as further reading:
Unified Text Structuralization with Instruction-tuned Language Models
2023
https://arxiv.org/pdf/2303.14956v2.pdf
GPT-NER: Named Entity Recognition via Large Language Models
2023
https://arxiv.org/pdf/2304.10428v1.pdf
https://github.com/ShuheWang1998/GPT-NER
EasyInstruct: An Easy-to-use Framework to Instruct Large Language Models
https://github.com/zjunlp/EasyInstruct
CODEIE: Large Code Generation Models are Better Few-Shot Information Extractors
Entity and relation extraction formulated as code generation
2023
https://arxiv.org/pdf/2305.05711v1.pdf
https://github.com/dasepli/CodeIE
PromptNER: Prompting For Named Entity Recognition
2023
https://arxiv.org/pdf/2305.15444v2.pdf

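Surveys of Named Entity Recognition (Chinese)

A Survey of the Latest Research Progress in Deep-Learning-Based Chinese Named Entity Recognition (基于深度学习的中文命名实体识别最新研究进展综述)
Journal of Chinese Information Processing, 2022
http://61.175.198.136:8083/rwt/125/http/GEZC6MJZFZZUPLSSGM3B/Qikan/Article/Detail?id=7107633068
A Survey of Named Entity Recognition Methods (命名实体识别方法研究综述)
Journal of Frontiers of Computer Science and Technology, 2022
http://fcst.ceaj.org/CN/10.3778/j.issn.1673-9418.2112109
A Survey of Chinese Named Entity Recognition (中文命名实体识别综述)
Journal of Frontiers of Computer Science and Technology, 2021
http://fcst.ceaj.org/CN/abstract/abstract2902.shtml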
Chinese named entity recognition: The state of the art
Neurocomputing 2022
link

Models

Chinese Sequence Labeling with Semi-Supervised Boundary-Aware Language Model Pre-training
COLING 2024
https://arxiv.org/pdf/2404.05560
Unified Lattice Graph Fusion for Chinese Named Entity Recognition
2024
https://arxiv.org/pdf/2312.16917.pdf
MRC-based Nested Medical NER with Co-prediction and Adaptive Pre-training
2024, medical entity recognition
https://arxiv.org/pdf/2403.15800.pdf
CHisIEC: An Information Extraction Corpus for Ancient Chinese History
2024, Classical Chinese entity recognition
https://arxiv.org/pdf/2403.15088.pdf
https://github.com/tangxuemei1995/CHisIEC
Attack Named Entity Recognition by Entity Boundary Interference
2023
https://arxiv.org/pdf/2305.05253v1.pdf
Token Relation Aware Chinese Named Entity Recognition
ACM Transactions on Asian and Low-Resource Language Information Processing 2023
https://dl.acm.org/doi/10.1145/3531534
WYWEB: A NLP Evaluation Benchmark For Classical Chinese
ACL 2023
https://arxiv.org/pdf/2305.14150
https://github.com/baudzhou/WYWEB
PUnifiedNER: a Prompting-based Unified NER System for Diverse Datasets
AAAI 2023
https://arxiv.org/pdf/2211.14838.pdf
https://github.com/GeorgeLuImmortal/PUnifiedNER
End-to-End Entity Detection with Proposer and Regressor
Borrows ideas from object detection
2022
https://arxiv.org/pdf/2210.10260v2.pdf
https://github.com/Rosenberg37/EntityDetection
DAMO-NLP at SemEval-2022 Task 11: A Knowledge-based System for Multilingual Named Entity Recognition
Multilingual named entity recognition
2022
https://arxiv.org/pdf/2203.00545.pdf
https://github.com/Alibaba-NLP/KB-NER
PCBERT: Parent and Child BERT for Chinese Few-shot NER
COLING 2022
https://aclanthology.org/2022.coling-1.192.pdf
GNN-SL: Sequence Labeling Based on Nearest Examples via GNN
2022
https://arxiv.org/pdf/2212.02017.pdf
https://github.com/ShuheWang1998/GNN-SL
EiCi: A New Method of Dynamic Embedding Incorporating Contextual Information in Chinese NER
The idea appears similar to that of AMBERT
2022
https://openreview.net/pdf?id=0TKg4UlnEEQ
Deep Span Representations for Named Entity Recognition
2022
https://arxiv.org/pdf/2210.04182v1.pdf
Mulco: Recognizing Chinese Nested Named Entities Through Multiple Scopes
2022
https://arxiv.org/pdf/2211.10854.pdf
Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling
EMNLP 2022
https://arxiv.org/pdf/2210.15231.pdf
http://github.com/modelscope/adaseq/examples/babert
Domain-Specific NER via Retrieving Correlated Samples
COLING 2022
https://arxiv.org/pdf/2208.12995.pdf
Two Languages Are Better than One: Bilingual Enhancement for Chinese Named Entity Recognition
COLING 2022
https://aclanthology.org/2022.coling-1.176.pdf
A hybrid Transformer approach for Chinese NER with features augmentation
Expert Syst. Appl 2022
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4087645
Adaptive Threshold Selective Self-Attention for Chinese NER
COLING 2022
https://aclanthology.org/2022.coling-1.157.pdf
Improving Chinese Named Entity Recognition by Search Engine Augmentation
2022
https://arxiv.org/pdf/2210.12662.pdf
Robust Self-Augmentation for Named Entity Recognition with Meta Reweighting
NAACL 2022
https://arxiv.org/pdf/2204.11406.pdf
https://github.com/LindgeW/MetaAug4NER
Boundary Smoothing for Named Entity Recognition
ACL 2022
https://arxiv.org/pdf/2204.12031v1.pdf
https://github.com/syuoni/eznlp
NFLAT: Non-Flat-Lattice Transformer for Chinese Named Entity Recognition
2022
https://arxiv.org/pdf/2205.05832.pdf
Unified Structure Generation for Universal Information Extraction
(Unifies entity recognition, relation extraction, event extraction, and sentiment analysis); Baidu UIE
ACL 2022
https://arxiv.org/pdf/2203.12277.pdf
https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/uie
https://github.com/universal-ie/UIE
The following paper is also a universal approach, but it is English-only, with no experiments on Chinese data:
- DEEPSTRUCT: Pretraining of Language Models for Structure Prediction
  2022
  https://arxiv.org/pdf/2205.10475v1.pdf
  https://github.com/cgraywang/deepstruct
Parallel Instance Query Network for Named Entity Recognition
2022
https://arxiv.org/pdf/2203.10545v1.pdf
Delving Deep into Regularity: A Simple but Effective Method for Chinese Named Entity Recognition
NAACL 2022
https://arxiv.org/pdf/2204.05544.pdf
TURNER: The Uncertainty-based Retrieval Framework for Chinese NER
2022
https://arxiv.org/pdf/2202.09022
kNN-NER: Named Entity Recognition with Nearest Neighbor Search
2022
https://arxiv.org/pdf/2203.17103
https://github.com/ShannonAI/KNN-NER
Unified Named Entity Recognition as Word-Word Relation Classification
AAAI 2022
https://arxiv.org/abs/2112.10070
https://github.com/ljynlp/W2NER.git
MarkBERT: Marking Word Boundaries Improves Chinese BERT
2022
https://arxiv.org/pdf/2203.06378
MFE-NER: Multi-feature Fusion Embedding for Chinese Named Entity Recognition
2021
https://arxiv.org/pdf/2109.07877
AdaK-NER: An Adaptive Top-K Approach for Named Entity Recognition with Incomplete Annotations
2021
https://arxiv.org/pdf/2109.05233
ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information
ACL 2021
https://arxiv.org/pdf/2106.16038
https://github.com/ShannonAI/ChineseBert
Enhanced Language Representation with Label Knowledge for Span Extraction
EMNLP 2021
https://aclanthology.org/2021.emnlp-main.379.pdf
https://github.com/Akeepers/LEAR
Lex-BERT: Enhancing BERT based NER with lexicons
ICLR 2021
https://arxiv.org/pdf/2101.00396v1.pdf
Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter
ACL 2021
https://arxiv.org/pdf/2105.07148.pdf
https://github.com/liuwei1206/LEBERT
MECT: Multi-Metadata Embedding based Cross-Transformer for Chinese Named Entity Recognition
ACL 2021
https://arxiv.org/pdf/2107.05418v1.pdf
https://github.com/CoderMusou/MECT4CNER
Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition
ACL 2021
https://arxiv.org/pdf/2105.06804v2.pdf
https://github.com/tricktreat/locate-and-label
Dynamic Modeling Cross- and Self-Lattice Attention Network for Chinese NER
AAAI 2021
https://ojs.aaai.org/index.php/AAAI/article/view/17706/17513
https://github.com/zs50910/DCSAN-for-Chinese-NER
Improving Named Entity Recognition with Attentive Ensemble of Syntactic Information
EMNLP 2020
https://arxiv.org/pdf/2010.15466
https://github.com/cuhksz-nlp/AESINER
ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations
ACL 2020
https://arxiv.org/pdf/1911.00720v1.pdf
https://github.com/sinovation/ZEN
A Unified MRC Framework for Named Entity Recognition
ACL 2020
https://arxiv.org/pdf/1910.11476v6.pdf
https://github.com/ShannonAI/mrc-for-flat-nested-ner
Simplify the Usage of Lexicon in Chinese NER
ACL 2020
https://arxiv.org/pdf/1908.05969.pdf
https://github.com/v-mipeng/LexiconAugmentedNER
A Boundary Regression Model for Nested Named Entity Recognition
2020
https://arxiv.org/pdf/2011.14330v3.pdf
https://github.com/yuelfei/BR
Dice Loss for Data-imbalanced NLP Tasks
ACL 2020
https://arxiv.org/pdf/1911.02855v3.pdf
https://github.com/ShannonAI/dice_loss_for_NLP
Porous Lattice Transformer Encoder for Chinese NER
COLING 2020
https://aclanthology.org/2020.coling-main.340.pdf
FLAT: Chinese NER Using Flat-Lattice Transformer
ACL 2020
https://arxiv.org/pdf/2004.11795v2.pdf
https://github.com/LeeSureman/Flat-Lattice-Transformer
FGN: Fusion Glyph Network for Chinese Named Entity Recognition
2020
https://arxiv.org/pdf/2001.05272v6.pdf
https://github.com/AidenHuen/FGN-NER
SLK-NER: Exploiting Second-order Lexicon Knowledge for Chinese NER
2020
https://arxiv.org/pdf/2007.08416v1.pdf
https://github.com/zerohd4869/SLK-NER
Entity Enhanced BERT Pre-training for Chinese NER
EMNLP 2020
https://aclanthology.org/2020.emnlp-main.518.pdf
https://github.com/jiachenwestlake/Entity_BERT
Named Entity Recognition for Social Media Texts with Semantic Augmentation
EMNLP 2020
https://arxiv.org/pdf/2010.15458v1.pdf
https://github.com/cuhksz-nlp/SANER
CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese
2020
https://arxiv.org/pdf/2001.04351v4.pdf
https://github.com/CLUEbenchmark/CLUENER2020
ERNIE: Enhanced Representation through Knowledge Integration
2019
https://arxiv.org/pdf/1904.09223v1.pdf
https://github.com/PaddlePaddle/ERNIE
TENER: Adapting Transformer Encoder for Named Entity Recognition
2019
https://arxiv.org/pdf/1911.04474v3.pdf
https://github.com/fastnlp/TENER
Chinese NER Using Lattice LSTM
ACL 2018
https://arxiv.org/pdf/1805.02023v4.pdf
https://github.com/jiesutd/LatticeLSTM
ERNIE 2.0: A Continual Pre-training Framework for Language Understanding
2019
https://arxiv.org/pdf/1907.12412v2.pdf
https://github.com/PaddlePaddle/ERNIE
Glyce: Glyph-vectors for Chinese Character Representations
NeurIPS 2019
https://arxiv.org/pdf/1901.10125v5.pdf
https://github.com/ShannonAI/glyce
CAN-NER: Convolutional Attention Network for Chinese Named Entity Recognition
NAACL 2019
https://arxiv.org/pdf/1904.02141v3.pdf
https://github.com/microsoft/vert-papers/tree/master/papers/CAN-NER
Neural Chinese Named Entity Recognition via CNN-LSTM-CRF and Joint Training with Word Segmentation
2019
https://arxiv.org/pdf/1905.01964v1.pdf
https://github.com/rxy007/cnn-lstm-crf
Chinese Named Entity Recognition Augmented with Lexicon Memory
2019
https://arxiv.org/pdf/1912.08282v2.pdf
https://github.com/dugu9sword/LEMON
Exploiting Multiple Embeddings for Chinese Named Entity Recognition
2019
https://arxiv.org/pdf/1908.10657v1.pdf
https://github.com/WHUIR/ME-CNER
Dependency-Guided LSTM-CRF for Named Entity Recognition
EMNLP-IJCNLP 2019
https://arxiv.org/pdf/1909.10148v1.pdf
https://github.com/allanj/ner_with_dependency
CNN-Based Chinese NER with Lexicon Rethinking
IJCAI 2019
https://www.ijcai.org/proceedings/2019/0692.pdf
Leverage Lexical Knowledge for Chinese Named Entity Recognition via Collaborative Graph Network
EMNLP-IJCNLP 2019
https://aclanthology.org/D19-1396.pdf
https://github.com/DianboWork/Graph4CNER
Distantly Supervised NER with Partial Annotation Learning and Reinforcement Learning
COLING 2018
https://aclanthology.org/C18-1183.pdf
https://github.com/rainarch/DSNER
Adversarial Transfer Learning for Chinese Named Entity Recognition with Self-Attention Mechanism
EMNLP 2018
https://aclanthology.org/D18-1017.pdf
https://github.com/CPF-NLPR/AT4ChineseNER

Non-Chinese Models

None of these include experiments on Chinese, but their ideas are worth borrowing:

DiffusionNER: Boundary Diffusion for Named Entity Recognition
2023
https://arxiv.org/pdf/2305.13298v1.pdf
https://github.com/tricktreat/DiffusionNER
Learning In-context Learning for Named Entity Recognition
ACL 2023
https://arxiv.org/pdf/2305.11038v1.pdf
https://github.com/chen700564/metaner-icl
UniEX: An Effective and Efficient Framework for Unified Information Extraction via a Span-extractive Perspective
2023
https://arxiv.org/pdf/2305.10306v1.pdf
Easy-to-Hard Learning for Information Extraction
2023
https://arxiv.org/pdf/2305.09193v1.pdf
https://github.com/DAMO-NLP-SG/IE-E2H
UTC-IE: A Unified Token-pair Classification Architecture for Information Extraction
2023
https://openreview.net/pdf?id=cRQwl-59CU8
https://github.com/yhcc/utcie
Deep Span Representations for Named Entity Recognition
Boundary Smoothing for Named Entity Recognition (same author)
ACL 2023
https://github.com/syuoni/eznlp
https://arxiv.org/pdf/2210.04182v2.pdf
NER-to-MRC: Named-Entity Recognition Completely Solving as Machine Reading Comprehension
2023
https://arxiv.org/pdf/2305.03970v1.pdf
RexUIE: A Recursive Method with Explicit Schema Instructor for Universal Information Extraction
Universal information extraction; compared against USM
2023
https://arxiv.org/pdf/2304.14770.pdf
InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction
(Another universal information extraction paper; compared against Baidu UIE and USM)
2023
https://arxiv.org/pdf/2304.08085v1.pdf
https://github.com/BeyonderXX/InstructUIE
Universal Information Extraction as Unified Semantic Matching
Universal information extraction: entities, relations, events (no experiments on Chinese data); abbreviated USM
AAAI 2023
https://arxiv.org/pdf/2301.03282.pdf
Multi-Task Transformer with Relation-Attention and Type-Attention for Named Entity Recognition
2023
https://arxiv.org/pdf/2303.10870v1.pdf
DEEPSTRUCT: Pretraining of Language Models for Structure Prediction
Universal information extraction
ACL 2022
https://arxiv.org/pdf/2205.10475v2.pdf
https://github.com/cgraywang/deepstruct
TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding Tag/Word Relations and More Fine-Grained Tags
Improves on the W2NER model
IEEE TASLP (Transactions on Audio, Speech and Language Processing)
https://arxiv.org/pdf/2211.00684.pdf
https://github.com/solkx/TOE
Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning
ICLR 2023
https://arxiv.org/pdf/2208.14565v2.pdf
https://github.com/microsoft/binder
One Model for All Domains: Collaborative Domain-Prefix Tuning for Cross-Domain NER
2023
https://arxiv.org/pdf/2301.10410v2.pdf
https://github.com/zjunlp/DeepKE/tree/main/example/ner/cross
QaNER: Prompting Question Answering Models for Few-shot Named Entity Recognition
2022
https://arxiv.org/pdf/2203.01543.pdf
A Unified Generative Framework for Various NER Subtasks
(Uses the BART generative model for named entity recognition)
ACL-IJCNLP 2021
https://arxiv.org/pdf/2106.01223.pdf
https://github.com/yhcc/BARTNER
(The following four papers cover prompt-based named entity recognition)
Template-Based Named Entity Recognition Using BART
https://arxiv.org/abs/2106.01760
https://github.com/Nealcly/templateNER
Good Examples Make A Faster Learner: Simple Demonstration-based Learning for Low-resource NER
https://arxiv.org/abs/2110.08454
https://github.com/INK-USC/fewNER
LightNER: A Lightweight Generative Framework with Prompt-guided Attention for Low-resource NER
https://arxiv.org/abs/2109.00720
https://github.com/zjunlp/DeepKE/blob/main/example/ner/few-shot/README_CN.md
Template-free Prompt Tuning for Few-shot NER
https://arxiv.org/abs/2109.13532
https://github.com/rtmaww/EntLM/

Datasets

Pre-trained Language Models

NER Tools

Stanza
LAC
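LTP (Harbin Institute of Technology)
HanLP
FoolNLTK
NLTK
BosonNLP
FudanNLP (Fudan University)
JioNLP
HarvestText
fastHan
EasyNLP (Alibaba)
PaddleNLP (Baidu)
AliceMind (Alibaba)
spaCy
DeepKE
CoreNLP (Java/Python)
OpenNLP (Java)
NLPIR
Trankit (multilingual)
HugIE (universal information extraction)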
EasyInstruct

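Competitions

Data released for the CCKS 2017 evaluation on Chinese electronic medical records.
Evaluation task 1: https://biendata.com/competition/CCKS2017_1/
Evaluation task 2: https://biendata.com/competition/CCKS2017_2/
Music-domain entity recognition task released by CCKS 2018.
Evaluation task: https://biendata.com/competition/CCKS2018_2/
(CoNLL 2002) Annotated Corpus for Named Entity Recognition.
Link: https://www.kaggle.com/abhinavwalia95/entity-annotated-corpus
Spoken language understanding evaluation for task-oriented dialogue systems, released by NLPCC 2018.
Link: http://tcci.ccf.org.cn/conference/2018/taskdata.php
Privacy information recognition in unstructured business text
Link: https://www.datafountain.cn/competitions/472/datasets
Product title recognition
Link: https://www.heywhale.com/home/competition/620b34ed28270b0017b823ad/content/3
CCKS 2021 Chinese NLP address element parsing
Link: https://tianchi.aliyun.com/competition/entrance/531900/introduction
CAIL 2022 information extraction track
Link: http://cail.cipsc.org.cn/task6.html?raceID=6&cail_tag=2022
2019 Internet finance new entity discovery
CHIP 2020: entity recognition challenge for traditional Chinese medicine package inserts
CHIP 2020: Chinese medical text named entity recognition
CCKS 2020: named entity recognition for test and evaluation
CCKS 2020: medical entity and event extraction for Chinese electronic medical records, subtask 1: medical named entity recognition
LAIC 2022: entity recognition in criminal facts
SemEval-2023 Task 2: Fine-grained Multilingual Named Entity Recognition (MultiCoNER 2)
New-Type Power System AI Application Competition, task 2: multi-mode information extraction for power production knowledge graphs
CCKS 2022 universal information extraction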
