/awesome-bert

BERT NLP papers, applications, and GitHub resources, including the newer XLNet: papers and GitHub projects related to BERT and XLNet.

This repository collects BERT-related resources.

AD: a repository of resources for graph convolutional networks at https://github.com/Jiakui/awesome-gcn.

Papers:

  1. arXiv:1810.04805, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Authors: Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

  2. arXiv:1812.06705, Conditional BERT Contextual Augmentation, Authors: Xing Wu, Shangwen Lv, Liangjun Zang, Jizhong Han, Songlin Hu

  3. arXiv:1812.03593, SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering, Authors: Chenguang Zhu, Michael Zeng, Xuedong Huang

  4. arXiv:1901.02860, Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context, Authors: Zihang Dai, Zhilin Yang, Yiming Yang, William W. Cohen, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov

  5. arXiv:1901.04085, Passage Re-ranking with BERT, Authors: Rodrigo Nogueira, Kyunghyun Cho

  6. arXiv:1902.02671, BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning, Authors: Asa Cooper Stickland, Iain Murray

  7. arXiv:1904.02232, BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis, Authors: Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

GitHub Repositories:

official implementation:

  1. google-research/bert, official TensorFlow code and pre-trained models for BERT

implementations of BERT besides TensorFlow:

  1. codertimo/BERT-pytorch, Google AI 2018 BERT PyTorch implementation

  2. huggingface/pytorch-pretrained-BERT, A PyTorch implementation of Google AI's BERT model with a script to load Google's pre-trained models

  3. dmlc/gluon-nlp, Gluon + MXNet implementation that reproduces BERT pretraining and finetuning on GLUE benchmark, SQuAD, etc.

  4. dbiir/UER-py, UER-py is a toolkit for pre-training on general-domain corpora and fine-tuning on downstream tasks. UER-py maintains model modularity and supports research extensibility. It facilitates the use of different pre-training models (e.g. BERT) and provides interfaces for users to extend.

  5. BrikerMan/Kashgari, Simple, Keras-powered multilingual NLP framework, allows you to build your models in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS) and text classification tasks. Includes BERT, GPT-2 and word2vec embedding.

  6. Separius/BERT-keras, Keras implementation of BERT with pre-trained weights

  7. soskek/bert-chainer, Chainer implementation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

  8. innodatalabs/tbert, PyTorch port of BERT ML model

  9. guotong1988/BERT-tensorflow, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

  10. dreamgonfly/BERT-pytorch, PyTorch implementation of BERT in "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

  11. CyberZHG/keras-bert, Implementation of BERT that could load official pre-trained models for feature extraction and prediction

  12. MaZhiyuanBUAA/bert-tf1.4.0, bert-tf1.4.0

  13. dhlee347/pytorchic-bert, PyTorch implementation of Google BERT

  14. kpot/keras-transformer, Keras library for building (Universal) Transformers, facilitating BERT and GPT models

  15. miroozyx/BERT_with_keras, A Keras version of Google's BERT model

  16. conda-forge/pytorch-pretrained-bert-feedstock, A conda-smithy repository for pytorch-pretrained-bert.

  17. Rshcaroline/BERT_Pytorch_fastNLP, A PyTorch & fastNLP implementation of Google AI's BERT model.

  18. nghuyong/ERNIE-Pytorch, ERNIE PyTorch version

improvements over BERT:

  1. thunlp/ERNIE, Source code and dataset for ACL 2019 paper "ERNIE: Enhanced Language Representation with Informative Entities", improves BERT with heterogeneous information fusion.

  2. PaddlePaddle/LARK, LAnguage Representations Kit, PaddlePaddle implementation of BERT. It also contains ERNIE, an improved version of BERT for Chinese NLP tasks.

  3. ymcui/Chinese-BERT-wwm, Pre-Training with Whole Word Masking for Chinese BERT https://arxiv.org/abs/1906.08101

  4. zihangdai/xlnet, XLNet: Generalized Autoregressive Pretraining for Language Understanding

  5. kimiyoung/transformer-xl, Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. This repository contains the code for the paper in both PyTorch and TensorFlow.

  6. GaoPeng97/transformer-xl-chinese, An attempt at applying Transformer-XL to Chinese text generation.

other resources for BERT:

  1. brightmart/bert_language_understanding, Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN

  2. Y1ran/NLP-BERT--ChineseVersion, Google's NLP model BERT: paper analysis and Python code

  3. yangbisheng2009/cn-bert, Applications of BERT in Chinese NLP: grammar checking

  4. JayYip/bert-multiple-gpu, A multiple GPU support version of BERT

  5. HighCWu/keras-bert-tpu, Implementation of BERT that could load official pre-trained models for feature extraction and prediction on TPU

  6. Willyoung2017/Bert_Attempt, PyTorch Pretrained Bert

  7. Pydataman/bert_examples, Some examples of BERT. run_classifier.py uses Google's BERT for the Quora Insincere Questions Classification binary classification competition; run_ner.py is a named entity recognizer written with BERT on the season 1 data of the Ruijin Hospital AI competition.

  8. guotong1988/BERT-chinese, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Chinese)

  9. zhongyunuestc/bert_multitask, Multi-task BERT

  10. Microsoft/AzureML-BERT, End-to-end walk through for fine-tuning BERT using Azure Machine Learning

  11. bigboNed3/bert_serving, export BERT model for serving

  12. yoheikikuta/bert-japanese, BERT with SentencePiece for Japanese text.

  13. whqwill/seq2seq-keyphrase-bert, add BERT to encoder part for https://github.com/memray/seq2seq-keyphrase-pytorch

  14. algteam/bert-examples, bert-demo

  15. cedrickchee/awesome-bert-nlp, A curated list of NLP resources focused on BERT, attention mechanism, Transformer networks, and transfer learning.

  16. cnfive/cnbert, Chinese-language comments explaining what the BERT code does

  17. brightmart/bert_customized, BERT with customized features

  18. JayYip/bert-multitask-learning, BERT for Multitask Learning

  19. yuanxiaosc/BERT_Paper_Chinese_Translation, Chinese translation of the paper "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". https://yuanxiaosc.github.io/2018/12/…

  20. yaserkl/BERTvsULMFIT, Comparing Text Classification results using BERT embedding and ULMFIT embedding

  21. kpot/keras-transformer, Keras library for building (Universal) Transformers, facilitating BERT and GPT models

  22. 1234560o/Bert-model-code-interpretation, A walkthrough of the data flow in modeling.py of the TensorFlow version of BERT

  23. cdathuraliya/bert-inference, A helper class for Google BERT (Devlin et al., 2018) to support online prediction and model pipelining.

  24. gameofdimension/java-bert-predict, Turns a pre-trained BERT checkpoint into a SavedModel for a feature extraction demo in Java

domain-specific BERT:

  1. allenai/scibert, A BERT model for scientific text. https://arxiv.org/abs/1903.10676

  2. MeRajat/SolvingAlmostAnythingWithBert, BioBert Pytorch

  3. kexinhuang12345/clinicalBERT, ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission https://arxiv.org/abs/1904.05342

  4. EmilyAlsentzer/clinicalBERT, repository for Publicly Available Clinical BERT Embeddings

BERT Deploy Tricks:

  1. zhihu/cuBERT, Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL

  2. xmxoxo/BERT-train2deploy, BERT model from training to deployment

  3. https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT, BERT For TensorFlow, This repository provides a script and recipe to train BERT to achieve state of the art accuracy, and is tested and maintained by NVIDIA.

BERT QA & RC task:

  1. sogou/SMRCToolkit, This toolkit was designed for the fast and efficient development of modern machine comprehension models, including both published models and original prototypes.

  2. benywon/ChineseBert, A Chinese BERT model specifically for question answering

  3. matthew-z/R-net, R-net in PyTorch, with BERT and ELMo

  4. nyu-dl/dl4marco-bert, Passage Re-ranking with BERT

  5. xzp27/BERT-for-Chinese-Question-Answering

  6. chiayewken/bert-qa, BERT for question answering starting with HotpotQA

  7. ankit-ai/BertQA-Attention-on-Steroids, BertQA - Attention on Steroids

  8. NoviScl/BERT-RACE, Based on the PyTorch implementation of BERT (https://github.com/huggingface/pytorch-pretrained-BERT); adapts the original BERT model to multiple-choice machine comprehension.

  9. eva-n27/BERT-for-Chinese-Question-Answering

  10. allenai/allennlp-bert-qa-wrapper, A simple wrapper on top of pretrained BERT-based QA models from pytorch-pretrained-bert to make AllenNLP model archives, so that you can serve demos from AllenNLP.

  11. edmondchensj/ChineseQA-with-BERT, EECS 496: Advanced Topics in Deep Learning Final Project: Chinese Question Answering with BERT (Baidu DuReader Dataset)

  12. graykode/toeicbert, TOEIC (Test of English for International Communication) solving using the pytorch-pretrained-BERT model.

  13. graykode/KorQuAD-beginner, https://github.com/graykode/KorQuAD-beginner

  14. krishna-sharma19/SBU-QA, This repository uses pre-trained BERT embeddings for transfer learning in the QA domain

BERT classification task:

  1. zhpmatrix/Kaggle-Quora-Insincere-Questions-Classification, Baseline for a new Kaggle competition: a BERT fine-tuning approach plus a tensor2tensor-based Transformer encoder approach

  2. maksna/bert-fine-tuning-for-chinese-multiclass-classification, Fine-tuning Google's pre-trained BERT model for Chinese multiclass classification

  3. NLPScott/bert-Chinese-classification-task, BERT Chinese classification in practice

  4. Socialbird-AILab/BERT-Classification-Tutorial

  5. fooSynaptic/BERT_classifer_trial, BERT trial for Chinese corpus classification

  6. xiaopingzhong/bert-finetune-for-classfier, Fine-tunes the BERT model and builds a custom dataset for classification

  7. pengming617/bert_classification, Text classification using the pre-trained Chinese BERT model

  8. xieyufei1993/Bert-Pytorch-Chinese-TextClassification, PyTorch BERT fine-tuning for Chinese text classification

  9. liyibo/text-classification-demos, Neural models for text classification in TensorFlow, such as CNN, DPCNN, fastText, BERT ...

  10. circlePi/BERT_Chinese_Text_Class_By_pytorch, A PyTorch implementation of Chinese text classification based on the BERT pre-trained model

  11. kaushaltrivedi/bert-toxic-comments-multilabel, Multilabel classification for the Toxic comments challenge using BERT

  12. lonePatient/BERT-chinese-text-classification-pytorch, This repo contains a PyTorch implementation of a pretrained BERT model for text classification.

BERT Sentiment Analysis:

  1. Chung-I/Douban-Sentiment-Analysis, Sentiment Analysis on Douban Movie Short Comments Dataset using BERT.

  2. lynnna-xu/bert_sa, bert sentiment analysis tensorflow serving with RESTful API

  3. HSLCY/ABSA-BERT-pair, Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence (NAACL 2019) https://arxiv.org/abs/1903.09588

  4. songyouwei/ABSA-PyTorch, Aspect Based Sentiment Analysis, PyTorch Implementations.

  5. howardhsu/BERT-for-RRC-ABSA, Code for the NAACL 2019 paper "BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis"

  6. brightmart/sentiment_analysis_fine_grain, Multi-label Classification with BERT; Fine Grained Sentiment Analysis from AI challenger

BERT NER task:

  1. zhpmatrix/bert-sequence-tagging, BERT-based Chinese sequence labeling

  2. kyzhouhzau/BERT-NER, Use Google BERT to do CoNLL-2003 NER

  3. king-menin/ner-bert, NER task solution (bert-Bi-LSTM-CRF) with Google BERT https://github.com/google-research.

  4. macanv/BERT-BiLSMT-CRF-NER, TensorFlow solution of the NER task using a BiLSTM-CRF model with Google BERT fine-tuning

  5. FuYanzhe2/Name-Entity-Recognition, LSTM-CRF, Lattice-CRF, BERT-NER, and a follow-up of recent NER papers

  6. mhcao916/NER_Based_on_BERT, A Chinese NER project based on Google's BERT model

  7. ProHiryu/bert-chinese-ner, Chinese NER using the pre-trained language model BERT

  8. sberbank-ai/ner-bert, BERT-NER (nert-bert) with Google BERT

  9. kyzhouhzau/Bert-BiLSTM-CRF, This model is based on bert-as-service. Model structure: BERT embedding, BiLSTM, CRF.

  10. Hoiy/berserker, Berserker (BERt chineSE woRd toKenizER) is a Chinese tokenizer built on top of Google's BERT model.

  11. Kyubyong/bert_ner, NER with BERT

  12. jiangpinglei/BERT_ChineseWordSegment, A Chinese word segmentation model based on BERT, F1 score 97%

  13. yanwii/ChineseNER, Chinese organization-name and person-name recognition (Chinese NER) based on Bi-GRU + CRF, with support for Google's BERT model

  14. lemonhu/NER-BERT-pytorch, PyTorch solution of NER task Using Google AI's pre-trained BERT model.

BERT Text Summarization Task:

  1. nlpyang/BertSum, Code for the paper "Fine-tune BERT for Extractive Summarization"

  2. santhoshkolloju/Abstractive-Summarization-With-Transfer-Learning, Abstractive summarisation using BERT as encoder and a Transformer decoder

  3. nayeon7lee/bert-summarization, Implementation of 'Pretraining-Based Natural Language Generation for Text Summarization', Paper: https://arxiv.org/pdf/1902.09243.pdf

  4. dmmiller612/lecture-summarizer, Lecture summarizer with BERT

BERT Text Generation Task:

  1. asyml/texar, Toolkit for Text Generation and Beyond, https://texar.io. Texar is a general-purpose text generation toolkit that also implements BERT for classification and, combined with Texar's other modules, for text generation applications.

  2. voidful/BertGenerate, Fine-tuning BERT for text generation; some experiments on text generation with BERT

  3. Tiiiger/bert_score, BERT score for language generation

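BERT Knowledge Graph Task:

  1. lvjianxin/Knowledge-extraction, Knowledge extraction for Chinese. Baseline: Bi-LSTM + CRF; upgraded version: BERT pre-training

  2. sakuranew/BERT-AttributeExtraction, Using BERT for attribute extraction in knowledge graphs, with fine-tuning and feature extraction methods, applied to extracting attributes of person entries from Baidu Baike.

  3. aditya-AI/Information-Retrieval-System-using-BERT

  4. jkszw2014/bert-kbqa-NLPCC2017, A trial of KBQA based on BERT for NLPCC2016/2017 Task 5 (Chinese knowledge-base question answering in practice with BERT; the code runs end to end). Blog introduction: https://blog.csdn.net/ai_1046067944/article/details/86707784

  5. yuanxiaosc/Schema-based-Knowledge-Extraction, Code for the http://lic2019.ccf.org.cn/kg information extraction task. Uses an end-to-end joint model for BERT-based entity extraction and relation extraction. (Code and usage instructions to be completed after the competition ends.)

  6. yuanxiaosc/Entity-Relation-Extraction, Pipeline-style entity and relation extraction based on TensorFlow; a solution to the information extraction task of the 2019 Language and Intelligence Challenge (code to be completed after the competition ends). Schema based Knowledge Extraction, SKE 2019 http://lic2019.ccf.org.cn

  7. WenRichard/KBQA-BERT, A knowledge-graph-based question answering system; BERT handles named entity recognition and sentence similarity, with online and outline modes. Blog introduction: https://zhuanlan.zhihu.com/p/62946533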

BERT Coreference Resolution:

  1. ianycxu/RGCN-with-BERT, Gated-Relational Graph Convolutional Networks (RGCN) with BERT for Coreference Resolution Task

  2. isabellebouchard/BERT_for_GAP-coreference, BERT finetuning for GAP unbiased pronoun resolution

BERT visualization toolkit:

  1. jessevig/bertviz, Tool for visualizing BERT's attention

BERT chatbot:

  1. GaoQ1/rasa_nlu_gq, Turn natural language into structured data (supports Chinese; several custom models are defined to support different scenarios and tasks)

  2. GaoQ1/rasa_chatbot_cn, A dialogue system demo built on rasa-nlu and rasa-core

  3. GaoQ1/rasa-bert-finetune, BERT fine-tuning with rasa-nlu support

  4. geodge831012/bert_robot, Training for question answering in an intelligent assistant, adapted from BERT model training

  5. yuanxiaosc/BERT-for-Sequence-Labeling-and-Text-Classification, Template code for using BERT for sequence labeling and text classification, in order to facilitate BERT for more tasks. Currently, the template code includes CoNLL-2003 named entity identification, Snips Slot Filling and Intent Prediction.

  6. guillaume-chevalier/ReuBERT, A question-answering chatbot, simply.

BERT language model and embedding:

  1. hanxiao/bert-as-service, Mapping a variable-length sentence to a fixed-length vector using a pretrained BERT model

  2. YC-wind/embedding_study, A study of generating character vectors with Chinese pre-trained models, testing how BERT and ELMo perform on Chinese

  3. Kyubyong/bert-token-embeddings, BERT Pretrained Token Embeddings

  4. xu-song/bert_as_language_model, BERT as language model, fork from https://github.com/google-research/bert

  5. yuanxiaosc/Deep_dynamic_word_representation, TensorFlow code and pre-trained models for deep dynamic word representation (DDWR). It combines the BERT model and ELMo's deep context word representation.

  6. imgarylai/bert-embedding, Token level embeddings from BERT model on mxnet and gluonnlp http://bert-embedding.readthedocs.io/

  7. terrifyzhao/bert-utils, Sentence vectors generated with BERT; text classification and text similarity computation with BERT

  8. fennuDetudou/BERT_implement, Text classification, similar-sentence detection, and part-of-speech tagging with the BERT model

  9. whqwill/seq2seq-keyphrase-bert, add BERT to encoder part for https://github.com/memray/seq2seq-keyphrase-pytorch

  10. charles9n/bert-sklearn, a sklearn wrapper for Google's BERT model

  11. NVIDIA/Megatron-LM, Ongoing research training transformer language models at scale, including: BERT

  12. hankcs/BERT-token-level-embedding, Generate BERT token level embedding without pain

  13. facebookresearch/LAMA, LAMA: LAnguage Model Analysis, LAMA is a set of connectors to pre-trained language models.

BERT Text Match:

  1. pengming617/bert_textMatching, A BERT-based semantic matching model built on the pre-trained Chinese model; the dataset is the official LCQMC data

  2. Brokenwind/BertSimilarity, Computing similarity of two sentences with Google's BERT algorithm

  3. policeme/chinese_bert_similarity, bert chinese similarity

  4. lonePatient/bert-sentence-similarity-pytorch, This repo contains a PyTorch implementation of a pretrained BERT model for sentence similarity task.

  5. nouhadziri/DialogEntailment, The implementation of the paper "Evaluating Coherence in Dialogue Systems using Entailment" https://arxiv.org/abs/1904.03371

BERT tutorials:

  1. graykode/nlp-tutorial, Natural Language Processing Tutorial for Deep Learning Researchers https://www.reddit.com/r/MachineLearn…

  2. dragen1860/TensorFlow-2.x-Tutorials, TensorFlow 2.x tutorials and examples, including CNN, RNN, GAN, Auto-Encoders, FasterRCNN, GPT, and BERT examples; introductory example code and hands-on tutorials for TF 2.0.