Aaroniley's Stars
facebookresearch/faiss
A library for efficient similarity search and clustering of dense vectors.
Lightning-AI/pytorch-lightning
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
UKPLab/sentence-transformers
State-of-the-Art Text Embeddings
flairNLP/flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
Embedding/Chinese-Word-Vectors
100+ pre-trained Chinese Word Vectors
brightmart/text_classification
All kinds of text classification models, and more, with deep learning
stanfordnlp/GloVe
Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
bentrevett/pytorch-seq2seq
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
649453932/Chinese-Text-Classification-Pytorch
Chinese text classification with TextCNN, TextRNN, FastText, TextRCNN, BiLSTM_Attention, DPCNN, and Transformer, based on PyTorch and ready to use out of the box.
nlp-with-transformers/notebooks
Jupyter notebooks for the Natural Language Processing with Transformers book
stanford-futuredata/ColBERT
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
datawhalechina/learn-nlp-with-transformers
We want to create a repo to illustrate the usage of transformers in Chinese
dmis-lab/biobert
Bioinformatics'2020: BioBERT: a pre-trained biomedical language representation model for biomedical text mining
facebookresearch/DPR
Dense Passage Retriever is a set of tools and models for the open-domain Q&A task.
castorini/pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
beir-cellar/beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
zhuifengshen/DingtalkChatbot
Python wrapper for DingTalk group custom robot messages
NTMC-Community/awesome-neural-models-for-semantic-match
A curated list of papers dedicated to neural text (semantic) matching.
ahangchen/torch_base
Quickly bring up your PyTorch project (a skeleton)
xiaoqian19940510/text-classification-surveys
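A collection of text classification resources, covering deep learning text classification models such as SpanBERT, ALBERT, RoBerta, Xlnet, MT-DNN, BERT, TextGCN, MGAN, TextCapsule, SGNN, SGM, LEAM, ULMFiT, DGCNN, ELMo, RAM, DeepMoji, IAN, DPCNN, TopicRNN, LSTMN, Multi-Task, HAN, CharCNN, Tree-LSTM, DAN, TextRCNN, Paragraph-Vec, TextCNN, DCNN, RNTN, MV-RNN, and RAE; shallow learning models such as LightGBM, SVM, XGboost, Random Forest, C4.5, CART, KNN, NB, and HMM; text classification datasets such as MR, SST, MPQA, IMDB, Yelp, 20NG, AG, R8, DBpedia, Ohsumed, SQuAD, SNLI, MNLI, MSRP, MRDA, RCV1, and AAPD; evaluation metrics such as accuracy, Precision, Recall, F1, EM, MRR, HL, Micro-F1, Macro-F1, and P@K; and technical challenges, including multi-label text classification.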
ahangchen/windy-afternoon
GitBook-based blog: Android, Linux, Deep Learning, Computer Vision
thunlp/SOS4NLP
Survey of Surveys for Natural Language Processing (SOS4NLP)
UKPLab/gpl
Powerful unsupervised domain adaptation method for dense retrieval. Requires only an unlabeled corpus and yields massive improvements: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577
zhihao-chen/QASystemOnMedicalKG
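A tutorial and implementation of a disease-centered medical knowledge graph and a QA system based on it. Covers knowledge graph construction and KG-based automatic question answering: it builds a medical-domain knowledge graph of moderate scale centered on diseases and uses that graph to provide automatic question answering and analysis services.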
sfzhou5678/PolyEncoder
An unofficial implementation of Poly-encoder (Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring)
Georgetown-IR-Lab/cedr
Code for CEDR: Contextualized Embeddings for Document Ranking, accepted at SIGIR 2019.
lijqhs/text-classification-cn
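Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pre-trained models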
RUCAIBox/PLMPapers
A paper list of pre-trained language models (PLMs).
pl8787/wsdm2021-beyond-prp-tutorial
WSDM2021 Tutorial: Beyond Probability Ranking Principle: Modeling the Dependencies among Documents
irgroup/trec-covid
As part of the TREC-COVID challenge, the Information Retrieval Research Group at Technische Hochschule Köln develops search and retrieval algorithms to support the search for relevant information on COVID-19.