information-retrieval
There are 2841 repositories under the information-retrieval topic.
scilla
Information Gathering tool - DNS / Subdomains / Ports / Directories enumeration
anserini
Anserini is a Lucene toolkit for reproducible information retrieval research
GNN-Recommender-Systems
An index of recommendation algorithms that are based on Graph Neural Networks. (TORS)
awesome-ai-web-search
List of software that allows searching the web with the assistance of AI: https://hf.co/spaces/felladrin/awesome-ai-web-search
pisa
PISA: Performant Indexes and Search for Academia
allRank
allRank is a framework for training learning-to-rank neural models based on PyTorch.
raft
RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
notes
Learn about Machine Learning and Artificial Intelligence
splade
SPLADE: sparse neural search (SIGIR21, SIGIR22)
sgpt
SGPT: GPT Sentence Embeddings for Semantic Search
RocketQA
🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.
awesome-neural-models-for-semantic-match
A curated list of papers dedicated to neural text (semantic) matching.
awesome-persian-nlp-ir
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
RAG-FiT
Framework for enhancing LLMs for RAG tasks using fine-tuning.
toolfront
Data retrieval for AI agents
talisman
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
EmbedAnything
Highly Performant, Modular and Production-ready Inference, Ingestion and Indexing built in Rust 🦀
tevatron
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.
teaching
Open-Source Information Retrieval Courses @ TU Wien
gritlm
Generative Representational Instruction Tuning
s3
[EMNLP'25] s3 - ⚡ Efficient & Effective Search Agent Training via RL for RAG (Verifier-Powered RLVR for Search)
awesome-pretrained-models-for-information-retrieval
A curated list of awesome papers related to pre-trained models for information retrieval (a.k.a., pretraining for IR).
RankGPT
Is ChatGPT Good at Search? LLMs as Re-Ranking Agent [EMNLP 2023 Outstanding Paper Award]
DeepRetrieval
[COLM'25] DeepRetrieval - 🔥 Training Search Agent with Retrieval Outcomes via Reinforcement Learning
cdQA
⛔ [NOT MAINTAINED] An End-To-End Closed Domain Question Answering System.
DensePhrases
[ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.org/abs/2012.12624
ranx
⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍
xyne
AI-first Search & Answer Engine for work. Open-source alternative to Glean.
pylate
Late Interaction Models Training & Retrieval
resin
Vector space index based search engine that's available as an HTTP service or as an embedded library.
sycamore
🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.
NLP-Projects
word2vec, sentence2vec, machine reading comprehension, dialog system, text classification, pretrained language model (e.g., XLNet, BERT, ELMo, GPT), sequence labeling, information retrieval, information extraction (e.g., entity, relation and event extraction), knowledge graph, text generation, network embedding
AnglE
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
Automated-Fact-Checking-Resources
Links to conference/journal publications in automated fact-checking (resources for the TACL22/EMNLP23 paper).
Deep-Semantic-Similarity-Model
My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.microsoft.com/pubs/226585/cikm2014_cdssm_final.pdf.