information-retrieval
There are 2,569 repositories under the information-retrieval topic.
JaidedAI/EasyOCR
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
deepset-ai/haystack
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
piskvorky/gensim
Topic Modelling for Humans
arc53/DocsGPT
Chatbot for documentation that allows you to chat with your data. Privately deployable, it provides AI knowledge sharing and integrates knowledge into your AI workflow.
weaviate/weaviate
Weaviate is an open-source vector database that stores both objects and vectors, combining vector search with structured filtering while offering the fault tolerance and scalability of a cloud-native database.
onyx-dot-app/onyx
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
neuml/txtai
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
marqo-ai/marqo
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
apache/lucene-solr
Apache Lucene and Solr open-source search software
KittyKatt/screenFetch
Fetches system/theme information in terminal for Linux desktop screenshots.
catalyst-team/catalyst
Accelerated deep learning R&D
langroid/langroid
Harness LLMs with Multi-Agent Programming
apache/lucene
Apache Lucene open-source search software
tensorflow/ranking
Learning to Rank in TensorFlow
naiveHobo/InvoiceNet
Deep neural network to extract intelligent information from invoice documents.
SylphAI-Inc/AdalFlow
AdalFlow: The library to build & auto-optimize LLM applications.
ashvardanian/StringZilla
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖
rajkumardusad/IP-Tracer
Track any IP address with IP-Tracer. IP-Tracer is developed for Linux and Termux. You can retrieve information about any IP address using IP-Tracer.
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
xlang-ai/instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
castorini/pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
shaoxiongji/knowledge-graphs
A collection of research on knowledge graphs
beir-cellar/beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use: evaluate your models across 15+ diverse IR datasets.
youngfish42/Awesome-FL
Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)
boudinfl/pke
Python Keyphrase Extraction module
th3unkn0n/TeleGram-Scraper
Telegram group scraper tool. Fetches all information about group members.
IntelLabs/fastRAG
Efficient Retrieval Augmentation and Generation Framework
illuin-tech/colpali
The code used to train and run inference with the ColPali architecture.
th3unkn0n/osi.ig
Information gathering for Instagram.
apache/solr
Apache Solr open-source search software
ashvardanian/SimSIMD
Up to 200x Faster Dot Products & Similarity Metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for AVX2, AVX-512, NEON, SVE, & SVE2 📐
dorianbrown/rank_bm25
A Collection of BM25 Algorithms in Python
castorini/anserini
Anserini is a Lucene toolkit for reproducible information retrieval research
tsinghua-fib-lab/GNN-Recommender-Systems
An index of recommendation algorithms that are based on Graph Neural Networks. (TORS)