information-retrieval
There are 2384 repositories under the information-retrieval topic.
JaidedAI/EasyOCR
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
piskvorky/gensim
Topic Modelling for Humans
deepset-ai/haystack
LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
weaviate/weaviate
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.
danswer-ai/danswer
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
neuml/txtai
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Unstructured-IO/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
apache/lucene-solr
Apache Lucene and Solr open-source search software
marqo-ai/marqo
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
KittyKatt/screenFetch
Fetches system/theme information in terminal for Linux desktop screenshots.
catalyst-team/catalyst
Accelerated deep learning R&D
tensorflow/ranking
Learning to Rank in TensorFlow
apache/lucene
Apache Lucene open-source search software
naiveHobo/InvoiceNet
Deep neural network to extract intelligent information from invoice documents.
rajkumardusad/IP-Tracer
Track any IP address with IP-Tracer. IP-Tracer is developed for Linux and Termux. You can retrieve information about any IP address using IP-Tracer.
ashvardanian/StringZilla
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging SWAR and SIMD on Arm Neon and x86 AVX2 & AVX-512-capable chips to accelerate search, sort, edit distances, alignment scores, etc 🦖
langroid/langroid
Harness LLMs with Multi-Agent Programming
xlang-ai/instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
shaoxiongji/knowledge-graphs
A collection of research on knowledge graphs
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
boudinfl/pke
Python Keyphrase Extraction module
castorini/pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
beir-cellar/beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
th3unkn0n/TeleGram-Scraper
Telegram group scraper tool. Fetches all information about group members.
youngfish42/Awesome-FL
Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)
th3unkn0n/osi.ig
Information gathering for Instagram.
apache/solr
Apache Solr open-source search software
IntelLabs/fastRAG
Efficient Retrieval Augmentation and Generation Framework
castorini/anserini
Anserini is a Lucene toolkit for reproducible information retrieval research
brylevkirill/notes
Learn about Machine Learning and Artificial Intelligence
tsinghua-fib-lab/GNN-Recommender-Systems
An index of recommendation algorithms that are based on Graph Neural Networks. (TORS)
dorianbrown/rank_bm25
A Collection of BM25 Algorithms in Python
pisa-engine/pisa
PISA: Performant Indexes and Search for Academia
Muennighoff/sgpt
SGPT: GPT Sentence Embeddings for Semantic Search