semantic-search

There are 1,192 repositories under the semantic-search topic.

  • generative-ai-for-beginners

    microsoft/generative-ai-for-beginners

    21 Lessons, Get Started Building with Generative AI

    Language:Jupyter Notebook102k89719453.9k
  • meilisearch

    meilisearch/meilisearch

    A lightning-fast search engine API bringing AI-powered hybrid search to your sites and applications.

    Language:Rust54.4k2912.3k2.3k
  • khoj

    khoj-ai/khoj

    Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

    Language:Python31.5k1545551.9k
  • typesense/typesense

    Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

    Language:C++24.7k1351.8k829
  • haystack

    deepset-ai/haystack

    AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

    Language:MDX23.3k1584.1k2.5k
  • arc53/DocsGPT

    Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.

    Language:Python17.3k965761.9k
  • weaviate

    weaviate/weaviate

    Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.

    Language:Go15k1312.6k1.1k
  • txtai

    neuml/txtai

    💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows

    Language:Python11.8k108927757
  • Olow304/memvid

    Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.

    Language:Python10.3k7770876
  • lancedb/lancedb

    Developer-friendly, embedded retrieval engine for multimodal AI. Search More; Manage Less.

    Language:Rust7.9k411.1k639
  • zilliztech/GPTCache

    Semantic cache for LLMs. Fully integrated with LangChain and llama_index.

    Language:Python7.8k54179566
  • Tencent/WeKnora

    LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.

    Language:Go7.4k45276842
  • superduper

    superduper-io/superduper

    Superduper: End-to-end framework for building custom AI applications and agents.

    Language:Python5.2k431.4k533
  • marqo

    marqo-ai/marqo

    Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai

    Language:Python5k37246218
  • zilliztech/claude-context

    Code search MCP for Claude Code. Make entire codebase the context for any coding agent.

    Language:TypeScript4.4k2783389
  • USearch

    unum-cloud/USearch

    Fast Open-Source Search & Clustering engine × for Vectors & Arbitrary Objects × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍

    Language:C++3.2k37241233
  • cocoindex

    cocoindex-io/cocoindex

    Data transformation framework for AI. Ultra performant, with incremental processing. 🌟 Star if you like it!

    Language:Rust3.2k20225260
  • awesome-generative-ai

    filipecalegario/awesome-generative-ai

    A curated list of Generative AI tools, works, models, and references

  • docarray

    docarray/docarray

    Represent, send, store and search multimodal data

    Language:Python3.1k46641231
  • ddangelov/Top2Vec

    Top2Vec learns jointly embedded topic, document and word vectors.

    Language:Python3.1k36338377
  • embeddings-benchmark/mteb

    MTEB: Massive Text Embedding Benchmark

    Language:Python3k171.3k501
  • pinecone-io/examples

    Jupyter Notebooks to help you get hands-on with Pinecone vector databases

    Language:Jupyter Notebook3k49471.1k
  • gmpetrov/databerry

    The no-code platform for building custom LLM Agents

  • mazzzystar/Queryable

    Run OpenAI's CLIP and Apple's MobileCLIP model on iOS to search photos.

    Language:Swift2.9k1041444
  • rom1504/clip-retrieval

    Easily compute clip embeddings and build a clip retrieval system with them

    Language:Jupyter Notebook2.7k24236238
  • semantra

    freedmand/semantra

    Multi-tool for semantic search

    Language:Python2.7k3363156
  • milvus-io/bootcamp

    Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.

    Language:Jupyter Notebook2.3k35269657
  • kernel-memory

    microsoft/kernel-memory

    RAG architecture: index and query any data using LLM and natural language, track sources, show citations, asynchronous memory patterns.

    Language:C#2.1k510396
  • NotJoeMartinez/yt-fts

    YouTube Full Text Search - Search all of YouTube from the command line

    Language:Python1.8k119395
  • fastRAG

    IntelLabs/fastRAG

    Efficient Retrieval Augmentation and Generation Framework

    Language:Python1.7k1636164
  • frutik/awesome-search

    Awesome Search - this is all about the (e-commerce, but not only) search and its awesomeness

    Language:HTML1.5k6415130
  • superlinked/superlinked

    Superlinked is a Python framework for AI Engineers building high-performance search & recommendation applications that combine structured and unstructured data.

    Language:Jupyter Notebook1.4k2857109
  • aws-genai-llm-chatbot

    aws-samples/aws-genai-llm-chatbot

    A modular and comprehensive solution to deploy a Multi-LLM and Multi-RAG powered chatbot (Amazon Bedrock, Anthropic, HuggingFace, OpenAI, Meta, AI21, Cohere, Mistral) using AWS CDK on AWS

    Language:TypeScript1.3k26313425
  • lotus-data/lotus

    Use LOTUS to process all of your datasets with LLMs and embeddings. Enjoy up to 1000x speedups with fast, accurate query processing, that's as simple as writing Pandas code

    Language:Python1.3k1563114
  • gnes

    gnes-ai/gnes

    GNES is Generic Neural Elastic Search, a cloud-native semantic search system based on deep neural network.

    Language:Python1.3k5223210
  • UForm

    unum-cloud/UForm

    Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

    Language:Python1.2k153876