embedding

There are 691 repositories under the embedding topic.

  • chatchat-space/Langchain-Chatchat

    Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent app built with Langchain on language models such as ChatGLM, Qwen and Llama

    Language:Python36.5k2904.2k6.1k
  • PaddlePaddle/PaddleNLP

    Easy-to-use and powerful LLM and SLM library with awesome model zoo.

    Language:Python12.8k963.8k3.1k
  • Embedding/Chinese-Word-Vectors

    100+ pre-trained Chinese word vectors

    Language:Python12.1k2821692.3k
  • modelscope/ms-swift

    Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Phi4, ...) (AAAI 2025).

    Language:Python10.9k463.8k953
  • myreader-io/myGPTReader

    A community-driven way to read and chat with AI bots - powered by chatGPT.

    Language:Python4.4k4835451
  • zilliztech/claude-context

    Code search MCP for Claude Code. Make entire codebase the context for any coding agent.

    Language:TypeScript4.4k2783389
  • infiniflow/infinity

    The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vector, sparse vector, tensor (multi-vector), and full-text.

    Language:C++4.2k43576392
  • apple/embedding-atlas

    Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.

    Language:TypeScript4k3333203
  • adambielski/siamese-triplet

    Siamese and triplet networks with online pair/triplet mining in PyTorch

    Language:Python3.2k4769633
  • groupultra/telegram-search

    🔍 Search your Telegram messages wisely

    Language:TypeScript3.1k1272213
  • run-llama/LlamaIndexTS

    Data framework for your LLM applications. Focus on server side solution

    Language:TypeScript2.9k22457492
  • devflowinc/trieve

    All-in-one platform for search, recommendations, RAG, and analytics offered via API

    Language:Rust2.6k141.3k229
  • benedekrozemberczki/awesome-community-detection

    A curated list of community detection research papers with implementations.

    Language:Python2.4k1078360
  • OpenBMB/UltraRAG

    UltraRAG 2.0: Less Code, Lower Barrier, Faster Deployment! MCP-based low-code RAG framework, enabling researchers to build complex pipelines to creative innovation.

    Language:Python1.8k2156153
  • withcatai/node-llama-cpp

    Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema on the model output on the generation level

    Language:TypeScript1.7k20130155
  • pavlin-policar/openTSNE

    Extensible, parallel implementations of t-SNE

    Language:Python1.6k19143175
  • datawhalechina/all-in-rag

    🔍 Hands-on LLM application development, part 1: a full-stack guide to RAG techniques. Read online: https://datawhalechina.github.io/all-in-rag/

    Language:Python1.4k426671
  • vercel/modelfusion

    The TypeScript library for building AI applications.

    Language:TypeScript1.3k126691
  • onestardao/WFGY

    WFGY 2.0. Semantic Reasoning Engine for LLMs (MIT). Fixes RAG/OCR drift, collapse & “ghost matches” via symbolic overlays + logic patches. Autoboot; OneLine & Flagship. ⭐ Star if you explore semantic RAG or hallucination mitigation.

    Language:Python1.2k226108
  • myscale/MyScaleDB

    A @ClickHouse fork that supports high-performance vector search and full-text search.

    Language:C++994121767
  • SkywalkerDarren/chatWeb

    ChatWeb can crawl web pages, read PDF, DOCX, TXT, and extract the main content, then answer your questions based on the content, or summarize the key points.

    Language:Python9111915137
  • zhezhaoa/ngram2vec

    Four word embedding models implemented in Python. Supporting arbitrary context features

    Language:Python8496223172
  • ContextualAI/gritlm

    Generative Representational Instruction Tuning

    Language:Jupyter Notebook67795848
  • OysterQAQ/ACG2vec

    ACG2vec (Anime Comics Games to vector) is committed to creating a playground that combines ACG and deep learning (semantic text retrieval, search by image, semantic image search, image super-resolution, recommender systems).

  • llm-tools/embedJs

    A NodeJS RAG framework to easily work with LLMs and embeddings

    Language:TypeScript568912270
  • shawroad/NLP_pytorch_project

    Embedding, NMT, Text_Classification, Text_Generation, NER etc.

    Language:Python566717125
  • cvxgrp/pymde

    Minimum-distortion embedding with PyTorch

    Language:Python56285027
  • marl/openl3

    OpenL3: Open-source deep audio and image embeddings

    Language:Jupyter Notebook55396861
  • cvqluu/Angular-Penalty-Softmax-Losses-Pytorch

    Angular penalty loss functions in Pytorch (ArcFace, SphereFace, Additive Margin, CosFace)

    Language:Python494111893
  • TilmanGriesel/chipper

    ✨ AI interface for tinkerers (Ollama, Haystack RAG, Python)

    Language:Python4716844
  • TIGER-AI-Lab/VLM2Vec

    This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]

    Language:Python4601014143
  • luyug/GradCache

    Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint

    Language:Python41583026
  • guangzhengli/vectorhub

    Quickly and easily build AI website or application by using embeddings!

    Language:TypeScript3864742
  • Aquila-Network/aquila

    An easy to use Neural Search Engine. Index latent vectors along with JSON metadata and do efficient k-NN search.

    Language:HTML380204225
  • PaddlePaddle/ERNIE-SDK

    ERNIE Bot Agent is a Large Language Model (LLM) Agent Framework, powered by the advanced capabilities of ERNIE Bot and the platform resources of Baidu AI Studio.

    Language:Jupyter Notebook376105354
  • LnYo-Cly/ai4j

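    A Java SDK for quickly integrating large AI models into applications. It brings together multiple platforms, such as OpenAi, Zhipu (ChatGLM), DeepSeek, Moonshot (Kimi), Tencent Hunyuan and 01.AI, behind a unified input/output format aligned with OpenAi to remove their differences. It streamlines function calling (Tool Call) and RAG calls, supports a vector database (Pinecone), includes built-in web-search augmentation, and supports JDK 1.8, so users can integrate AI quickly.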

    Language:Java36485049