embedding-models

There are 252 repositories under the embedding-models topic.

  • Separius/awesome-sentence-embedding

    A curated list of pretrained sentence and word embedding models

    Language:Python2.3k7719262
  • Hironsan/awesome-embedding-models

    A curated list of awesome embedding models tutorials, projects and communities.

    Language:Jupyter Notebook1.8k1064250
  • StarlightSearch/EmbedAnything

    Highly Performant, Modular, Memory Safe and Production-ready Inference, Ingestion and Indexing built in Rust 🦀

    Language:Rust716105663
  • ContextualAI/gritlm

    Generative Representational Instruction Tuning

    Language:Jupyter Notebook672115849
  • Sujit-O/pykg2vec

    Python library for knowledge graph embedding and representation learning.

    Language:Python6181298114
  • marl/openl3

    OpenL3: Open-source deep audio and image embeddings

    Language:Jupyter Notebook540106861
  • Denis2054/RAG-Driven-Generative-AI

    This repository provides programs to build Retrieval Augmented Generation (RAG) code for Generative AI with LlamaIndex, Deep Lake, and Pinecone leveraging the power of OpenAI and Hugging Face models for generation and evaluation.

    Language:Jupyter Notebook502103162
  • BBC-Esq/VectorDB-Plugin

    Plugin that lets you ask questions about your documents including audio and video files.

    Language:Python348737045
  • mana-ysh/knowledge-graph-embeddings

    Implementations of Embedding-based methods for Knowledge Base Completion tasks

    Language:Python259241363
  • CVxTz/image_search_engine

    Image search engine

    Language:Python23114740
  • lgalke/vec4ir

    Word Embeddings for Information Retrieval

    Language:Python22512641
  • yusufhilmi/client-vector-search

    A client side vector search library that can embed, store, search, and cache vectors. Works on the browser and node. It outperforms OpenAI's text-embedding-ada-002 and is way faster than Pinecone and other VectorDBs.

    Language:TypeScript2175615
  • spcl/ncc

    Neural Code Comprehension: A Learnable Representation of Code Semantics

    Language:Python213123151
  • akutuzov/webvectors

    Web-ify your word2vec: framework to serve distributional semantic models online

    Language:Python201123747
  • mangopy/tool-retrieval-benchmark

    Official code for ACL2025 "🔍 Retrieval Models Aren’t Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models"

    Language:JavaScript195104
  • jgraving/selfsne

    Self-Supervised Noise Embeddings (Self-SNE)

    Language:Jupyter Notebook15832012
  • ALucek/QuicKB

    Optimize Document Retrieval with Fine-Tuned KnowledgeBases

    Language:Python1533029
  • formath/tensorflow-predictor-cpp

    TensorFlow prediction using the C++ API

    Language:Python15071159
  • shobrook/weightgain

    Train an adapter for any embedding model in under a minute

    Language:Python126206
  • p768lwy3/torecsys

    ToR[e]cSys is a PyTorch Framework to implement recommendation system algorithms, including but not limited to click-through-rate (CTR) prediction, learning-to-rank (LTR), and Matrix/Tensor Embedding. The project's objective is to develop an ecosystem for experimenting, sharing, reproducing, and deploying in the real world smoothly and easily.

    Language:Python1044317
  • su-park/mteb_ko_leaderboard

    Korean text embedding model leaderboard

  • HITsz-TMG/KaLM-Embedding

    Code for KaLM-Embedding models

    Language:Python91336
  • shamspias/langchain-chat

    langchain-chat is an AI-driven Q&A system that leverages OpenAI's GPT-4 model and FAISS for efficient document indexing. It loads and splits documents from websites or PDFs, remembers conversations, and provides accurate, context-aware answers based on the indexed data. Easy to set up and extend.

    Language:Python887117
  • kaushalshetty/Positional-Encoding

    Encoding position with the word embeddings.

    Language:Jupyter Notebook835213
  • ikergarcia1996/MetaVec

    A monolingual and cross-lingual meta-embedding generation and evaluation framework

    Language:Python80416
  • D2KLab/entity2vec

    Generates a set of property-specific entity embeddings from knowledge graphs using node2vec

    Language:Python7711324
  • ART-Group-it/KERMIT

    🐸 KERMIT - A lightweight library to encode and interpret Universal Syntactic Embeddings

    Language:JavaScript58619
  • nsrinidhibhat/gradio_RAG

    Code and resources showcasing the Retrieval-Augmented Generation (RAG) technique, a solution for enhancing data freshness in Large Language Models (LLMs). Incorporate up-to-date external knowledge into LLM-generated responses. Additionally, this repository includes a Gradio-based user interface for seamless model deployment.

    Language:Python562014
  • datquocnguyen/STransE

    STransE: a novel embedding model of entities and relationships in knowledge bases (NAACL 2016)

    Language:C++546316
  • oracle/ai-optimizer

    GenAI/RAG Optimizer and Toolkit for experimentation using Oracle Database AI Vector Search

    Language:Python531413028
  • RoyZhengGao/edge2vec

    Learning node representation using edge semantics

    Language:Python534722
  • Glaciohound/VCML

    PyTorch implementation of the paper "Visual Concept-Metaconcept Learner", NeurIPS 2019

    Language:Python49388
  • alisonbma/aiSFX

    Representation Learning for the Automatic Indexing of Sound Effects Libraries (ISMIR 2022): Deep audio embeddings pre-trained on UCS & Non-UCS-compliant datasets.

    Language:Python45424
  • worldbank/GISTEmbed

    GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings

    Language:Python44433
  • machinelearningZH/semantic-search-eval

    A framework for evaluating semantic search across custom datasets, metrics, and embedding backends.

    Language:Python356
  • friedrichor/UNITE

    Official code for "Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval"

    Language:Python33