xiaofan-luan's Stars
zilliztech/milvus-operator
The Kubernetes Operator of Milvus.
evolutionaryscale/esm
facebookincubator/nimble
New file format for storage of large columnar datasets.
ibireme/yyjson
The fastest JSON library in C
google/highway
Performance-portable, length-agnostic SIMD with runtime dispatch
oshizo/JapaneseEmbeddingEval
tmc/langchaingo
LangChain for Go, the easiest way to write LLM-based programs in Go
LinkedInAttic/scanns
A scalable nearest neighbor search library in Apache Spark
Filimoa/open-parse
Improved file parsing for LLMs
danswer-ai/danswer
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
milvus-io/milvus-model
The embedding/reranking model zoo that helps users convert their unstructured data into embeddings
milvus-io/milvus
A cloud-native vector database, storage for next generation AI applications
redpanda-data/redpanda
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
superlinked/VectorHub
VectorHub is a free, open-source learning website for people (software developers to senior ML architects) interested in adding vector retrieval to their ML stack.
whyhow-ai/rule-based-retrieval
The Rule-based Retrieval package is a Python package that enables you to create and manage Retrieval Augmented Generation (RAG) applications with advanced filtering capabilities. It seamlessly integrates with OpenAI for text generation and Pinecone or Milvus for efficient vector database management.
rapidsai/cuvs
cuVS - a library for vector search and clustering on the GPU
hopshadoop/hops
Hops Hadoop is a distribution of Apache Hadoop with distributed metadata.
zilliztech/Retriever-for-GPTs
An external retriever for GPTs implemented with Zilliz Cloud Pipelines, a more flexible and economical alternative to the default GPTs knowledge base.
naver/splade
SPLADE: sparse neural search (SIGIR21, SIGIR22)
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
google/space
Unified storage framework for the entire machine learning lifecycle
castorini/pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
zilliztech/spark-milvus
ContextualAI/gritlm
Generative Representational Instruction Tuning
AnswerDotAI/RAGatouille
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
emilymaier/cmemory
Memory management tools for cgo.
leptonai/search_with_lepton
Building a quick conversation-based search demo with Lepton AI.
intel/ScalableVectorSearch
wxywb/history_rag
Xingrun-Xing/BiPFT
This is the implementation of our AAAI2024 paper: BiPFT: Binary Pre-trained Foundation Transformer with Low-rank Estimation of Binarization Residual Polynomials.