Jeffwan's Stars
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
chenfei-wu/TaskMatrix
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
eosphoros-ai/DB-GPT
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
cpacker/MemGPT
Letta (fka MemGPT) is a framework for creating stateful LLM services.
huggingface/text-generation-inference
Large Language Model Text Generation Inference
InternLM/InternLM
Official release of InternLM2.5 base and chat models. 1M context support
xlang-ai/OpenAgents
[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild
Zjh-819/LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
predibase/lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
S-LoRA/S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
spegel-org/spegel
Stateless cluster local OCI registry mirror.
jbilcke-hf/ai-comic-factory
Generate comic panels using an LLM + SDXL. Powered by Hugging Face 🤗
v6d-io/v6d
vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)
efeslab/Nanoflow
A throughput-oriented high-performance serving framework for LLMs
BeachWang/DAIL-SQL
An efficient and effective few-shot NL2SQL method on GPT-4.
conceptofmind/toolformer
DataDog/watermarkpodautoscaler
Custom controller that extends the Horizontal Pod Autoscaler
lambda7xx/awesome-AI-system
Papers and their code for AI systems
ServerlessLLM/ServerlessLLM
Scalable and Efficient Serverless Deployment for Large AI Models.
FlagOpen/FlagInstruct
ACL2023-Retrieval-LM/ACL2023-Retrieval-LM.github.io
https://acl2023-retrieval-lm.github.io/
kserve/modelmesh
Distributed Model Serving Framework
allegroai/clearml-serving
ClearML - Model-Serving Orchestration and Repository Solution
AlibabaPAI/llumnix
Efficient and easy multi-instance LLM serving
Hsword/SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
CloudNativeGame/aigc-gateway
A user gateway that provides serverless AIGC experience.
volcengine/veTurboIO
A library developed by Volcano Engine for high-performance reading and writing of PyTorch model files.
LLMFlow/LLMFlow
Easy, Fast, Secure and Cost-Efficient LLM Pipelines to generate ChatGPT-like private domain models and knowledgeable agents for your organization.
nianhuatiandi/Fast-Distributed-Inference-Serving-for-Large-Language-Models
Fast Distributed Inference Serving for Large Language Models