jealous1989's Stars
CompVis/stable-diffusion
A latent text-to-image diffusion model
meta-llama/llama
Inference code for Llama models
dair-ai/Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks, and resources for prompt engineering
THUDM/ChatGLM-6B
ChatGLM-6B: An Open-Source Bilingual Dialogue Language Model
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
langflow-ai/langflow
Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.
ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
BlinkDL/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
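To make the "RNN with transformer-level performance" claim concrete, here is a heavily simplified, numerically naive sketch of an RWKV-style time-mixing recurrence. It is not the project's actual kernels; the per-channel decay w and current-token bonus u are assumptions for illustration only.

```python
import numpy as np

def wkv(k, v, w, u):
    """Toy RWKV-style recurrence: each output is an exponentially decayed
    weighted average of past values, kept in O(d) running state.
    k, v: (T, d) key/value sequences; w, u: (d,) decay rate and current-token bonus."""
    T, d = v.shape
    num, den = np.zeros(d), np.zeros(d)
    out = np.empty_like(v)
    for t in range(T):
        cur = np.exp(u + k[t])                         # extra weight for the current token
        out[t] = (num + cur * v[t]) / (den + cur)
        num = np.exp(-w) * num + np.exp(k[t]) * v[t]   # decay the state, then absorb token t
        den = np.exp(-w) * den + np.exp(k[t])
    return out

out = wkv(np.random.randn(8, 4), np.random.randn(8, 4),
          w=np.ones(4), u=np.zeros(4))
```

Because each step only needs the running (num, den) state, inference is O(1) per token in sequence length, which is the property the description highlights.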
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
mlfoundations/open_clip
An open source implementation of CLIP.
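A minimal zero-shot classification sketch with open_clip, along the lines of its README; the checkpoint tag and the image path "cat.png" are examples, not requirements.

```python
import torch
from PIL import Image
import open_clip

# Load a pretrained CLIP (model and checkpoint names follow open_clip's naming)
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

image = preprocess(Image.open("cat.png")).unsqueeze(0)   # example image path
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    # cosine-similarity logits turned into probabilities over the text prompts
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
print(probs)
```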
taskflow/taskflow
A General-purpose Task-parallel Programming System using Modern C++
lllyasviel/stable-diffusion-webui-forge
A platform built on top of Stable Diffusion WebUI to make development easier, optimize resource management, and speed up inference
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inference solution.
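As an example, here is a hedged client-side sketch using the tritonclient Python package; the model name "resnet" and the tensor names INPUT0/OUTPUT0 are placeholders that must match whatever model repository the server was started with.

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a Triton server exposing its HTTP endpoint on port 8000
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the request tensor; name, shape, and dtype must match the model's config
inp = httpclient.InferInput("INPUT0", [1, 3, 224, 224], "FP32")
inp.set_data_from_numpy(np.random.rand(1, 3, 224, 224).astype(np.float32))

result = client.infer(model_name="resnet", inputs=[inp])
print(result.as_numpy("OUTPUT0").shape)
```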
bentoml/BentoML
The easiest way to serve AI apps and models: build model inference APIs, job queues, LLM apps, multi-model pipelines, and more!
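A minimal sketch assuming BentoML's newer (1.2+) service/API decorator style; the Echo class and its logic are hypothetical stand-ins for a real model.

```python
import bentoml

@bentoml.service
class Echo:
    # A real service would load a model in __init__ and run it here.
    @bentoml.api
    def predict(self, text: str) -> str:
        return text.upper()
```

Saved as service.py, this would typically be served locally with `bentoml serve service:Echo`, which exposes predict as an HTTP endpoint.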
kohya-ss/sd-scripts
Training scripts for Stable Diffusion models (LoRA, DreamBooth, and fine-tuning)
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
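A minimal offline-inference sketch with LMDeploy's pipeline API; the checkpoint name is just an example and must be a model LMDeploy supports.

```python
from lmdeploy import pipeline

# The model name is an example; any supported Hugging Face model path works
pipe = pipeline("internlm/internlm2-chat-7b")
responses = pipe(["Hi, please introduce yourself", "What is LoRA?"])
print(responses)
```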
openkruise/kruise
Automated management of large-scale applications on Kubernetes (incubating project under CNCF)
luosiallen/latent-consistency-model
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
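A hedged sketch of few-step sampling with a latent consistency model, using the separate Hugging Face diffusers integration rather than this repo's own code; the checkpoint name is the one released alongside the paper and may differ for other models.

```python
import torch
from diffusers import DiffusionPipeline

# LCM checkpoint from the release; 4 steps is the typical few-step setting
pipe = DiffusionPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7")
pipe.to("cuda" if torch.cuda.is_available() else "cpu")

image = pipe("a photo of an astronaut riding a horse on the moon",
             num_inference_steps=4, guidance_scale=8.0).images[0]
image.save("lcm_sample.png")
```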
shawwn/llama-dl
High-speed download of LLaMA, Facebook's 65B-parameter large language model
DefTruth/Awesome-LLM-Inference
📖 A curated list of Awesome LLM Inference papers with code, covering FlashAttention, PagedAttention, parallelism, and more. 🎉🎉
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
predibase/lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
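LoRAX serves one base model and swaps LoRA adapters per request. A hedged sketch of what a request might look like against a locally running server, assuming a TGI-style /generate endpoint with an adapter_id parameter (both are assumptions based on LoRAX's text-generation-inference lineage):

```python
import requests

# adapter_id selects which fine-tuned LoRA to apply for this single request;
# the endpoint shape, port, and adapter name here are illustrative only.
resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "Explain LoRA in one sentence.",
        "parameters": {"adapter_id": "my-org/my-lora-adapter", "max_new_tokens": 64},
    },
    timeout=60,
)
print(resp.json())
```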
ChunelFeng/CGraph
A commonly used C++ DAG framework: a general-purpose, dependency-free, cross-platform, flow-graph-based parallel computing framework (listed in awesome-cpp). Stars, forks, and discussion welcome.
run-house/runhouse
Dispatch and distribute your ML training to "serverless" clusters in Python, like a PyTorch for ML infrastructure. Iterable, debuggable, multi-cloud/on-prem, and identical across research and production.
ChenRocks/UNITER
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
ChunelFeng/CThreadPool
A simple, easy-to-use C++ thread pool: high-performance and cross-platform. Stars & forks welcome.
NetEase-FuXi/EET
Easy and Efficient Transformer: a scalable inference solution for large NLP models
pybind/pybind11_bazel
Bazel wrapper around the pybind11 repository
tonyduan/diffusion
From-scratch diffusion model implemented in PyTorch.
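As a companion to a from-scratch implementation like this, here is a hypothetical standalone sketch of the DDPM forward (noising) process such code typically builds on; the schedule constants are the usual linear-beta defaults, not necessarily this repo's.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative product \bar{alpha}_t

def q_sample(x0, t, noise):
    """Sample x_t ~ q(x_t | x_0) = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps."""
    a = alphas_bar[t].sqrt().view(-1, 1, 1, 1)
    s = (1.0 - alphas_bar[t]).sqrt().view(-1, 1, 1, 1)
    return a * x0 + s * noise

x0 = torch.randn(4, 3, 32, 32)        # stand-in batch of images
t = torch.randint(0, T, (4,))
noise = torch.randn_like(x0)
xt = q_sample(x0, t, noise)           # the denoiser is trained to predict `noise` from (xt, t)
```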
ColdHeat/pystarlark
Experimental Python bindings for starlark-go