lihuahua123's Stars
OpenWebGAL/WebGAL
A brand new web Visual Novel engine
microsoft/autogen
A programming framework for agentic AI 🤖
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
BBuf/how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
opendilab/LLMRiddles
Open-Source Reproduction/Demo of the LLM Riddles Game
Tencent/PatrickStar
PatrickStar enables Larger, Faster, Greener Pretrained Models for NLP and democratizes AI for everyone.
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
DefTruth/Awesome-LLM-Inference
📖A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
luoxi-model/luoxi_models
see readme
torchpipe/torchpipe
Serving Inside Pytorch
lihuahua123/Rayflow
Simple machine learning using Ray
nndeploy/nndeploy
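nndeploy is an end-to-end model deployment framework. Built on multi-backend inference and model deployment based on directed acyclic graphs, it aims to give users a cross-platform, easy-to-use, high-performance model deployment experience.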
zjhellofss/KuiperInfer
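A great project for campus recruiting (autumn and spring hiring rounds) and internships! Implement a high-performance deep learning inference library from scratch, step by step, with support for inference on models such as llama2, Unet, Yolov5, and Resnet.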
crossplane/crossplane
The Cloud Native Control Plane
suquark/ExoFlow
A universal workflow system for exactly-once DAGs
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
reconfigurable-ml-pipeline/ipa
Source code of IPA, https://escholarship.org/uc/item/2p0805dq
facebookresearch/distributed_traces
Distributed tracing data from Meta's microservices architecture.
modelbox-ai/modelbox
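A high-performance, highly extensible, easy-to-use framework for AI applications. It gives AI application developers a unified, high-performance, easy-to-use programming framework for quickly building device-edge-cloud AI industry applications on top of full-stack AI services, with GPU and NPU acceleration.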
sunface/rust-course
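"Voted the world's most loved language for eight consecutive years; no GC and no manual memory management, extremely high performance and safety, procedural/OO/functional programming, excellent package management, a future cornerstone of JS." Give Rust a try as your second language outside of work. This book offers comprehensive, in-depth explanations, vivid and apt examples, and content as silky smooth as Dove chocolate; it may be the most carefully crafted Chinese-language Rust tutorial / book available.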
bytewax/bytewax
Python Stream Processing
coderonion/awesome-llm-and-aigc
🚀🚀🚀A collection of some awesome public projects about Large Language Model, Vision Foundation Model and AI Generated Content.
ztxz16/fastllm
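A pure C++, cross-platform LLM acceleration library that can be called from Python. ChatGLM-6B-class models can reach 10000+ tokens/s on a single card. Supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile phones.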
zahranajaf/PROS
SymbioticLab/Kayak
Proactive-adaptive arbitration between shipping compute and shipping data
bug-developer021/YOLOV5_optimization_on_triton
Compare multiple optimization methods on Triton to improve model service performance
fkh12345/ICE
yxtj/VideoServing
alpa-projects/mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
PaddlePaddle/Serving
A flexible, high-performance carrier for machine learning models (PaddlePaddle serving deployment framework)