iMason007's Stars
BradyFU/Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
sangyc10/CUDA-code
openmlsys/openmlsys-cuda
Tutorials for writing high-performance GPU operators in AI frameworks.
HqWu-HITCS/Awesome-Chinese-LLM
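A collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, covering base models, vertical-domain fine-tuning and applications, datasets, and tutorials.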
HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese
This is a Chinese translation of the CUDA programming guide
chenzomi12/AISystem
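AISystem mainly covers AI systems, including full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.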
ZhangGe6/onnx-modifier
A tool to modify ONNX models in a visualization fashion, based on Netron and Flask.
owenliang/qwen-vllm
Tongyi Qianwen (Qwen) vLLM inference deployment demo
modelscope/modelscope
ModelScope: bring the notion of Model-as-a-Service to life.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
argoproj/argo-workflows
Workflow Engine for Kubernetes
GStreamer/gstreamer
GStreamer open-source multimedia framework
zilliztech/attu
The GUI for Milvus
zaphoyd/websocketpp
C++ websocket client/server library
hiyouga/LLaMA-Factory
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Hannibal046/Awesome-LLM
Awesome-LLM: a curated list of Large Language Models
coder/code-server
VS Code in the browser
ossrs/srs
SRS is a simple, high-efficiency, real-time video server supporting RTMP, WebRTC, HLS, HTTP-FLV, SRT, MPEG-DASH, and GB28181.
bluenviron/mediamtx
Ready-to-use SRT / WebRTC / RTSP / RTMP / LL-HLS media server and media proxy that lets you read, publish, proxy, record, and play back video and audio streams.
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
moon-hotel/TransformerTranslation
A Transformer Framework Based Translation Task
ggerganov/llama.cpp
LLM inference in C/C++
tloen/alpaca-lora
Instruct-tune LLaMA on consumer hardware
meta-llama/llama
Inference code for Llama models