iMason007's Stars

BradyFU/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles: Latest Advances on Multimodal Large Language Models
Stars: 10.8k, Forks: 719
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Language: Python, Stars: 3.5k, Forks: 316
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language: Python, Stars: 20.9k, Forks: 2k
ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Language: Python, Stars: 18k, Forks: 1.8k
sangyc10/CUDA-code
Language: Cuda, Stars: 550, Forks: 69
openmlsys/openmlsys-cuda
Tutorials for writing high-performance GPU operators in AI frameworks.
Language: Cuda, Stars: 113, Forks: 15
HqWu-HITCS/Awesome-Chinese-LLM
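A collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.
Stars: 13.7k, Forks: 1.3k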
HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese
This is a Chinese translation of the CUDA programming guide
Stars: 1.1k, Forks: 164
chenzomi12/AISystem
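AISystem covers AI systems: the full stack of low-level AI technologies, including AI chips, AI compilers, and AI inference and training frameworks
Language: Jupyter Notebook, Stars: 9.7k, Forks: 1.4k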
ZhangGe6/onnx-modifier
A tool to modify ONNX models in a visualization fashion, based on Netron and Flask.
Language: JavaScript, Stars: 1.2k, Forks: 154
owenliang/qwen-vllm
Tongyi Qianwen (Qwen) vLLM inference and deployment demo
Language: Python, Stars: 334, Forks: 42
modelscope/modelscope
ModelScope: bring the notion of Model-as-a-Service to life.
Language: Python, Stars: 6.6k, Forks: 679
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language: Python, Stars: 12.6k, Forks: 1.1k
argoproj/argo-workflows
Workflow Engine for Kubernetes
Language: Go, Stars: 14.7k, Forks: 3.1k
GStreamer/gstreamer
GStreamer open-source multimedia framework
Language: C, Stars: 2.2k, Forks: 563
zilliztech/attu
The GUI for Milvus
Language: TypeScript, Stars: 1.1k, Forks: 109
zaphoyd/websocketpp
C++ websocket client/server library
Language: C++, Stars: 6.9k, Forks: 1.9k
hiyouga/LLaMA-Factory
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Language: Python, Stars: 27.7k, Forks: 3.4k
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Language: Python, Stars: 12.8k, Forks: 1k
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language: C++, Stars: 7.7k, Forks: 834
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python, Stars: 23.6k, Forks: 3.4k
Hannibal046/Awesome-LLM
Awesome-LLM: a curated list of Large Language Model
Stars: 16.3k, Forks: 1.3k
coder/code-server
VS Code in the browser
Language: TypeScript, Stars: 66.7k, Forks: 5.5k
ossrs/srs
SRS is a simple, high-efficiency, real-time video server supporting RTMP, WebRTC, HLS, HTTP-FLV, SRT, MPEG-DASH, and GB28181.
Language: C++, Stars: 24.9k, Forks: 5.3k
bluenviron/mediamtx
Ready-to-use SRT / WebRTC / RTSP / RTMP / LL-HLS media server and media proxy that allows to read, publish, proxy, record and playback video and audio streams.
Language: Go, Stars: 11k, Forks: 1.4k
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
Language: Python, Stars: 38.4k, Forks: 4.3k
moon-hotel/TransformerTranslation
A Transformer Framework Based Translation Task
Language: Python, Stars: 122, Forks: 37
ggerganov/llama.cpp
LLM inference in C/C++
Language: C++, Stars: 62.5k, Forks: 9k
tloen/alpaca-lora
Instruct-tune LLaMA on consumer hardware
Language: Jupyter Notebook, Stars: 18.4k, Forks: 2.2k
meta-llama/llama
Inference code for Llama models
Language: Python, Stars: 54.6k, Forks: 9.4k