mental2008's Stars
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
tqdm/tqdm
:zap: A Fast, Extensible Progress Bar for Python and CLI
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
huggingface/candle
Minimalist ML framework for Rust
git-lfs/git-lfs
Git extension for versioning large files
liguodongiot/llm-action
This project shares technical principles and hands-on experience related to large language models (LLM engineering and production deployment of LLM applications)
bigscience-workshop/petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
huggingface/text-generation-inference
Large Language Model Text Generation Inference
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
NVIDIA/FasterTransformer
Transformer-related optimizations, including BERT and GPT
facebookresearch/fairscale
PyTorch extensions for high performance and large scale training.
leptonai/leptonai
A Pythonic framework to simplify AI service building
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
predibase/lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
jiqizhixin/Artificial-Intelligence-Terminology-Database
A comprehensive mapping database of English to Chinese technical vocabulary in the artificial intelligence domain
microsoft/DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
S-LoRA/S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
alibaba/havenask
ray-project/ray-llm
RayLLM - LLMs on Ray
Azure/AzurePublicDataset
Microsoft Azure Traces
sail-sg/lorahub
[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
alibaba/rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Troyciv/anki-templates-superlist
A collection of Anki card styles
OpenGVLab/Multi-Modality-Arena
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!
microsoft/mscclpp
MSCCL++: A GPU-driven communication stack for scalable AI applications
nothingislost/obsidian-workspaces-plus
Quickly switch and manage Obsidian workspaces
eth-easl/orion
An interference-aware scheduler for fine-grained GPU sharing
Hsword/SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
ModelTC/awesome-lm-system
A summary of system papers, frameworks, code, and tools for training or serving large models