Tengxu-Sun's Stars
microsoft/BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
NUS-HPC-AI-Lab/VideoSys
VideoSys: An easy and efficient system for video generation
ggerganov/whisper.cpp
Port of OpenAI's Whisper model in C/C++
ggerganov/ggml
Tensor library for machine learning
dingyuqing05/trt2022_wenet
efeslab/Atom
[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
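As context for what "low-bit quantization" means here, a minimal pure-Python sketch of symmetric 4-bit round-to-nearest quantization — the generic technique family, not Atom's actual algorithm (which adds mixed-precision outlier handling and fused serving kernels):

```python
# Generic symmetric 4-bit quantization sketch (NOT Atom's algorithm):
# map floats to integers in [-7, 7] with one per-tensor scale.

def quantize_int4(values):
    """Return (quantized ints in [-7, 7], scale)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7 if max_abs else 1.0
    q = [max(-7, min(7, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.98, -0.07]
q, s = quantize_int4(weights)
restored = dequantize(q, s)
# round-to-nearest keeps the error within half a quantization step
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, s, max_err)
```

Serving frameworks trade this per-element error for 4x smaller weights and faster memory-bound GEMMs.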
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
NVIDIA-developer-blog/code-samples
Source code examples from the Parallel Forall Blog
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
zenny-chen/GPU-architectures-docs-and-demos
Demos from major GPU vendors and platform providers on 3D graphics rendering
GrowingGit/GitHub-English-Top-Charts
Helps you discover excellent English-language projects, without the noise of listings in other languages.
GrowingGit/GitHub-Chinese-Top-Charts
:cn: GitHub Chinese Top Charts, with separate "Software | Resources" charts per language, to pinpoint good Chinese-language projects. Take what you need and learn efficiently.
BradyFU/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles: Latest Advances on Multimodal Large Language Models
feifeibear/long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
xdit-project/xDiT
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) on multi-GPU Clusters
zhuohan123/terapipe
koalaman/shellcheck
ShellCheck, a static analysis tool for shell scripts
huggingface/accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
THUDM/SwissArmyTransformer
SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.
nicolaswilde/cuda-tensorcore-hgemm
KnowingNothing/MatmulTutorial
An easy-to-understand TensorOp Matmul tutorial
wzsh/wmma_tensorcore_sample
Matrix multiply-accumulate with CUDA and WMMA (Tensor Core)
Bruce-Lee-LY/cuda_hgemm
Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores, via the WMMA API and MMA PTX instructions.
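The tile-level contract that WMMA exposes (and that this repo and wzsh/wmma_tensorcore_sample above optimize around) is D = A·B + C on small fixed-size tiles, e.g. 16x16x16, with half-precision inputs and single-precision accumulation. A pure-Python sketch of that arithmetic contract only — not actual GPU code:

```python
# Pure-Python model of what one wmma::mma_sync computes: D = A @ B + C
# on a 16x16x16 tile, fp16 inputs, fp32 accumulation. Real WMMA code
# runs per-warp on Tensor Cores; this models the math, not the hardware.
M = N = K = 16

def mma_tile(A, B, C):
    """A: M x K, B: K x N, C: M x N; returns D = A @ B + C."""
    D = [[C[i][j] for j in range(N)] for i in range(M)]
    for i in range(M):
        for j in range(N):
            acc = D[i][j]                 # fp32 accumulator
            for k in range(K):
                acc += A[i][k] * B[k][j]  # inputs would be fp16 on GPU
            D[i][j] = acc
    return D

# Sanity check: with A = identity, D should equal B + C elementwise.
I_ = [[1.0 if i == j else 0.0 for j in range(K)] for i in range(M)]
B = [[float(i + j) for j in range(N)] for i in range(K)]
C = [[1.0] * N for _ in range(M)]
D = mma_tile(I_, B, C)
print(D[3][5])  # B[3][5] + C[3][5] = 8.0 + 1.0 = 9.0
```

The optimization work in these repos is about feeding such tiles efficiently (shared-memory staging, swizzling, pipelining), not changing this arithmetic.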
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
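One of the methods PEFT implements is LoRA: keep the base weight W frozen, train a low-rank pair (A, B), and merge as W' = W + (alpha / r) · B·A. A plain-Python sketch of that merge step, not PEFT's actual API:

```python
# Generic LoRA merge sketch (the idea, not PEFT's API):
# W' = W + (alpha / r) * B @ A, with A: r x d_in, B: d_out x r.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_merge(W, B, A, alpha, r):
    BA = matmul(B, A)
    return [[W[i][j] + (alpha / r) * BA[i][j]
             for j in range(len(W[0]))] for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight
A = [[1.0, 2.0]]               # r=1, d_in=2 (trainable)
B = [[0.5], [0.0]]             # d_out=2, r=1 (trainable)
merged = lora_merge(W, B, A, alpha=2, r=1)
print(merged)  # [[2.0, 2.0], [0.0, 1.0]]
```

The parameter saving comes from training only r·(d_in + d_out) values per layer instead of d_in·d_out.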
chuanyangjin/fast-DiT
Fast Diffusion Models with Transformers
IST-DASLab/QUIK
Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference
NVIDIA/cccl
CUDA Core Compute Libraries