Stonesjtu's Stars
ggerganov/llama.cpp
LLM inference in C/C++
xai-org/grok-1
Grok open release
RVC-Boss/GPT-SoVITS
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
2noise/ChatTTS
A generative speech model for daily dialogue.
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
meta-llama/llama3
The official Meta Llama 3 GitHub site
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
KindXiaoming/pykan
Kolmogorov Arnold Networks
state-spaces/mamba
Mamba SSM architecture
InstantID/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
vosen/ZLUDA
CUDA on non-NVIDIA GPUs
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
google/gemma.cpp
Lightweight, standalone C++ inference engine for Google's Gemma models.
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
pytorch/torchtitan
A native PyTorch Library for large model training
mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Tele-AI/Telechat
HazyResearch/ThunderKittens
Tile primitives for speedy kernels
X-LANCE/AniTalker
[ACM MM 2024] This is the official code for "AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding"
NVIDIA/cccl
CUDA Core Compute Libraries
tspeterkim/flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
Jokeren/Awesome-GPU
Awesome resources for GPUs
HazyResearch/aisys-building-blocks
Building blocks for foundation models.
efeslab/Atom
[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
proger/accelerated-scan
Accelerated First Order Parallel Associative Scan
NVlabs/cub
THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.
UofT-EcoSystem/Minuet
[EuroSys'24] Minuet: Accelerating 3D Sparse Convolutions on GPUs
X-LANCE/PaperReading
A curated collection of classic papers across research directions
Stonesjtu/Awesome-GPU
Awesome resources for GPUs