wxd000000

wxd000000's Stars

nomic-ai/gpt4all
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
Language: C++ 69.7k 639 1.9k 7.6k
mlabonne/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Language: Jupyter Notebook 37.8k 396 674k
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language: Python 34.3k 287 1.1k 4.2k
RVC-Boss/GPT-SoVITS
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
Language: Python 33.5k 204 1.2k 3.8k
hiyouga/LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
Language: Python 31.9k 204 4.9k 3.9k
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language: Python 13.6k 115 1k 1.2k
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Language: Python 12k 170 233816
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language: C++ 8.3k 89 1.8k 936
jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Language: Python 7.7k 108 156453
Oneflow-Inc/oneflow
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
Language: C++ 5.9k 144 968667
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
Language: C++ 5.8k 62 625889
yangjianxin1/Firefly
Firefly: a training tool for large models, supporting training of Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
Language: Python 5.7k 56 279518
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Language: Python 5.4k 55 541396
Jeevan-kumar-Raj/Grokking-System-Design
Systems design is the process of defining the architecture, modules, interfaces, and data for a system to satisfy specified requirements. Systems design could be seen as the application of systems theory to product development.
Language: Shell 5.1k 64 31.4k
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
Language: Python 3.1k 35 71159
DLLXW/baby-llama2-chinese
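A repository for pretraining from scratch and then SFT-ing a small-parameter Chinese LLaMa2; a single 24G GPU is enough to run it and obtain a chat-llama2 with basic Chinese question-answering ability.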
Language: Python 2.5k 17 75305
huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Language: Python 2k 44 125138
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
Language: Python 1.9k 35 325309
microsoft/DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Language: Python 1.9k 41 301174
S-LoRA/S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Language: Python 1.7k 24 3991
ymcui/Chinese-LLaMA-Alpaca-3
Phase three of the Chinese LLaMA-Alpaca large language model project (Chinese Llama-3 LLMs), developed from Meta Llama 3
Language: Python 1.6k 19 77142
DefTruth/CUDA-Learn-Notes
🎉 Modern CUDA Learn Notes with PyTorch: fp32, fp16, bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.
Language: Cuda 1.2k 14 6133
flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
Language: Cuda 1.2k 16 106115
mlfoundations/dclm
DataComp for Language Models
Language: HTML 1.1k 38 54100
HuangOwen/Awesome-LLM-Compression
Awesome LLM compression research papers and tools.
1.1k 41 366
kvcache-ai/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
1.1k 12 422
AIoT-MLSys-Lab/Efficient-LLMs-Survey
[TMLR 2024] Efficient Large Language Models: A Survey
975 24 1183
SafeAILab/EAGLE
Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
Language: Python 784 11 12380
Jack47/hack-SysML
The road to hack SysML and become a system expert
Language: Emacs Lisp 426 32 249
hemingkx/SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
378 18 315