maodoudou168

maodoudou168's Stars

AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
Language:Python145k 1.1k 7.7k27.2k
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
Language:Python39k 384 1.7k4.3k
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language:Python32.7k 271 5.8k5k
mli/paper-reading
Paragraph-by-paragraph close reading of classic and new deep learning papers
27.6k 734 02.5k
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language:Python14.8k 124 1.2k1.4k
liguodongiot/llm-action
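This project shares the technical principles behind large models and hands-on experience with them (large-model engineering and deploying large-model applications)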
Language:HTML12.2k 101 231.3k
TheLastBen/fast-stable-diffusion
fast-stable-diffusion + DreamBooth
Language:Python7.6k 87 2.1k1.3k
bitsandbytes-foundation/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
Language:Python6.5k 52 1.1k644
microsoft/DeepSpeedExamples
Example models using DeepSpeed
Language:Python6.2k 75 5431.1k
AutoGPTQ/AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Language:Python4.6k 31 471490
qwopqwop200/GPTQ-for-LLaMa
4 bits quantization of LLaMA using GPTQ
Language:Python3k 42 218460
ai-forever/Kandinsky-2
Kandinsky 2 — multilingual text2image latent diffusion model
Language:Jupyter Notebook2.8k 49 88309
mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Language:Python2.6k 24 193216
intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Language:Python2.3k 33 209258
siliconflow/onediff
OneDiff: An out-of-the-box acceleration library for diffusion models.
Language:Jupyter Notebook1.8k 39 463109
horseee/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
Language:Python1.4k 41 499
mit-han-lab/smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Language:Python1.3k 21 91151
HuangOwen/Awesome-LLM-Compression
Awesome LLM compression research papers and tools.
1.3k 42 683
RahulSChand/gpu_poor
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
Language:JavaScript1.2k 7 1660
dafish-ai/NTU-Machine-learning
Machine learning course by Professor Hung-yi Lee (李宏毅) of ** University
Language:Jupyter Notebook989 30 0384
lcdevelop/MachineLearningCourse
A concise introductory machine learning tutorial
838 95 1261
horseee/DeepCache
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
Language:Python825 15 5239
OpenGVLab/OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
Language:Python743 16 8556
pytorch/PiPPy
Pipeline Parallelism for PyTorch
Language:Python730 37 26386
SqueezeAILab/SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Language:Python664 18 2843
moon-hotel/BertWithPretrained
An implementation of the BERT model and its related downstream tasks based on the PyTorch framework
Language:Python571 5 21109
3DAgentWorld/Toolkit-for-Prompt-Compression
Toolkit for Prompt Compression
Language:Python244 2 38
nbasyl/LLM-FP4
The official implementation of the EMNLP 2023 paper LLM-FP4
Language:Python169 5 1012
Lisennlp/distributed_train_pytorch
Distributed training with PyTorch, supporting multi-node multi-GPU and single-node multi-GPU setups.
Language:Python40 1 310
zbwxp/Dynamic-Token-Pruning
Official Pytorch implementation of Dynamic-Token-Pruning (ICCV2023)
Language:Python19 2 33