jiajunsun68's Stars
binary-husky/gpt_academic
Provides a practical interactive interface for GPT/GLM and other large language models, with special attention to paper reading/polishing/writing workflows. Modular design with support for custom shortcut buttons & function plugins; project analysis & self-translation for Python, C++, and other codebases; PDF/LaTeX paper translation & summarization; parallel queries to multiple LLMs; local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and more.
microsoft/BitNet
Official inference framework for 1-bit LLMs
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
microsoft/LLMLingua
[EMNLP'23, ACL'24] Speeds up LLM inference and enhances LLMs' perception of key information by compressing the prompt and KV-cache, achieving up to 20x compression with minimal performance loss.
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
IST-DASLab/gptq
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
HuangOwen/Awesome-LLM-Compression
Awesome LLM compression research papers and tools.
Vahe1994/AQLM
Official PyTorch repository for "Extreme Compression of Large Language Models via Additive Quantization" (https://arxiv.org/abs/2401.06118) and "PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression" (https://arxiv.org/abs/2405.14852)
luanshiyinyang/FacialExpressionRecognition
Source code for a facial expression recognition project (a subtask of face recognition)
locuslab/wanda
A simple and effective LLM pruning approach.
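For context on what "simple and effective" means here, a minimal NumPy sketch of Wanda's pruning metric, |weight| times the L2 norm of the corresponding input activations, with scores compared within each output row; the function name and tensor shapes are illustrative, not taken from the repository:

```python
import numpy as np

def wanda_prune_mask(weight, activations, sparsity=0.5):
    """Sketch of the Wanda score: |W_ij| * ||X_j||_2.

    weight:      (out_features, in_features) layer weight
    activations: (n_samples, in_features) calibration inputs
    Returns a boolean mask (True = keep), ranking scores per output row.
    """
    # Per-input-channel L2 norm over the calibration set
    act_norm = np.linalg.norm(activations, axis=0)   # (in_features,)
    score = np.abs(weight) * act_norm                # broadcast over rows

    # Keep the top (1 - sparsity) fraction of weights in each row
    n_keep = int(weight.shape[1] * (1 - sparsity))
    keep_idx = np.argsort(-score, axis=1)[:, :n_keep]
    mask = np.zeros_like(weight, dtype=bool)
    np.put_along_axis(mask, keep_idx, True, axis=1)
    return mask
```

No retraining or weight update is involved; the mask is applied directly (`weight * mask`), which is what makes the method attractive at LLM scale.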
microsoft/TransformerCompression
For releasing code related to compression methods for transformers, accompanying our publications
NVlabs/Taylor_pruning
Pruning Neural Networks with Taylor criterion in Pytorch
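A rough sketch of the idea, under the assumption that filter importance is estimated from the squared first-order Taylor term (weight times gradient) aggregated per filter; the helper below is hypothetical and written in NumPy rather than PyTorch for self-containment:

```python
import numpy as np

def taylor_filter_importance(weight, grad):
    """Approximate loss change from removing each output filter.

    First-order Taylor estimate: sum over a filter's weights of
    (w * dL/dw)^2. Filters with the smallest scores are pruned first.
    weight, grad: arrays of shape (n_filters, ...) from one layer.
    """
    n_filters = weight.shape[0]
    return ((weight * grad) ** 2).reshape(n_filters, -1).sum(axis=1)
```

The appeal of the criterion is that it needs only quantities already available during backpropagation, so importance can be accumulated during normal training steps.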
Aaronhuang-778/BiLLM
(ICML 2024) BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
hahnyuan/PB-LLM
PB-LLM: Partially Binarized Large Language Models
AIoT-MLSys-Lab/SVD-LLM
Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"
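For reference, a hedged sketch of the plain truncated-SVD baseline that low-rank compression methods like SVD-LLM start from (the paper's contribution, truncation-aware data whitening, is not reproduced here); names and shapes are illustrative:

```python
import numpy as np

def svd_low_rank(W, rank):
    """Factor W (out x in) into A (out x rank) @ B (rank x in)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # fold singular values into the left factor
    B = Vt[:rank, :]
    return A, B

W = np.random.randn(64, 128)
A, B = svd_low_rank(W, rank=16)
# Parameters drop from 64*128 to 16*(64+128); the layer becomes two matmuls.
```

Compression pays off whenever rank < (out * in) / (out + in), at the cost of the truncated singular values' reconstruction error.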
VILA-Lab/GBLM-Pruner
Is gradient information useful for pruning LLMs?
cjyaras/deep-lora-transformers
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation (ICML'24 Oral)
Dereck0602/Bolaco
saintslab/bmrs-structured-pruning
Code release for the paper "BMRS: Bayesian Model Reduction for Structured Pruning"