horseee/LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
Python · Apache-2.0
Issues
Llama3 reports shape error after pruning
#69 opened by WentaoTan - 0
evaluate PPL with the post-training model
#79 opened by VincentZ-2020 - 3
No such file or directory: pytorch_model.bin
#74 opened by yaolu-zjut - 0
About `consecutive_groups`
#78 opened by VincentZ-2020 - 4
Evaluation:UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
#58 opened by manlenzzz - 0
Taylor pruner under-utilizing resources
#76 opened by marianbasti - 0
Does it support qwen2?
#71 opened by yangxue-1 - 5
Custom Model pruning
#72 opened by saidineshpola - 2
ConnectionError: Couldn't reach https://raw.githubusercontent.com/wojzaremba/lstm/master/data/ptb.train.txt (ReadTimeout(ReadTimeoutError("HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Read timed out. (read timeout=100)")))
#48 opened by qxpBlog - 0
Loading pruned model for causal llm
#68 opened by sriyachakravarthy - 7
Adaptation of GQA
#64 opened by junzhang-zj - 0
Can ordinary transformer models be pruned?
#62 opened by SKY072410 - 0
Can ChatGLM3 pruning be supported?
#61 opened by Franklin-L - 0
Difference in Perplexity Values
#60 opened by nikhil-ghosh-berkeley - 0
Pruning llama3
#57 opened by yinwangsong - 0
Is this method implementable on multi-GPUs?
#54 opened by LeonCheng0129 - 0
How to prune the embedding and lm_head?
#55 opened by L-hongbin - 0
Unable to reproduce the results for param_first and param_second in the paper after finetuning.
#52 opened by danyal97 - 0
a post-training issue
#35 opened by cmnfriend - 0
The quantization of the compressed models
#49 opened by lihuang258 - 0
Cannot use Hugging Face to load
#46 opened by coderchem - 1
401 Client Error: Unauthorized for url: https://huggingface.co/decapoda-research/llama-7b-hf/resolve/main/tokenizer_config.json
#43 opened by azuryl - 1
Latency code
#33 opened by tuidan - 3
Supporting device_map = 'auto' similar to the one in .from_pretrained method from Huggingface
#36 opened by Ahmed-Roushdy - 4
Question related to the model tuning
#39 opened by shawnricecake - 0
After pruning some layers, the model cannot be loaded directly via TGI
#41 opened by coderchem - 0
Pruning MQA?
#40 opened by jianyuheng - 2
Why does `num_examples` default to 10?
#38 opened by coderchem - 6
Reproducing paper results
#34 opened by grigorn - 1
Can not import LlamaConfig
#32 opened by Ahmed-Roushdy - 0
Examples on the Huggingface Hub
#31 opened by vgoklani - 0
When will you support ChatGLM?
#30 opened by AboveParadise - 1
Force even pruning across layers
#29 opened by thedarkzeno - 2
Calculating Importance of 'param_mix'
#28 opened by kiucho - 1
When would the code for GPT-J-6B be released?
#27 opened by mumuyeye