Issues
Error when trying to run Llama2-7b: attention_mask and position_ids are None
#60 opened by Philippe-Guyard - 0
Breaks under Numpy 2
#58 opened by time-less-ness - 0
Llama2 `ExpectedMoreSplits` Exception
#57 opened by time-less-ness - 0
How to save the pruned model?
#56 opened by botox-100 - 0
Which dataset do you use for jailbreak?
#55 opened by lucywang720 - 2
Why choose Vicuna as the tokenizer?
#53 opened by hainingfang - 3
Ambiguous result for LLAMA-2-13b
#54 opened by Qian2333 - 5
Cannot reproduce Llama2 results
#52 opened by taratt - 2
Compressing a Finetuned llama2 model with lora
#50 opened by bkhanal-11 - 3
llama_7b wikitext perplexity 7.0915350914
#45 opened by xiaopengaia - 3
Perplexity is off for Llama 2-7b
#47 opened by taratt - 1
Where is cache['attention_mask']?
#48 opened by JohnneyQin - 1
Support for LLaMA-2
#23 opened by junzhang-zj - 0
Change sparsity rates
#42 opened by Lexlum - 0
Need clarification regarding prune_deit() in https://github.com/locuslab/wanda/blob/main/image_classifiers/main.py
#41 opened by solomonmanuelraj - 0
Wanda Pruning for Zero-Shot Object Detection Model - google/owlv2-base-patch16-ensemble
#40 opened by solomonmanuelraj - 1
Pruned model loads slowly
#28 opened by kfchenhn - 1
Some questions about the code
#38 opened by liuxiaozhu01 - 0
Can this be used for pruning Whisper?
#37 opened by caroljoyv - 2
Running 70B fails: RuntimeError: shape '[1, 4096, 64, 128]' is invalid for input of size 4194304
#33 opened by JiaQuan1203 - 1
Cannot load the c4 dataset
#36 opened by simlaharma - 1
Issue with Mixed Device Tensors (cuda:0 and cuda:1)
#35 opened by pprp - 5
Can Wanda be applied to ConvNets for CV tasks?
#32 opened by frankinwi - 2
HellaSwag numbers?
#31 opened by forresti - 1
Fine-tune the pruned model
#24 opened by guozhiyu - 2
Pruned model is same size as original
#29 opened by virentakia - 2
Questions about sub-networks of LLMs
#27 opened by JiwenJ - 4
Publish the Llama2 sparsified models
#30 opened by egeor - 4
Question about the latency speedup!
#26 opened by ybai62868 - 1
error in loading datasets
#25 opened by Ahmed-Roushdy - 3
Can Wanda speed up LLM inference?
#14 opened by ifromeast - 1
What are the dependencies of this project? Which version of transformers?
#17 opened by guanchuwang - 3
Is it possible to prune GPTQ models?
#11 opened by GrailFinder - 3
LoRA fine-tuning hyper-parameters
#8 opened by ryusaeba - 1
calibration data seq_length
#22 opened by kiucho - 2
Some questions about the codes.
#21 opened by kiseliu - 2
Cannot reproduce the results of Figure 3
#19 opened by HaihangWu - 2
Structured 2:4 sparsity pattern support on GPU
#20 opened by kiucho - 1
Does this work for other models, such as GPT-2?
#18 opened by anhdang000 - 3
Some questions.
#12 opened by nikitabalakin - 1
Falcon 7b / 40b
#13 opened by BaiqingL - 1
How can a pruned model with sparse matrices reduce model size and computation cost?
#9 opened by JiachuanDENG