Questions about implementation details
Closed this issue · 3 comments
Hello, I have some questions about implementation details.
The HR-LR data pairs were generated with the down-sampling code provided in BasicSR. The training data was DF2K (900 DIV2K + 2650 Flickr2K images), and the test data was Set5.
I ran this command to prune the EDSR_16_256 model down to EDSR_16_48. Compared with the official command, only the pruning ratio and the save path were modified.
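For context, BasicSR's official script uses MATLAB-style bicubic down-sampling to build the LR images; a minimal Python sketch of the same HR→LR pairing (using Pillow's `BICUBIC`, which is a close but not bit-exact substitute) looks like this:

```python
from PIL import Image

def make_lr(hr: Image.Image, scale: int = 2) -> Image.Image:
    """Bicubic down-sample an HR image to create its LR counterpart.
    Note: BasicSR uses MATLAB-style bicubic; PIL's BICUBIC differs slightly."""
    w, h = hr.size
    # HR images are usually cropped to a multiple of the scale first
    assert w % scale == 0 and h % scale == 0, "crop HR to a multiple of scale first"
    return hr.resize((w // scale, h // scale), Image.BICUBIC)

hr = Image.new("RGB", (96, 96))   # e.g. one HR training patch (--patch_size 96)
lr = make_lr(hr, scale=2)
print(lr.size)                    # (48, 48)
```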
Prune from 256 to 48, pr=0.8125, x2, ASSL
```shell
python main.py --model LEDSR --scale 2 --patch_size 96 --ext sep \
    --dir_data /home/notebook/data/group_cpfs/wurongyuan/data/data \
    --data_train DF2K --data_test DF2K --data_range 1-3550/3551-3555 \
    --chop --save_results --n_resblocks 16 --n_feats 256 \
    --method ASSL --wn --stage_pr [0-1000:0.8125] --skip_layers *mean*,*tail* \
    --same_pruned_wg_layers model.head.0,model.body.16,*body.2 \
    --reg_upper_limit 0.5 --reg_granularity_prune 0.0001 \
    --update_reg_interval 20 --stabilize_reg_interval 43150 \
    --pre_train pretrained_models/LEDSR_F256R16BIX2_DF2K_M311.pt \
    --same_pruned_wg_criterion reg \
    --save main/SR/LEDSR_F256R16BIX2_DF2K_ASSL_0.8125_RGP0.0001_RUL0.5_Pretrain_06011101
```
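As a quick sanity check on the `--stage_pr` value above, the pruning ratio follows directly from the channel counts (keeping 48 of 256 feature channels):

```python
# Pruning ratio needed to shrink n_feats from 256 to 48
n_feats, target = 256, 48
pr = 1 - target / n_feats
print(pr)  # 0.8125, matching --stage_pr [0-1000:0.8125]
```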
Results:
- model just after pruning → 33.739 dB
- fine-tuning after 1 epoch → 37.781 dB
- fine-tuning after 756 epochs → 37.940 dB
The result I obtained with the official code (37.940 dB) still has a certain gap from the result in the paper (38.12 dB). I must have overlooked some details.
I also compared with the L1-norm method provided in the code.
Prune from 256 to 48, pr=0.8125, x2, L1
```shell
python main.py --model LEDSR --scale 2 --patch_size 96 --ext sep \
    --dir_data /home/notebook/data/group_cpfs/wurongyuan/data/data \
    --data_train DF2K --data_test DF2K --data_range 1-3550/3551-3555 \
    --chop --save_results --n_resblocks 16 --n_feats 256 \
    --method L1 --wn --stage_pr [0-1000:0.8125] --skip_layers *mean*,*tail* \
    --same_pruned_wg_layers model.head.0,model.body.16,*body.2 \
    --reg_upper_limit 0.5 --reg_granularity_prune 0.0001 \
    --update_reg_interval 20 --stabilize_reg_interval 43150 \
    --pre_train pretrained_models/LEDSR_F256R16BIX2_DF2K_M311.pt \
    --same_pruned_wg_criterion reg \
    --save main/SR/LEDSR_F256R16BIX2_DF2K_L1_0.8125_06011101
```
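For reference, the `--method L1` baseline selects which output channels survive by ranking filters on their L1 norm. A minimal NumPy sketch of that criterion (an illustration only, not the repo's actual implementation):

```python
import numpy as np

def l1_keep_indices(weight: np.ndarray, n_keep: int) -> np.ndarray:
    """Rank output filters of a conv weight (out, in, kh, kw) by L1 norm
    and return the indices of the n_keep strongest filters, sorted."""
    scores = np.abs(weight).reshape(weight.shape[0], -1).sum(axis=1)
    return np.sort(np.argsort(scores)[-n_keep:])

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256, 3, 3))  # one EDSR body conv, n_feats=256
kept = l1_keep_indices(w, n_keep=48)       # 48 channels survive at pr=0.8125
print(len(kept))                           # 48
```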
Results:
- model just after pruning → 13.427 dB
- fine-tuning after 1 epoch → 33.202 dB
- fine-tuning after 756 epochs → 37.933 dB
The difference between the results of the L1-norm method and those of ASSL seems negligible at this pruning ratio (256→48).
Is there something I missed? Looking forward to your reply! >-<
I got some guidance from the author on the details, and I will go on with the experiment. Thanks very much!
Hello, I ran into the same issue: the results of ASSL and L1 pruning are quite similar. I wonder how you solved this and whether the ASSL result eventually distinguished itself. Thanks in advance!
@wurongyuan @MingSun-Tse Hello, I also have the same question. I also find that the results of ASSL and L1 pruning are similar to an EDSR_16_49 model trained from scratch; is there something wrong in the initialization or the parameter copying?