Ariande1/MS-ResNet

Sorry, I can't use the structure shown in the paper to get the same result on the CIFAR-10 dataset

Closed this issue · 7 comments


Could you describe your issue in more detail? Is there a specific structure that hasn’t achieved the accuracy reported in the paper?

Sure. I chose the structures shown in Table II; as I understand it, they are lightweight variants of the standard ResNet. Since they aren't included in the GitHub repository, I reproduced them by reducing the number of channels in your standard ResNet model. I would also like to confirm one thing about the CIFAR-10 experiments: according to Table VIII, you only use RandomCrop and normalization for the Table II models, but my results differ from yours by approximately 5 percentage points. Thank you for answering my questions.
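To be concrete, below is a minimal (non-spiking) sketch of what I mean by reducing the channels: a CIFAR-style ResNet-20 skeleton whose stage widths are narrowed to (16, 32, 64). The widths, class names, and plain ReLU blocks are my own placeholders, not the exact Table II configuration or the repo's spiking blocks.

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """Plain residual block; stands in for the repo's spiking block."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.downsample = None
        if stride != 1 or in_ch != out_ch:
            self.downsample = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        identity = x if self.downsample is None else self.downsample(x)
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + identity)


class LightResNet20(nn.Module):
    """ResNet-20 skeleton with narrowed stage widths (my guess: 16/32/64)."""
    def __init__(self, widths=(16, 32, 64), num_classes=10):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, widths[0], 3, 1, 1, bias=False),
            nn.BatchNorm2d(widths[0]),
            nn.ReLU(inplace=True),
        )
        blocks, in_ch = [], widths[0]
        for i, w in enumerate(widths):
            for j in range(3):                      # 3 blocks per stage
                stride = 2 if (i > 0 and j == 0) else 1
                blocks.append(BasicBlock(in_ch, w, stride))
                in_ch = w
        self.blocks = nn.Sequential(*blocks)
        self.head = nn.Linear(widths[-1], num_classes)

    def forward(self, x):
        x = self.blocks(self.stem(x))
        x = x.mean(dim=(2, 3))                      # global average pooling
        return self.head(x)
```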

Oh, I noticed that the table you referenced regarding hyperparameters should come from the version we uploaded to arXiv. Strictly following the hyperparameters in that version will indeed result in an accuracy of 85-86% for SNN-ResNet20 in my reproduction as well.

The transformations used for Table II are RandomCrop, RandomHorizontalFlip, and normalization; this has been corrected in our TNNLS version. Additionally, I recommend setting the weight decay to 1e-4. This should lead to more satisfactory results. Apologies for any inconvenience caused.
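For reference, here is a minimal sketch of that augmentation pipeline and optimizer setting. The normalization statistics, learning rate, and momentum below are common CIFAR-10 defaults rather than the exact values in our released code, and the linear model is only a stand-in for SNN-ResNet20.

```python
import torch
import torchvision
import torchvision.transforms as transforms

# RandomCrop + RandomHorizontalFlip + normalization, as used for Table II.
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),   # common CIFAR-10 mean
                         (0.2470, 0.2435, 0.2616)),  # common CIFAR-10 std
])

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=train_transform)
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=128, shuffle=True, num_workers=4)

model = torch.nn.Linear(3 * 32 * 32, 10)  # stand-in; use SNN-ResNet20 here

# Weight decay of 1e-4 as recommended; lr/momentum are placeholders.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
```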

Thank you for the provided explanation. However, I am still unable to achieve 85% accuracy using the previous hyperparameters. Could you please provide the model parameters for ResNet20 that you used, or share the relevant code?

Sure. Here are the relevant training code and pre-trained weights for ResNet20, with an accuracy of 88.38%: link

Great! Thank you very much for your assistance. My experiment now reaches comparable accuracy. I have one more question regarding the Batch Normalization layer settings: what is the reason for initializing BatchNorm3d2 to 0.2*thresh?

Initializing this affine parameter close to zero gives the model a clean, branch-less starting point (only the shortcut) and thus provides faster convergence. You may refer to Section V.B of our paper for more details.
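For illustration, a minimal sketch of that initialization, assuming thresh is the neuron's firing threshold; the 2D BatchNorm and variable name below are stand-ins for the repo's BatchNorm3d2 layer.

```python
import torch.nn as nn

thresh = 0.5  # example firing threshold; use the value from your own config

bn2 = nn.BatchNorm2d(64)                     # stand-in for BatchNorm3d2
nn.init.constant_(bn2.weight, 0.2 * thresh)  # scale the affine gamma toward zero
# The affine bias defaults to zero, so the residual branch contributes almost
# nothing at initialization and each block starts out close to its shortcut path.
```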