Res2Net/Res2Net-PretrainedModels

Res2NeXt on Cifar100

qiangwang57 opened this issue · 13 comments

Hi @gasvn ,

Thanks for the brilliant work!

I have a couple of simple questions regarding Res2NeXt on Cifar100.

  1. The ImageNet implementation uses a block without hierarchical addition for downsampling, but the code you mentioned in other issue threads (https://gist.github.com/gasvn/cd7653ef93fb147be05f1ae4abad6589) instead uses group convolutions in the first block of each stage for downsampling. I wonder which one is the correct one?
  2. Did you use batch size 256 or 128 for training? I saw your initial LR was set to 0.05, which is what ResNeXt used for batch size 256.

Best wishes,

Qiang

gasvn commented

What's your reproduced number? The downsampling module has no hierarchical addition, and using either a group conv or the Res2Net ImageNet form for the downsampling module gives similar results on CIFAR-100. I used a batch size of 64 and lr=0.05 on CIFAR-100 without tuning.
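For concreteness, here is a minimal sketch of that distinction, adapted from my reading of the public Res2Net ImageNet code; the class name, argument names, and the omitted residual path are simplifications, not the exact training code.

```python
# Sketch of a Res2Net-style bottleneck. In the first block of a stage
# (stype='stage') the splits are processed independently (no hierarchical
# addition) and the untouched last split is average-pooled so sizes match.
import torch
import torch.nn as nn


class Bottle2neckSketch(nn.Module):
    def __init__(self, inplanes, planes, stride=1, scale=4, stype='normal'):
        super().__init__()
        width = planes // scale
        self.nums = scale - 1
        self.stype = stype
        self.width = width
        self.conv1 = nn.Conv2d(inplanes, width * scale, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(width * scale)
        # one 3x3 conv per split; the last split is passed through (or pooled)
        self.convs = nn.ModuleList([
            nn.Conv2d(width, width, 3, stride=stride, padding=1, bias=False)
            for _ in range(self.nums)])
        self.bns = nn.ModuleList([nn.BatchNorm2d(width) for _ in range(self.nums)])
        # used only in the downsampling ('stage') block
        self.pool = nn.AvgPool2d(3, stride=stride, padding=1)
        self.conv3 = nn.Conv2d(width * scale, planes, 1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        spx = torch.split(out, self.width, dim=1)
        outs = []
        for i in range(self.nums):
            if i == 0 or self.stype == 'stage':
                # downsampling block: no hierarchical addition between splits
                sp = spx[i]
            else:
                # normal block: add the previous split's output (hierarchical)
                sp = sp + spx[i]
            sp = self.relu(self.bns[i](self.convs[i](sp)))
            outs.append(sp)
        # last split: identity in a normal block, avg-pooled in a stage block
        outs.append(self.pool(spx[self.nums]) if self.stype == 'stage' else spx[self.nums])
        # residual/shortcut path omitted for brevity
        return self.relu(self.bn3(self.conv3(torch.cat(outs, dim=1))))
```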

gasvn commented

Please let me know if you still cannot reproduce our results.

Thanks @gasvn for the timely response.

I followed the ImageNet architecture with batch size 128 and lr 0.1 on 4 GPUs. I managed to reproduce the ResNeXt results, but for Res2NeXt I only get 80.78.

The only difference I found is the mean and std, where yours are
mean = [0.485, 0.456, 0.406],
std = [0.229, 0.224, 0.225],
which are the ImageNet statistics. I wonder why you chose them for cifar100?
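For clarity, this is the contrast I mean; the CIFAR-100 statistics below are the commonly cited ones, not values taken from your code.

```python
# Sketch of the two normalization choices being discussed.
import torchvision.transforms as T

imagenet_norm = T.Normalize(mean=[0.485, 0.456, 0.406],
                            std=[0.229, 0.224, 0.225])      # stats used in the gist

cifar100_norm = T.Normalize(mean=[0.5071, 0.4865, 0.4409],  # approximate CIFAR-100 stats
                            std=[0.2673, 0.2564, 0.2762])

train_transform = T.Compose([
    T.RandomCrop(32, padding=4),    # standard CIFAR augmentation
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    imagenet_norm,                  # swap in cifar100_norm to test the difference
])
```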

gasvn commented

I didn't notice this when I was training the Res2NeXt. Maybe you can try using one GPU with batch size 64 as I did. From my experience, it should not be hard to reproduce the result. I will send you my code once I find it.

Thanks @gasvn , I will try it and get back to you with the results.

Hi @gasvn ,

I have tried different combinations of downsampling block, batch size, lr, # GPUs, and mean and std, but unfortunately I did not manage to reproduce the results, or even come close. The best run so far still has an error of over 18%.

gasvn commented

Have you managed to reproduce our results?

gasvn commented

I managed to find the code I used for training the res2net on cifar100.
It can reproduce the result of Res2NeXt-29, 6c×24w×4s

  • BestPrec so far@1 83.020 in epoch 273

https://gist.github.com/gasvn/a1793919427f799e74bb7c900af11d4c

Perfect! Thank you very much! I will let you know the results!

I assume you used the following parameters for training:
batch size: 64
init LR: 0.05
single GPU
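In other words, something along these lines; the weight decay and LR schedule below are my guesses, not values from your script.

```python
# Sketch of the assumed training setup: single GPU, batch size 64, initial LR 0.05.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 100))  # stand-in for Res2NeXt-29

train_set = torchvision.datasets.CIFAR100(
    root='./data', train=True, download=True, transform=T.ToTensor())
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=64, shuffle=True, num_workers=4)  # batch size 64, single GPU

optimizer = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9,
                            weight_decay=5e-4, nesterov=True)  # init LR 0.05
# the milestones below are a typical 300-epoch CIFAR schedule, not confirmed in this thread
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[150, 225], gamma=0.1)
```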

Apart from those, is there anything else I need to pay special attention to?

Cheers, Qiang

gasvn commented

Have you managed to reproduce our results? Sorry, there is nothing else I can help you with.

When stride=2, the feature maps before and after have different sizes; how are they fused? Wouldn't adding them directly cause a problem?

Unfortunately, I did not manage to reproduce the results, or even come close.

Anyway, really appreciate your help!