google-research/augmix

There is a big gap between the results of the code and the results in the paper on CIFAR10-C

LinusWu opened this issue · 2 comments

I ran this script several times to test the performance of the proposed method on CIFAR10-C and CIFAR100-C:
python cifar.py -m resnext -e 200

The paper reports a 10.9% error rate on CIFAR10-C, but I got about 29%.
I then tried different models, but there is still a big gap between the results of this code and the results in the paper on CIFAR10-C.

The results on CIFAR100-C, however, are close to the paper's.
BTW, I did not modify the code.

I would like to know:

  1. Are the hyperparameter settings given for CIFAR-10 incorrect?
  2. Are there implementation details that the released code does not include?

Or could you please explain the discrepancy?

THANKS A LOT!

Do you have the same problem with ResNet? I just want to get a sense of whether the problem is consistent. If so, I'll try to look into it.

I trained CIFAR-10 on WRN and ResNeXt and got results close to the paper's results without touching the code.
WRN:

gaussian_noise
	Test Loss 0.001 | Test Error 18.808
shot_noise
	Test Loss 0.001 | Test Error 13.826
impulse_noise
	Test Loss 0.001 | Test Error 13.894
defocus_blur
	Test Loss 0.000 | Test Error 5.764
glass_blur
	Test Loss 0.001 | Test Error 18.914
motion_blur
	Test Loss 0.000 | Test Error 7.640
zoom_blur
	Test Loss 0.000 | Test Error 7.014
snow
	Test Loss 0.000 | Test Error 10.082
frost
	Test Loss 0.000 | Test Error 10.144
fog
	Test Loss 0.000 | Test Error 8.036
brightness
	Test Loss 0.000 | Test Error 5.594
contrast
	Test Loss 0.000 | Test Error 9.310
elastic_transform
	Test Loss 0.000 | Test Error 9.646
pixelate
	Test Loss 0.001 | Test Error 14.404
jpeg_compression
	Test Loss 0.000 | Test Error 12.446
Mean Corruption Error: 11.035

ResNeXt:

gaussian_noise
	Test Loss 0.001 | Test Error 25.776
shot_noise
	Test Loss 0.001 | Test Error 18.142
impulse_noise
	Test Loss 0.000 | Test Error 12.198
defocus_blur
	Test Loss 0.000 | Test Error 5.412
glass_blur
	Test Loss 0.001 | Test Error 19.578
motion_blur
	Test Loss 0.000 | Test Error 7.164
zoom_blur
	Test Loss 0.000 | Test Error 6.156
snow
	Test Loss 0.000 | Test Error 10.586
frost
	Test Loss 0.000 | Test Error 10.606
fog
	Test Loss 0.000 | Test Error 8.818
brightness
	Test Loss 0.000 | Test Error 5.130
contrast
	Test Loss 0.000 | Test Error 8.870
elastic_transform
	Test Loss 0.000 | Test Error 9.094
pixelate
	Test Loss 0.000 | Test Error 13.304
jpeg_compression
	Test Loss 0.000 | Test Error 11.392
Mean Corruption Error: 11.482
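
For anyone comparing their own runs against these numbers, here is a minimal sketch that recomputes the mean from the per-corruption lines. It assumes the log format shown above and that "Mean Corruption Error" is the unweighted average of the 15 per-corruption test errors, which is consistent with the WRN values listed. The helper names `parse_corruption_errors` and `mean_corruption_error` are hypothetical and not part of the augmix repo.

```python
import re

# WRN per-corruption results copied from the log above.
WRN_LOG = """
gaussian_noise
    Test Loss 0.001 | Test Error 18.808
shot_noise
    Test Loss 0.001 | Test Error 13.826
impulse_noise
    Test Loss 0.001 | Test Error 13.894
defocus_blur
    Test Loss 0.000 | Test Error 5.764
glass_blur
    Test Loss 0.001 | Test Error 18.914
motion_blur
    Test Loss 0.000 | Test Error 7.640
zoom_blur
    Test Loss 0.000 | Test Error 7.014
snow
    Test Loss 0.000 | Test Error 10.082
frost
    Test Loss 0.000 | Test Error 10.144
fog
    Test Loss 0.000 | Test Error 8.036
brightness
    Test Loss 0.000 | Test Error 5.594
contrast
    Test Loss 0.000 | Test Error 9.310
elastic_transform
    Test Loss 0.000 | Test Error 9.646
pixelate
    Test Loss 0.001 | Test Error 14.404
jpeg_compression
    Test Loss 0.000 | Test Error 12.446
"""


def parse_corruption_errors(log_text):
    """Hypothetical helper: map each corruption name to its test error.

    Assumes a name line followed by a 'Test Loss ... | Test Error X' line.
    """
    errors = {}
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        match = re.search(r"Test Error\s+([0-9.]+)", line)
        if match and current is not None:
            errors[current] = float(match.group(1))
            current = None
        elif line and not line.startswith("Mean Corruption Error"):
            # Any other non-empty line is treated as a corruption name.
            current = line
    return errors


def mean_corruption_error(errors):
    """Unweighted mean over corruption types (an assumption, see above)."""
    return sum(errors.values()) / len(errors)


wrn_errors = parse_corruption_errors(WRN_LOG)
wrn_mean = mean_corruption_error(wrn_errors)
print(f"{len(wrn_errors)} corruptions, mean error {wrn_mean:.3f}")
```

Pasting your own log into `WRN_LOG` makes it easy to see which corruption types account for a gap, rather than comparing only the final mean.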