RobustBench/robustbench

Incorrect preprocessing for ImageNet-C evaluation

imrahulr opened this issue · 2 comments

I see that the ImageNet-C evaluation uses the preprocessing: Resize(256)+CenterCrop(224)+ToTensor().

def load_imagenetc(
    n_examples: Optional[int] = 5000,
    severity: int = 5,
    data_dir: str = './data',
    shuffle: bool = False,
    corruptions: Sequence[str] = CORRUPTIONS,
    prepr: str = 'Res256Crop224'
) -> Tuple[torch.Tensor, torch.Tensor]:
    transforms_test = PREPROCESSINGS[prepr]

This causes discrepancies with the scores reported in the original papers (DeepAugment, AugMix, Standard RN-50). The ImageNet-C images are already 224x224, so Resize(256)+CenterCrop(224) upsamples them and then crops away the border. For consistency, only ToTensor() should be used.
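To make the effect concrete, here is a rough back-of-the-envelope sketch of what Resize(256)+CenterCrop(224) does to an image that is already 224x224. It only works out the geometry and ignores interpolation; `surviving_region` is a hypothetical helper written for this illustration, not part of robustbench or torchvision, and it assumes a square input and an even crop margin.

```python
from fractions import Fraction


def surviving_region(src_size: int, resize_to: int, crop_size: int):
    """Map a resize-then-center-crop back to source pixel coordinates.

    Assumes a square src_size x src_size image, so resizing the short side
    to resize_to yields a resize_to x resize_to image.
    Returns (start, end, scale) along one axis.
    """
    scale = Fraction(resize_to, src_size)   # upsampling factor
    offset = (resize_to - crop_size) // 2   # crop margin in resized pixels
    start = offset / scale                  # same margin in source pixels
    end = (offset + crop_size) / scale
    return start, end, scale


start, end, scale = surviving_region(224, 256, 224)
area_kept = ((end - start) / 224) ** 2

print(f"upsampling factor: {scale}")                      # 8/7
print(f"source pixels kept per axis: [{start}, {end})")   # [14, 210)
print(f"fraction of image area kept: {area_kept}")        # 49/64
```

Under these assumptions the model sees only the central 196x196 region of each ImageNet-C image, magnified by 8/7, which is a different input distribution from the one used in the original papers.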

Setting prepr='none' in load_imagenetc should solve the issue (assuming all the models can handle 224x224 images as input).

The suggestion of fixing prepr='none' in load_imagenetc definitely makes sense for the current models! We'll look into this!

Sorry, this took us a long time. It should be fixed by this PR: #85.