tensorflow/tpu

Clarification of RandAugment max value

rwightman opened this issue · 3 comments

The RandAugment paper mentions two maximum magnitude values: 30 in the paper body and 20 in the appendix. In the implementation in this repository, `_MAX_VALUE` is 10. The magnitudes are all scaled by that maximum with no clipping, so you could pass in a higher value. For some transforms the result would still be sensible; for others, an input of 20 or 30 would produce extreme or nonsensical values.

Does the paper reference an implementation like this one, where the level-scaling denominator is 10 regardless of the magnitude range? Or would implementations with ranges of 10, 20, or 30 each use a matching denominator, with the larger range simply providing more granularity?
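For reference, the scaling in question looks roughly like this (a minimal sketch; the constant name `_MAX_VALUE` follows the discussion above, and rotation's 30-degree maximum is the per-op range used in this style of implementation):

```python
_MAX_VALUE = 10.0  # fixed denominator, regardless of the chosen magnitude range

def rotate_level_to_arg(level):
    # Rotation scales linearly, reaching 30 degrees at level == _MAX_VALUE.
    # There is no clipping, so levels above 10 extrapolate past that range.
    return (level / _MAX_VALUE) * 30.0

print(rotate_level_to_arg(10))  # 30.0 degrees (intended maximum)
print(rotate_level_to_arg(20))  # 60.0 degrees
print(rotate_level_to_arg(30))  # 90.0 degrees
```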

Another question regarding RandAugment. For most of the augmentations, the 'intensity' of the augmentation increases with magnitude, so picking a larger magnitude means more translation, more rotation, more shear, etc. However, for the implementations of Solarize and Posterize here, this is not the case. A maximal magnitude for Solarize results in no change, while a 0 magnitude completely inverts the image. For Posterize, a maximal magnitude keeps 4 of the image bits, while a 0 magnitude keeps none (a black image).

The above didn't matter for AutoAugment, but for RandAugment shouldn't the intensity/severity of all augmentations move in the same direction?
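Concretely, the reversed direction comes from how the level maps to each op's argument. A hedged sketch of that mapping (helper names are illustrative, not the repo's exact functions):

```python
_MAX_VALUE = 10.0

def solarize_threshold(level):
    # Solarize inverts every pixel at or above the threshold, so a HIGHER
    # threshold means FEWER inverted pixels: level 10 -> threshold 256
    # (no pixel inverted), level 0 -> threshold 0 (every pixel inverted).
    return int((level / _MAX_VALUE) * 256)

def posterize_bits(level):
    # Posterize keeps this many bits per channel: level 10 -> keep 4 bits,
    # level 0 -> keep 0 bits (a black image).
    return int((level / _MAX_VALUE) * 4)

print(solarize_threshold(10), solarize_threshold(0))  # 256 0
print(posterize_bits(10), posterize_bits(0))          # 4 0
```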

@BarretZoph I have similar questions to the above, and they really confuse us. Could you please clarify? Thanks.

Hello, thank you for the questions.

We use a denominator of 10 no matter what the magnitude value is (10, 20, 30, etc.).

No, we actually had in mind making all operations "more extreme" as the magnitude increased, but for a few ops this was not the case (e.g. Posterize). We have since fixed this and have not noticed a big difference in performance.
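The fix presumably flips the level-to-argument mapping so severity increases with magnitude, along these lines (a hypothetical sketch, not the exact commit):

```python
_MAX_VALUE = 10.0

def solarize_threshold_fixed(level):
    # Higher level -> lower threshold -> more pixels inverted.
    # level 0 -> threshold 256 (no change), level 10 -> threshold 0 (full invert).
    return 256 - int((level / _MAX_VALUE) * 256)

def posterize_bits_fixed(level):
    # Higher level -> fewer bits kept -> stronger posterization.
    # level 0 -> keep all 8 bits (no change), level 10 -> keep 4 bits.
    return 8 - int((level / _MAX_VALUE) * 4)

print(solarize_threshold_fixed(0), solarize_threshold_fixed(10))  # 256 0
print(posterize_bits_fixed(0), posterize_bits_fixed(10))          # 8 4
```

With this direction, magnitude 0 leaves the image untouched for both ops, matching the behavior of the other augmentations.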