PyTorch Implementation of Mixup
$ python main.py --block_type basic --depth 110 --use_mixup --mixup_alpha 1 --outdir results
Model |
Test Error (1 run) |
ResNet-preact-56 w/ Mixup alpha = 0.5 (160 epochs) |
5.55 |
ResNet-preact-56 w/ Mixup alpha = 1 (160 epochs) |
5.62 (median of 3 runs) |
ResNet-preact-56 w/ Mixup alpha = 2 (160 epochs) |
6.14 |
ResNet-preact-56 w/ Mixup alpha = 1 (300 epochs) |
5.11 (median of 5 runs) |
ResNet-preact-110 w/ Mixup alpha = 1 (300 epochs) |
4.26 |
Model |
Test Error (median of 5 runs) |
Training Time |
ResNet-preact-56 w/o Mixup (160 epochs) |
5.85 |
98 min |
ResNet-preact-56 w/ Mixup (300 epochs) |
5.11 |
191 min |
$ python -u main.py --depth 56 --block_type basic --base_lr 0.2 --epochs 160 --milestones '[80, 120]' --seed 7 --outdir results/wo_mixup/00
$ python -u main.py --depth 56 --block_type basic --base_lr 0.2 --use_mixup --mixup_alpha 1 --epochs 300 --milestones '[150, 225]' --seed 7 --outdir results/w_mixup/00
- Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz. "mixup: Beyond Empirical Risk Minimization." arXiv preprint arXiv:1710.09412. arXiv:1710.09412