This is an implementation of the NIPS 2017 paper "Selective Classification for Deep Neural Networks", written as part of the NIPS Global Paper Implementation Challenge.
- Python 3.6
- Keras
- TensorFlow
- ImageNet validation dataset can be downloaded from here.
- ILSVRC2012_validation_ground_truth.txt contains the ground truth labels for the ImageNet validation dataset.
- imagenet-classes-dict.dat is a pickled dictionary that maps a class to a number from 1 to 1000, matching the ground truth labels in ILSVRC2012_validation_ground_truth.txt.
- The weights of the models trained as suggested in the paper on the CIFAR-10 and CIFAR-100 datasets can be downloaded from CIFAR-10 WEIGHTS (93.67% accuracy) and CIFAR-100 WEIGHTS (70.52% accuracy).
- Evaluating on CIFAR-10 dataset.
python eval/cifar10_vgg16.py
- Evaluating on CIFAR-100 dataset.
python eval/cifar100_vgg16.py
- Evaluating on ImageNet validation dataset using VGG16 top1.
python eval/vgg16_top1.py
- Evaluating on ImageNet validation dataset using VGG16 top5.
python eval/vgg16_top5.py
- Evaluating on ImageNet validation dataset using ResNet50 top1.
python eval/resnet50_top1.py
- Evaluating on ImageNet validation dataset using ResNet50 top5.
python eval/resnet50_top5.py
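
The top1 and top5 scripts differ in what counts as a correct prediction. The following is a minimal sketch of that difference, not the code in this repository. It assumes the model outputs one row of class probabilities per image and that the confidence score is the maximum softmax probability (the softmax response used in the paper). The function names `top_k_correct` and `softmax_response` are hypothetical.

```python
def top_k_correct(probs, labels, k=1):
    """Return 0/1 flags: 1 if the true label is among the k highest-scoring classes.

    probs  -- list of per-image probability rows (assumed model output)
    labels -- list of true class indices, 0-based in this sketch
    """
    flags = []
    for row, label in zip(probs, labels):
        # Class indices ordered from highest to lowest probability.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        flags.append(1 if label in ranked[:k] else 0)
    return flags


def softmax_response(probs):
    """Confidence of each prediction, taken as the largest class probability."""
    return [max(row) for row in probs]


if __name__ == "__main__":
    probs = [[0.7, 0.2, 0.1], [0.1, 0.5, 0.4], [0.35, 0.25, 0.4]]
    labels = [0, 2, 0]
    print(top_k_correct(probs, labels, k=1))
    print(top_k_correct(probs, labels, k=2))
    print(softmax_response(probs))
```

The correctness flags and confidence scores are the only inputs needed to compute risk and coverage for a given confidence threshold.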
- On CIFAR-10 dataset using VGG16.

| Desired Risk | Train Risk | Train Coverage | Test Risk | Test Coverage | Risk Bound |
|---|---|---|---|---|---|
| 0.01 | 0.0039 | 0.7044 | 0.0046 | 0.6964 | 0.0093 |
| 0.02 | 0.0121 | 0.8410 | 0.0140 | 0.8376 | 0.0199 |
| 0.03 | 0.0207 | 0.8896 | 0.0226 | 0.8868 | 0.0299 |
| 0.04 | 0.0294 | 0.9198 | 0.0293 | 0.9200 | 0.0399 |
| 0.05 | 0.0382 | 0.9482 | 0.0388 | 0.9492 | 0.0498 |
| 0.06 | 0.0473 | 0.9688 | 0.0477 | 0.9728 | 0.0599 |

- On CIFAR-100 dataset using VGG16.

| Desired Risk | Train Risk | Train Coverage | Test Risk | Test Coverage | Risk Bound |
|---|---|---|---|---|---|
| 0.02 | 0.0031 | 0.1288 | 0.0074 | 0.1354 | 0.0185 |
| 0.05 | 0.0319 | 0.4012 | 0.0344 | 0.4016 | 0.0488 |
| 0.10 | 0.0792 | 0.5584 | 0.0821 | 0.5646 | 0.0099 |
| 0.15 | 0.1268 | 0.6642 | 0.1279 | 0.6734 | 0.0149 |
| 0.20 | 0.1756 | 0.7698 | 0.1746 | 0.7672 | 0.0199 |
| 0.25 | 0.2253 | 0.8692 | 0.2263 | 0.8704 | 0.2499 |

- On ImageNet Validation dataset using VGG16 Top1.

| Desired Risk | Train Risk | Train Coverage | Test Risk | Test Coverage | Risk Bound |
|---|---|---|---|---|---|
| 0.02 | 0.0118 | 0.1619 | 0.1011 | 0.1582 | 0.0198 |
| 0.05 | 0.0418 | 0.4084 | 0.0429 | 0.4052 | 0.0498 |
| 0.10 | 0.0904 | 0.5608 | 0.0926 | 0.5660 | 0.0999 |
| 0.15 | 0.1395 | 0.6741 | 0.1373 | 0.6762 | 0.1499 |
| 0.20 | 0.1891 | 0.7762 | 0.1855 | 0.7817 | 0.1999 |
| 0.25 | 0.2388 | 0.8736 | 0.2337 | 0.8770 | 0.2499 |

- On ImageNet Validation dataset using VGG16 Top5.

| Desired Risk | Train Risk | Train Coverage | Test Risk | Test Coverage | Risk Bound |
|---|---|---|---|---|---|
| 0.01 | 0.0055 | 0.2556 | 0.0071 | 0.2534 | 0.0099 |
| 0.02 | 0.0152 | 0.4798 | 0.0176 | 0.4823 | 0.0199 |
| 0.03 | 0.0247 | 0.5870 | 0.0254 | 0.5929 | 0.0299 |
| 0.04 | 0.0343 | 0.6763 | 0.0341 | 0.6785 | 0.0399 |
| 0.05 | 0.0440 | 0.7589 | 0.0414 | 0.7646 | 0.0499 |
| 0.06 | 0.0537 | 0.8148 | 0.0521 | 0.8196 | 0.0599 |
| 0.07 | 0.0634 | 0.8654 | 0.0622 | 0.8681 | 0.0699 |

- On ImageNet Validation dataset using ResNet50 Top1.

| Desired Risk | Train Risk | Train Coverage | Test Risk | Test Coverage | Risk Bound |
|---|---|---|---|---|---|
| 0.02 | 0.0122 | 0.1733 | 0.0114 | 0.1722 | 0.0199 |
| 0.05 | 0.0422 | 0.4461 | 0.0455 | 0.4425 | 0.0499 |
| 0.10 | 0.0908 | 0.6141 | 0.0903 | 0.6156 | 0.0999 |
| 0.15 | 0.1399 | 0.7336 | 0.1374 | 0.7328 | 0.1499 |
| 0.20 | 0.1895 | 0.8438 | 0.1901 | 0.8458 | 0.1999 |
| 0.25 | 0.2392 | 0.9381 | 0.2389 | 0.9386 | 0.2499 |

- On ImageNet Validation dataset using ResNet50 Top5.

| Desired Risk | Train Risk | Train Coverage | Test Risk | Test Coverage | Risk Bound |
|---|---|---|---|---|---|
| 0.01 | 0.0053 | 0.2398 | 0.0062 | 0.2374 | 0.0999 |
| 0.02 | 0.0153 | 0.4965 | 0.0156 | 0.4984 | 0.0199 |
| 0.03 | 0.0249 | 0.6306 | 0.0236 | 0.6324 | 0.0299 |
| 0.04 | 0.0346 | 0.7374 | 0.0321 | 0.7370 | 0.0399 |
| 0.05 | 0.0442 | 0.8138 | 0.0408 | 0.8153 | 0.0499 |
| 0.06 | 0.0539 | 0.8710 | 0.0501 | 0.8714 | 0.0599 |
| 0.07 | 0.0636 | 0.9205 | 0.0622 | 0.9223 | 0.0699 |

- For top-5 ImageNet classification, a 3% error rate is guaranteed with 99.9% probability at about 60% test coverage.
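
To make the table columns concrete, here is a hedged sketch of how a confidence threshold, its coverage, its empirical risk, and a risk bound could be computed from per-example confidence scores and correctness flags. It is a simplified pure-Python variant in the spirit of the paper's selection with guaranteed risk procedure (binary search over thresholds plus a binomial tail bound), not the code in this repository. All function names are hypothetical, the inputs are assumed to be non-empty, and `delta=0.001` corresponds to the 99.9% probability mentioned above.

```python
import math


def binom_cdf(k, n, p):
    """P[Bin(n, p) <= k], with each term computed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for j in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(j + 1)
                    - math.lgamma(n - j + 1)
                    + j * math.log(p) + (n - j) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)


def risk_bound(errors, n, delta, tol=1e-7):
    """Approximate the b that solves P[Bin(n, b) <= errors] = delta by bisection."""
    lo, hi = errors / n, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(errors, n, mid) > delta:
            lo = mid  # bound still too optimistic, move up
        else:
            hi = mid
    return hi


def select_threshold(confidence, correct, target_risk, delta=0.001):
    """Binary-search a confidence threshold whose risk bound is below target_risk.

    Returns a dict for the highest-coverage threshold found, or None if no
    visited threshold satisfies the target.
    """
    m = len(confidence)
    order = sorted(range(m), key=lambda i: confidence[i])
    rounds = max(1, math.ceil(math.log2(m)))
    z_min, z_max = 0, m - 1
    best = None
    for _ in range(rounds):
        z = (z_min + z_max + 1) // 2
        theta = confidence[order[z]]
        # Keep only examples the classifier is confident enough to label.
        kept = [correct[i] for i in range(m) if confidence[i] >= theta]
        errors = len(kept) - sum(kept)
        # delta is split across the search rounds (union bound).
        bound = risk_bound(errors, len(kept), delta / rounds)
        if bound < target_risk:
            z_max = z  # target met, try a lower threshold for more coverage
            candidate = {"threshold": theta, "coverage": len(kept) / m,
                         "risk": errors / len(kept), "bound": bound}
            if best is None or candidate["coverage"] > best["coverage"]:
                best = candidate
        else:
            z_min = z  # target missed, raise the threshold
    return best
```

Running `select_threshold` on the training split gives the Train Risk, Train Coverage, and Risk Bound columns in this sketch; applying the returned threshold to held-out data would give the test columns.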