Experimental data for reproducing the CIFAR-10 results in the confident learning paper.
The code to generate these Confident Learning CIFAR-10 benchmarking results is available in the `cleanlab` Python package, specifically in `examples/cifar10`. We used cleanlab v0.1.0 for the original paper.
Because GitHub limits file sizes to 100MB, I cannot upload the trained ResNet-50 models (180MB each). Instead, for every setting, I upload an `out` log file with the accuracy at every batch and the test accuracy at every epoch. The file naming conventions are as follows:

- `out` -- the log files during training
- `train_mask.npy` -- boolean vector for which examples were pruned during training
- `cifar10__train__model_resnet50__pyx.npy` -- cross-validated, out-of-sample predicted probabilities for CIFAR-10 under the given noisy labels setting
- `cifar10_noisy_labels` -- folder containing all the noisy labels settings
- `experiments.bash` -- examples of the commands run to generate results
- `cifar10_train_crossval.py` -- training script to perform all CIFAR-10 experiments (get cross-validated probabilities, evaluate on the test set, train on a masked input to remove noisy examples)

The PyTorch-prepared CIFAR dataset is available here for download: cifar10/dataset
You can obtain standard (no noise added to the labels) predicted probabilities here.
These are computed using four-fold cross-validation with a ResNet-50 architecture. You can download the out-of-sample predicted probabilities for all training examples in CIFAR-10 for various noise and sparsity settings here:
- Noise: 0% | Sparsity: 0% | [LINK]
- Noise: 20% | Sparsity: 0% | [LINK]
- Noise: 40% | Sparsity: 0% | [LINK]
- Noise: 70% | Sparsity: 0% | [LINK]
- Noise: 20% | Sparsity: 20% | [LINK]
- Noise: 40% | Sparsity: 20% | [LINK]
- Noise: 70% | Sparsity: 20% | [LINK]
- Noise: 20% | Sparsity: 40% | [LINK]
- Noise: 40% | Sparsity: 40% | [LINK]
- Noise: 70% | Sparsity: 40% | [LINK]
- Noise: 20% | Sparsity: 60% | [LINK]
- Noise: 40% | Sparsity: 60% | [LINK]
- Noise: 70% | Sparsity: 60% | [LINK]
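Once one of these files is downloaded, a quick sanity check might look like the sketch below. It assumes the file is a 2-D array with one row per training example and one column per class, and that each row sums to roughly 1; neither is confirmed here. The function name `summarize_psx` and the file name `psx_demo.npy` are hypothetical, and a small synthetic array stands in for a real download.

```python
import os
import tempfile

import numpy as np


def summarize_psx(path):
    """Load a predicted-probability array and return (shape, argmax labels, rows_normalized)."""
    psx = np.load(path)
    if psx.ndim != 2:
        raise ValueError("expected a 2-D array of shape (n_examples, n_classes)")
    # Assumption: each row is a probability distribution over the classes.
    rows_normalized = bool(np.allclose(psx.sum(axis=1), 1.0, atol=1e-3))
    return psx.shape, psx.argmax(axis=1), rows_normalized


# Synthetic stand-in for a downloaded file (8 examples, 10 classes).
rng = np.random.default_rng(0)
fake = rng.random((8, 10))
fake /= fake.sum(axis=1, keepdims=True)

demo_path = os.path.join(tempfile.mkdtemp(), "psx_demo.npy")  # hypothetical file name
np.save(demo_path, fake)

shape, pred, normalized = summarize_psx(demo_path)
print(shape, normalized)
```

For a real download, replace `demo_path` with the path to the `.npy` file for the noise and sparsity setting of interest.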
Using the `psx` predicted probabilities above as input, I used `cleanlab`, the Python package that implements confident learning, to compute the label errors for every confident learning method in the CL paper, for every noise and sparsity setting. The outputs are boolean numpy arrays. They are ordered in the same order as the examples when loaded using `torch.utils.data.dataloader`. The PyTorch-prepared CIFAR dataset is available here for download: cifar10/dataset. If you load this dataset in PyTorch, indices will match exactly with the label error masks below.
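To illustrate how a boolean mask aligned with the dataset order can be used, here is a minimal NumPy sketch. It builds a mask with a simple argmax-disagreement heuristic on toy data. This heuristic is for illustration only: it is not cleanlab's implementation and not how the masks below were generated. The function name `argmax_disagreement_mask` is hypothetical, and treating `True` as "flagged as a label error" is an assumption.

```python
import numpy as np


def argmax_disagreement_mask(noisy_labels, psx):
    """Return a boolean array that is True where argmax(psx) differs from the given label."""
    noisy_labels = np.asarray(noisy_labels)
    psx = np.asarray(psx)
    if psx.ndim != 2 or psx.shape[0] != noisy_labels.shape[0]:
        raise ValueError("psx must have shape (n_examples, n_classes) matching the labels")
    return psx.argmax(axis=1) != noisy_labels


# Toy data: 4 examples, 3 classes.
labels = np.array([0, 1, 2, 1])
psx = np.array([
    [0.8, 0.1, 0.1],  # argmax 0, label 0: agrees
    [0.2, 0.7, 0.1],  # argmax 1, label 1: agrees
    [0.6, 0.3, 0.1],  # argmax 0, label 2: disagrees
    [0.1, 0.2, 0.7],  # argmax 2, label 1: disagrees
])

mask = argmax_disagreement_mask(labels, psx)
flagged = np.flatnonzero(mask)   # positions of flagged examples
kept = np.flatnonzero(~mask)     # positions to keep, e.g. for building a subset
print(flagged, kept)
```

Because the mask is positional, the same `kept` indices can be used to select examples from any container that preserves the dataset's original order.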
Column headers are formatted as: <sparsity * 10>_<noise * 10>.
| METHOD | 0_2 | 2_2 | 4_2 | 6_2 | 0_4 | 2_4 | 4_4 | 6_4 | 0_7 | 2_7 | 4_7 | 6_7 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C_confusion | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK |
|  | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK |
| CL: PBC | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK |
| CL: PBNR | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK |
| CL: C+NR | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK | LINK |
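The column headers in the table above can be decoded programmatically. The sketch below follows the stated `<sparsity * 10>_<noise * 10>` format; the helper names `parse_setting` and `describe_setting` are hypothetical and not part of cleanlab.

```python
def parse_setting(header):
    """Split a header such as '2_4' into (sparsity, noise) fractions."""
    sparsity_str, noise_str = header.split("_")
    return int(sparsity_str) / 10, int(noise_str) / 10


def describe_setting(header):
    """Format a header in the same style as the download list, using whole percentages."""
    sparsity_str, noise_str = header.split("_")
    return f"Noise: {int(noise_str) * 10}% | Sparsity: {int(sparsity_str) * 10}%"


sparsity, noise = parse_setting("2_4")
label = describe_setting("2_4")
print(sparsity, noise, label)
```

Under this reading, `2_4` is the setting with 20% sparsity and 40% noise.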
Copyright (c) 2017-2020 Curtis Northcutt. Released under the MIT License. See LICENSE for details.