[paper]
This is the official repo for our paper "Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels".
Knockoffs-SPR is a theoretically guaranteed sample selection algorithm for learning with noisy labels, which is provable to control the False-Selection-Rate (FSR) in the selected clean data.
python
numpy
scipy
scikit-learn
torch
torchvision
CIFAR-10 and CIFAR-100 can be downloaded using torchvision. The other datasets can be downloaded from the official link: WebVision, Clothing1M.
The datasets are expected to be stored in the folder ../data or specified by the root parameter, and arranged as follows:
│data/
├── CIFAR10/
│ ├── ......
├── CIFAR100/
│ ├── ......
├── webvision/
│ ├── info/
│ │ ├── ......
│ ├── google/
│ │ ├── ......
│ ├── val_images_256/
│ │ ├── ......
├── clothing1m/
│ ├── ......
(Optional)
├── imagenet/
│ ├── meta.mat
│ ├── ILSVRC2012_validation_ground_truth.txt
│ ├── val/
│ │ ├── ......
The pretained models can be downloaded from here and should be put in the folder ckpt.
We first pretrain the backbone network use SimSiam with the following command.
python pretrain_simsiam.py -a $ARCH --dataset $DATASET --num_classes $NUM_CLASS
where $ARCH is the architecture of the backbone network, $DATASET is the dataset name, and $NUM_CLASS is the number of classes in the dataset. Specific running commands are listed in the folder scripts.
Example training commands are listed in the folder scripts. You could try the following commands as a start.
Train on CIFAR10 with different noise setting:
python scripts/train_cifar.py
Example evaluation commands are listed in the folder scripts. You could try the following commands as a start.
Test on CIFAR10 with different noise setting:
python scripts/eval_cifar.py
Thanks to everyone who makes their code and models available. In particular,
For issues using Knockoffs-SPR, please submit a GitHub issue.
If you found the provided code useful, please cite our work.
@article{wang2023knockoffs,
title={Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels},
author={Wang, Yikai and Fu, Yanwei and Sun, Xinwei},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2023},
doi={10.1109/TPAMI.2023.3338268},
}