/Neural-Relation-Graph

Official PyTorch implementation of "Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data" (NeurIPS'23)

Primary LanguagePython

Neural-Relation-Graph

Official PyTorch implementation of "Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data", published at NeurIPS'23

Requirements

  • PyTorch 1.11 and timm 0.5.4 (check requirements.txt)
  • Set IMGNET_DIR in ./imagenet/data.py that contains ImageNet train and val directories.
  • We provide main experiment checkpoints, including data features via Google Drive.
  • Install gdown
   pip uninstall --yes gdown
   pip install gdown -U --no-cache-dir

Notes

  • The default save directory is ./results. You can specify a different directory by using the --cache_dir option (for both download and detection).
  • We train MAE-Large model for 50 epochs following official codes.
  • For ResNet50, we use the model trained by Timm.
  • For detection, you can reduce memory usage by half using half precision with --dtype float16, with a marginal performance drop. Also, using smaller --chunk (e.g., 50) reduces the memory usage, while leads to increased computation time.

Label error detection

ImageNet with synthetic label error (8%)

  • Download model, features, noisy labels (6.3GB):
python download.py -n mae_large_noise0.08_49
  • To conduct detection, run
python detect.py -n mae_large_noise0.08_49 --pow 4
  • The required GPU Memory is approximately 14GB. You can reduce memory usage by half using half precision with --dtype float16, with a marginal performance drop.

ImageNet validation set cleaning

  • Download model and features (6.4GB):
python download.py -n mae_large_49
  • To conduct detection, run
python detect_val.py -n mae_large_49 --pow 4

OOD detection

  • Download OOD datasets following this link.
  • Set OOD_DIR in ./imagenet/data.py (to contain dtd, iNaturalist, Places, SUN folders).
  • Download model and features (6.4GB for MAE-Large / 11GB for ResNet50):
python download.py -n [mae_large_49/resnet50]
  • To conduct OOD detection, run
python detect_ood.py -n [mae_large_49/resnet50] --pow 1
  • The required GPU Memory is approximately 14GB for MAE-Large and 18GB for ResNet50. You can reduce memory usage by half using half precision with --dtype float16, with a marginal performance drop.

Language and speech datasets

Citation

@article{kim2023neural,
  title={Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data},
  author={Kim, Jang-Hyun and Yun, Sangdoo and Song, Hyun Oh},
  journal={Advances in Neural Information Processing Systems},
  year={2023}
}