This repo contains a framework to run a number of published and unpublished out-of-distribution (OOD) detection methods, both on a novel benchmark introduced by this work and on the more established benchmark from MOS.
The implemented OOD detection methods can be divided into two groups:
- methods requiring finetuning on closed-set data;
- finetuning-free methods, which simply compare support data (the closed-set samples the other methods use for finetuning) with test data to produce normality scores for the latter.

The included code can therefore be used either to finetune a pretrained model on a specific OOD detection task before evaluating it, or to directly evaluate a pretrained model.
The dependencies are listed in the `requirements.txt` file.
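For instance, assuming a standard pip-based setup (a virtual environment is recommended):

```bash
# Install the framework's dependencies
pip install -r requirements.txt
```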
We downloaded the public ResNet-101 models for SimCLR, SupCon, CSI and SupCSI from the original repositories and converted them to make them compatible with our framework. The converted models can be found here. Downloaded models should be put in the `pretrained_models` directory.
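Based on the `checkpoint_path` values used in the evaluation commands below, that directory is expected to look roughly like this:

```text
pretrained_models/
├── resend.pth
├── simclr_r101.pth
├── supclr_r101.pth
├── CSI_r101.pth
├── supCSI_r101.pth
└── DINO_vitb.pth
```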
For our benchmark, refer to the package setup guide. NOTE: in this case the `path_dataset` argument is ignored and has no effect.
In order to evaluate a model on the MOS benchmark you will need the following datasets, organized as shown:
- ImageNet-1K:
  - train split: `ILSVRC/Data/CLS-LOC/train/nXXXXXXXX/nXXXXXXXX_<img_id>.JPEG`
  - val split: `ILSVRC/Data/CLS-LOC/val/nXXXXXXXX/ILSVRC2012_val_<img_id>.JPEG`
- iNaturalist: `iNaturalist/images/<img_id>.jpg`
- SUN: `SUN/images/sun_<img_id>.jpg`
- Places: `Places/images/<cls_first_letter>_<cls_name>_<img_id>.jpg`
- DTD (Texture): `dtd/images/<cls_name>/<cls_name>_<img_id>.jpg`
For the specific file paths utilized, refer to the txt file lists under `data/txt_lists/mos_bench`. The datasets' location on disk can be specified at runtime through the `path_dataset` argument (by default `~/data` is used).
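Putting the above together, with the default `path_dataset` the expected on-disk layout is roughly the following sketch (placeholders as in the list above):

```text
~/data/
├── ILSVRC/Data/CLS-LOC/
│   ├── train/nXXXXXXXX/nXXXXXXXX_<img_id>.JPEG
│   └── val/nXXXXXXXX/ILSVRC2012_val_<img_id>.JPEG
├── iNaturalist/images/<img_id>.jpg
├── SUN/images/sun_<img_id>.jpg
├── Places/images/<cls_first_letter>_<cls_name>_<img_id>.jpg
└── dtd/images/<cls_name>/<cls_name>_<img_id>.jpg
```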
The pretrained models considered in this work are:
- ImageNet-1K pretrainings:
  - ReSeND
  - ResNet-101, CE
  - ResNet-101, SimCLR
  - ResNet-101, SupCon
  - ResNet-101, CSI
  - ResNet-101, SupCSI
  - ViT-B, CE
  - ViT-B, DINO
- Larger pretrainings:
  - ViT-L, CE (ImageNet-21K)
  - ResNetV2-101x3 BiT, CE (ImageNet-21K)
  - ViT-L, CLIP (CLIP)
  - ViT-L, DINOv2 (LVD-142M)
The supported evaluation methods are:
- Finetuning-free:
  - Prototypes-based distance
  - k-NN (both with and without feature normalization)
  - Mahalanobis
  - ReSeND (ReSeND only)
  - MCM (CLIP only)
- Finetuning only:
  - MSP
  - ReAct
  - ASH
  - Normalizing flow
To reproduce the paper's main results, you can simply run the bash scripts contained in the `scripts` folder (in the case of `table_6.sh` you may want to adjust the path pointing to your finetuned models; refer to the first line of the script).
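For example, launching from the repo root (bash invocation assumed):

```bash
# Reproduce one of the paper's result tables
bash scripts/table_6.sh
```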
Below we report some extra details regarding the supported execution arguments.
In order to evaluate a model, the generic command to run is the following:

```bash
python train.py --only_eval \
    --network <network> --model <model> --checkpoint_path <ckpt_path> \
    --evaluator <evaluator> \
    --dataset <dataset> --support <support> --test <test> --data_order <order>
```
To select the pretrained model, choose the appropriate `network` and `model` (and, if specified, `checkpoint_path`) values.
The following combinations are supported:
- ReSeND: `--network resend --model resend --checkpoint_path "pretrained_models/resend.pth"`
- ResNet-101, CE (ImageNet-1K): `--network resnet101 --model CE`
- ResNet-101, SimCLR (ImageNet-1K): `--network resnet101 --model simclr --checkpoint_path "pretrained_models/simclr_r101.pth"`
- ResNet-101, SupCon (ImageNet-1K): `--network resnet101 --model supclr --checkpoint_path "pretrained_models/supclr_r101.pth"`
- ResNet-101, CSI (ImageNet-1K): `--network resnet101 --model CSI --checkpoint_path "pretrained_models/CSI_r101.pth"`
- ResNet-101, SupCSI (ImageNet-1K): `--network resnet101 --model supCSI --checkpoint_path "pretrained_models/supCSI_r101.pth"`
- ViT-B, CE (ImageNet-1K): `--network vit --model CE`
- ViT-B, DINO (ImageNet-1K): `--network vit --model DINO --checkpoint_path "pretrained_models/DINO_vitb.pth"`
- ViT-L, CE (ImageNet-21K): `--network vit --model CE-IN21k`
- BiT ResNetV2-101x3, CE (ImageNet-21K): `--network resnetv2_101x3 --model BiT`
- ViT-L, CLIP: `--network vit --model clip`
- ViT-L, DINOv2 (LVD-142M): `--network vit --model DINOv2`
To select the evaluation method, choose the appropriate `evaluator` value:
- Prototypes distance: `--evaluator prototypes_distance`
- k-NN: `--evaluator knn_distance`
- k-NN with normalized features: `--evaluator knn_ood`
- Mahalanobis: `--evaluator mahalanobis`
- ReSeND (ReSeND only): `--evaluator resend`
- MCM (CLIP only): `--evaluator MCM`
- MSP (finetuning only): `--evaluator MSP`
- ReAct (finetuning only): `--evaluator react`
- ASH (finetuning only): `--evaluator ASH`
- Normalizing flow (finetuning only, requires `--nf_head`): `--evaluator flow`
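Note that some evaluators are tied to a specific pretraining: for instance, MCM can only be paired with the CLIP model. A minimal sketch of that pairing (dataset arguments as described below):

```bash
# MCM is CLIP-only, so it must be combined with the CLIP network/model pair
python train.py --only_eval \
    --network vit --model clip \
    --evaluator MCM \
    --dataset <dataset> --support <support> --test <test>
```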
- To run our benchmark, choose a `dataset` value among `domainnet`, `dtd`, `patternnet`, `stanford_cars`, `sun`. As for the `support` and `test` arguments, specify a valid `split` value as described in the documentation. For this track, `data_order` must assume a value between `0` and `2` (inclusive; for more details, refer to the documentation).
- To evaluate a model on the benchmark from MOS, choose `--dataset mos_bench` and `--support imagenet`. As for the `test` argument, you can select one among `inaturalist`, `sun`, `places`, `dtd` (texture). For this track, the `data_order` argument is unused and must be left at its default value of `-1`.
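As a concrete example, combining values from the lists above (model, evaluator and test set are arbitrary choices here), an evaluation on the MOS benchmark would look like this:

```bash
# Evaluate the ImageNet-1K CE ResNet-101 with normalized-feature k-NN,
# using ImageNet-1K as support data and iNaturalist as OOD test data
python train.py --only_eval \
    --network resnet101 --model CE \
    --evaluator knn_ood \
    --dataset mos_bench --support imagenet --test inaturalist
```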
To perform a finetuning operation, omit the `--only_eval` argument (in the case of the `flow` evaluator, you must also pass the `--nf_head` argument, since an additional flow head must be trained). By default the finetuning operation covers 25 epochs (5 warmup epochs and 20 cosine annealing epochs, as specified in the paper).
If you find this code useful, please cite our paper:
```bibtex
@inproceedings{cappio2022relationalreasoning,
  title={Semantic Novelty Detection via Relational Reasoning},
  author={Francesco Cappio Borlino and Silvia Bucci and Tatiana Tommasi},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2022}
}
```