About ImageNet-100

Question

shuaiNJU opened this issue 2 years ago · 5 comments

Hi,
Could you provide a download link for the ImageNet-100 dataset, or the code for randomly selecting ImageNet-100 from ImageNet-1k? Thanks a lot!

Answer 1 · 2023-04-16T04:28:12.000Z

Sure! You can use the following script to create a subset ImageNet-K (e.g. K = 100) from ImageNet-1k. Just set --src-dir to your ImageNet-1k path.

import os
import shutil
from tqdm import tqdm
import argparse
import random

parser = argparse.ArgumentParser(description='Create ImageNet-100 subset',
                                 formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('--K', default=100, type=int, help='num of classes to be subsampled')
parser.add_argument('--src-dir', default='/path/to/imagenet-1k', type=str,
                    help='path to ImageNet-1k')
parser.add_argument('--dst-dir', default='datasets/imagenet-100', type=str,
                    help='output root dir for the subsampled dataset')
args = parser.parse_args()
os.makedirs(args.dst_dir, exist_ok=True)

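# Subsample K classes from ImageNet-1k; keep only class directories and sort
# them so the listing order does not depend on the filesystem
train_dir = os.path.join(args.src_dir, 'train')
classes = sorted(d for d in os.listdir(train_dir) if os.path.isdir(os.path.join(train_dir, d)))
class_names = random.sample(classes, args.K)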

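# Copy the same K class folders for both splits, so val/ must already be
# organized into per-class subfolders (dirs_exist_ok requires Python 3.8+)
for split in ['train', 'val']:
    for cls in tqdm(class_names):
        shutil.copytree(os.path.join(args.src_dir, split, cls),
                        os.path.join(args.dst_dir, split, cls),
                        dirs_exist_ok=True)
    print(f'### Created imagenet-{args.K} {split} ###')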
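
Note that the sampling above draws a different set of classes on every run. If you need the same subset each time, one option is to seed the sampling. The following is only a minimal sketch of that idea, not part of the original script: the helper name `sample_classes` and the `seed` parameter are hypothetical.

```python
import os
import random


def sample_classes(train_dir, k, seed=0):
    """Return k class folder names from train_dir, reproducibly for a given seed.

    `sample_classes` and `seed` are illustrative names, not part of the
    original script.
    """
    # Keep only directories and sort them, so the candidate list is identical
    # across machines and filesystems before sampling.
    classes = sorted(
        d for d in os.listdir(train_dir)
        if os.path.isdir(os.path.join(train_dir, d))
    )
    if k > len(classes):
        raise ValueError(f'requested {k} classes but found only {len(classes)}')
    # A private Random instance avoids touching the global random state.
    return random.Random(seed).sample(classes, k)
```

With this helper, `class_names = sample_classes(os.path.join(args.src_dir, 'train'), args.K, seed=0)` could replace the `random.sample` line, assuming a fixed seed is acceptable for your experiments.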

Answer 2 · 2023-05-11T09:19:50.000Z

Excuse me, does line 23 mean that the validation set of the randomly selected ImageNet-100 consists of the same 100 classes selected for the training set? That would mean the ImageNet validation set has already been sorted into per-class folders. Thanks!

Answer 3 · 2023-05-12T02:20:45.000Z

Hi! Here the in-distribution (ID) validation set is used to measure ID classification performance, so it needs to share the same set of classes as the training set.
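
Since the subsampling script copies `val/<class>` folders by name, it only works if the source validation set is already laid out as one subfolder per class. After creating the subset, a quick sanity check along the following lines can confirm that both splits share the same classes. This is a sketch under that layout assumption; `class_dirs` and `compare_splits` are hypothetical helper names.

```python
import os


def class_dirs(split_dir):
    """Return the set of class folder names directly under split_dir."""
    return {
        d for d in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, d))
    }


def compare_splits(root):
    """Compare class folders in root/train and root/val.

    Returns (missing_in_val, extra_in_val); both sets are empty when the
    two splits cover exactly the same classes.
    """
    train = class_dirs(os.path.join(root, 'train'))
    val = class_dirs(os.path.join(root, 'val'))
    return train - val, val - train
```

For example, `compare_splits('datasets/imagenet-100')` should return two empty sets if the subset was created correctly.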

Answer 4 · 2023-05-12T08:23:36.000Z

Got it! Could you also provide the script for evaluating ImageNet-100, including which score you use (KNN or Mahalanobis) and the value of K? Thanks a lot!

Answer 5 · 2023-05-18T08:34:54.000Z

Hi! The same script (eval_ood.py) can be used for evaluating ImageNet-100; just specify --in_dataset as 'ImageNet-100' (and the corresponding hyperparameters used for fine-tuning on ImageNet-100). We use KNN as the default score, as shown in the paper, but Mahalanobis also works well.
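
For readers unfamiliar with the KNN score, the general idea is to L2-normalize the features and use the negative distance from a test feature to its k-th nearest training feature, so that higher values indicate in-distribution samples. The snippet below is a dependency-free sketch of that general technique, not the code in eval_ood.py; the function names are hypothetical, and `k` is left as a parameter because the value used for ImageNet-100 is not stated in this thread.

```python
import math


def l2_normalize(v):
    """Scale a feature vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def knn_score(query, train_feats, k):
    """Negative Euclidean distance from query to its k-th nearest training feature.

    Higher scores mean the query lies closer to the training data, i.e. it
    looks more in-distribution. The choice of k is an assumption left to the caller.
    """
    if not 1 <= k <= len(train_feats):
        raise ValueError('k must be between 1 and the number of training features')
    q = l2_normalize(query)
    dists = sorted(math.dist(q, l2_normalize(t)) for t in train_feats)
    return -dists[k - 1]
```

In practice the features would come from the trained encoder and the neighbor search would use a vectorized or approximate nearest-neighbor library; this plain-Python version only illustrates how the score is defined.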