wvangansbeke/Unsupervised-Classification

Purpose of Memory Bank

HenryPengZou opened this issue · 2 comments

Hi, thanks for your great work!

I have a question regarding the usage of the memory bank when mining nearest neighbors in the following lines:

# Memory Bank
print(colored('Build MemoryBank', 'blue'))
base_dataset = get_train_dataset(p, val_transforms, split='train') # Dataset w/o augs for knn eval
base_dataloader = get_val_dataloader(p, base_dataset)
memory_bank_base = MemoryBank(len(base_dataset),
                              p['model_kwargs']['features_dim'],
                              p['num_classes'], p['criterion_kwargs']['temperature'])
memory_bank_base.cuda()
memory_bank_val = MemoryBank(len(val_dataset),
                             p['model_kwargs']['features_dim'],
                             p['num_classes'], p['criterion_kwargs']['temperature'])
memory_bank_val.cuda()

Can't we directly mine nearest neighbors from base_dataset and val_dataset? Why do we bother to use the memory bank?

Thanks a lot for your help in advance.

Hi @HenryPengZou,

The memory bank stores the features of all images so that the nearest neighbors of each image can subsequently be mined from them (see the MemoryBank.update method). The base_dataset is the training split loaded with a different set of augmentations (i.e. the validation set augmentations).
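
To make the idea concrete, here is a minimal sketch of what a memory bank does. It is not the repository's MemoryBank: the class TinyMemoryBank, its method names other than update, and the toy features are all assumptions made for illustration, and NumPy stands in for the model and GPU code. A dataset only yields images, so the features have to be computed batch by batch and kept somewhere before neighbors can be searched over all of them at once.

```python
import numpy as np


class TinyMemoryBank:
    """Hypothetical stand-in for a feature memory bank (illustration only)."""

    def __init__(self, n, dim):
        self.features = np.zeros((n, dim), dtype=np.float32)
        self.targets = np.zeros(n, dtype=np.int64)
        self.ptr = 0  # next free row

    def update(self, features, targets):
        # Append one batch of features (and labels) to the bank.
        b = features.shape[0]
        assert self.ptr + b <= self.features.shape[0], "memory bank is full"
        self.features[self.ptr:self.ptr + b] = features
        self.targets[self.ptr:self.ptr + b] = targets
        self.ptr += b

    def mine_nearest_neighbors(self, topk):
        # Assumes rows are L2-normalized, so the dot product is cosine similarity.
        sim = self.features @ self.features.T
        np.fill_diagonal(sim, -np.inf)  # exclude each sample from its own neighbors
        return np.argsort(-sim, axis=1)[:, :topk]


def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)


# Toy "features" standing in for model outputs: two tight pairs of points.
toy_features = l2_normalize(np.array(
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]], dtype=np.float32))
toy_targets = np.array([0, 0, 1, 1])

bank = TinyMemoryBank(n=4, dim=2)
# Fill the bank in two batches, as a dataloader loop would.
for start in range(0, 4, 2):
    bank.update(toy_features[start:start + 2], toy_targets[start:start + 2])

neighbors = bank.mine_nearest_neighbors(topk=1)
print(neighbors[:, 0])
```

In this sketch the neighbor search needs the full feature matrix, which is why the features are first accumulated in the bank rather than mined from the dataset directly.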

I see, thank you~