Official PyTorch implementation of our ICCV 2023 paper "A Soft Nearest-Neighbor Framework for Continual Semi-Supervised Learning".
Our proposed NNCSL (Nearest-Neighbor for Continual Semi-supervised Learning) is composed of a continual semi-supervised learner, CSL, and a nearest-neighbor distillation loss, NND.
CSL is our base continual semi-supervised learner, developed from PAWS and adapted to the novel continual semi-supervised scenario. An illustration of the architecture is as follows:
NND is our proposed distillation strategy based on the nearest-neighbor classifier; it transfers both class-level and feature-level knowledge. An illustration of the architecture is as follows:
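In code terms, the soft nearest-neighbor classification at the heart of both components (inherited from PAWS) can be sketched as below. This is a minimal, illustrative sketch only: the function name, tensor shapes, and temperature value are assumptions, not this repository's exact code.

```python
import torch
import torch.nn.functional as F

def soft_nn_predict(queries, support, support_labels, tau=0.1):
    """Soft nearest-neighbor classifier, PAWS-style (illustrative sketch).

    queries:        (B, D) embeddings to classify
    support:        (N, D) embeddings of labeled samples
    support_labels: (N, C) one-hot labels of the support set
    tau:            softmax temperature (assumed value)
    """
    queries = F.normalize(queries, dim=1)
    support = F.normalize(support, dim=1)
    # Similarity of each query to every labeled support sample
    sims = queries @ support.T / tau       # (B, N)
    weights = F.softmax(sims, dim=1)       # soft neighbor weights
    return weights @ support_labels        # (B, C) class probabilities
```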
Our implementation does not require many special packages, but please make sure that the following requirements are satisfied in your environment:
- Python 3.8
- PyTorch 1.7.1
- torchvision
- CUDA 11.0
- Apex with CUDA extension
- Other dependencies: PyYaml, numpy, opencv, submitit
For CIFAR-10 and CIFAR-100, the datasets can be auto-downloaded by torchvision. For ImageNet-100, please download the dataset and make sure that the images are organized as follows:
```
--configs
--src
--datasets
  |--imagenet100
    |--train
      |--n01330764
      |--...
    |--val
      |--n01330764
      |--...
```
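As an optional sanity check, torchvision's `ImageFolder` should discover 100 classes in each split once the images are laid out as above (the path below assumes that layout; adjust it to your own location):

```python
from torchvision import datasets

# Paths assume the layout shown above; adjust to your own location.
train_set = datasets.ImageFolder("datasets/imagenet100/train")
val_set = datasets.ImageFolder("datasets/imagenet100/val")
print(len(train_set.classes), len(val_set.classes))  # both should print 100
```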
It is also possible to change the organization of the images and the path to the datasets. Please refer to the config files for more details.
Once the dataset is ready, the experiment can be launched by the following commands:
```
# for CIFAR-10, 0.8% of labeled data, buffer size 500, using our NNCSL
python main.py --sel nncsl_train --fname configs/nncsl/cifar10/cifar10_0.8%_buffer500_nncsl.yaml

# for CIFAR-10, 0.8% of labeled data, buffer size 500, using CSL
python main.py --sel nncsl_train --fname configs/nncsl/cifar10/cifar10_0.8%_buffer500_csl.yaml

# for CIFAR-10, 0.8% of labeled data, buffer size 500, using PAWS
python main.py --sel nncsl_train --fname configs/nncsl/cifar10/cifar10_0.8%_buffer500_paws.yaml

# for CIFAR-100, 0.8% of labeled data, buffer size 500, using our NNCSL
python main.py --sel nncsl_train --fname configs/nncsl/cifar100/cifar100_0.8%_buffer500_nncsl.yaml

# for ImageNet-100, 1% of labeled data, buffer size 500, using our NNCSL
python main.py --sel nncsl_train --fname configs/nncsl/imagenet100/imgnt100_1%_buffer500_nncsl.yaml
```
You can easily change the buffer size by modifying the `buffer_size` parameter in the config file. We provide one example:

```
# for CIFAR-100, 0.8% of labeled data, buffer size 5120, using our NNCSL
python main.py --sel nncsl_train --fname configs/nncsl/cifar100/cifar100_0.8%_buffer5120_nncsl.yaml
```
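For reference, the relevant config entry might look like the following. This is a hypothetical excerpt; the exact nesting and surrounding keys may differ, so check the actual files in `./configs/nncsl/`:

```yaml
# hypothetical excerpt; see ./configs/nncsl/ for the real config structure
buffer_size: 5120   # number of replayed samples kept across tasks
```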
Please check the config files in ./configs/nncsl/ for other settings.
You can also try ratios of labeled data other than the given settings (0.8%, 1%, 5% and 25%). There are two steps:
- Generate the index files with `make_subset.py` in `subsets`. Below is an example that generates index files for CIFAR-10 with 50% of labeled data:

  ```
  python make_subset.py --dataset cifar10 --seed 0 --percent 50
  ```
- Point the corresponding config files at the index files: change `subset_path` and `subset_path_cls` to the corresponding paths. Please note that these two parameters should be consistent. An illustrative excerpt follows this list.
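As an illustration, the two entries might look like this in a config file; the file names below are hypothetical and should be replaced by the files that `make_subset.py` actually produced:

```yaml
# hypothetical paths; point these at the index files generated by make_subset.py
subset_path: subsets/cifar10_50%_labeled.txt          # assumed file name
subset_path_cls: subsets/cifar10_50%_labeled_cls.txt  # must stay consistent with subset_path
```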
We will keep updating our repository to make it easier to use and to share our recent progress on this project.
- Using standard reservoir replay buffer
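For background, a reservoir buffer keeps a uniform random sample of a data stream in fixed memory. The sketch below is a generic illustration of reservoir sampling, not this repository's implementation; the class and method names are assumptions:

```python
import random

class ReservoirBuffer:
    """Generic reservoir-sampling replay buffer (illustrative sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.num_seen = 0  # total stream items observed so far

    def add(self, item):
        self.num_seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(item)
        else:
            # Replace a stored item with probability capacity / num_seen,
            # which keeps every item seen so far equally likely to remain.
            idx = random.randrange(self.num_seen)
            if idx < self.capacity:
                self.buffer[idx] = item

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```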
This implementation is built on the source code of PAWS.
If you find our code or paper useful, please consider giving us a star or citing our work:
```
@InProceedings{Kang_2023_ICCV,
    author    = {Kang, Zhiqi and Fini, Enrico and Nabi, Moin and Ricci, Elisa and Alahari, Karteek},
    title     = {A Soft Nearest-Neighbor Framework for Continual Semi-Supervised Learning},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {11868-11877}
}
```