SupportNet: a novel incremental learning framework through deep learning and support data
This repository contains the implementation of SupportNet, which solves the catastrophic forgetting problem efficiently and effectively. A well-trained deep learning model typically cannot learn new knowledge without forgetting previously learned knowledge, a phenomenon known as catastrophic forgetting. Here we propose a novel method, SupportNet, to efficiently and effectively solve the catastrophic forgetting problem in the class incremental learning scenario. SupportNet combines the strengths of deep learning and the support vector machine (SVM): the SVM is used to identify the support data from the old data, which are fed to the deep learning model together with the new data for further training, so that the model can review the essential information of the old data while learning the new information. Two powerful consolidation regularizers are applied to stabilize the learned representation and ensure the robustness of the learned model. We validate our method both theoretically and empirically with comprehensive experiments on various tasks, which show that SupportNet drastically outperforms the state-of-the-art incremental learning methods and even reaches performance similar to that of a deep learning model trained from scratch on both the old and new data.
https://arxiv.org/abs/1806.02942
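To make the support-data idea concrete, below is a minimal sketch (our own illustration, not the repository's code) of how the support vectors of an SVM trained on the learned representations can serve as the exemplars of the old classes. The function name `select_support_data` and the random subsampling policy are assumptions for illustration; the paper's exact selection rule may differ.

```python
import numpy as np
from sklearn.svm import SVC

def select_support_data(old_features, old_labels, support_size):
    """Pick old-class examples that lie closest to the class boundaries.

    old_features: learned representations of the old data (the output of
    the mapping function), shape (n_samples, n_features).
    """
    svm = SVC(kernel="linear")
    svm.fit(old_features, old_labels)
    support_idx = svm.support_  # indices of the support vectors
    # If the support vectors exceed the exemplar budget, keep a random
    # subset (a simple policy for illustration only).
    if len(support_idx) > support_size:
        support_idx = np.random.choice(support_idx, support_size,
                                       replace=False)
    return old_features[support_idx], old_labels[support_idx]
```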
Datasets:
- MNIST
- CIFAR-10 and CIFAR-100
- The EC dataset: http://www.cbrc.kaust.edu.sa/DEEPre/dataset.html (This file contains the original sequence data and the labels. The pickle files, which are preprocessed from the original sequence data and serve as the feature files used by the scripts, can be provided upon request. We are sorry that we cannot release them publicly yet, since the paper has not been officially published. The feature files will be released once the paper is published.)
- The HeLa dataset: http://murphylab.web.cmu.edu/data/2DHeLa
- The BreakHis dataset: https://web.inf.ufpr.br/vri/breast-cancer-database/
Dependencies:
- TensorFlow (https://www.tensorflow.org/)
- TFLearn (http://tflearn.org/)
- CUDA (https://developer.nvidia.com/cuda-downloads)
- cuDNN (https://developer.nvidia.com/cudnn)
- scikit-learn (https://scikit-learn.org/)
- NumPy (https://www.numpy.org/)
- Jupyter Notebook (https://jupyter.org/)
For the EC dataset, the code is in the folder src_ec. The whole pipeline can be run by executing main.sh, which relies on supportnet.py, the complete implementation of SupportNet. icarl_level_1.py contains our implementation of iCaRL on this specific dataset. The other files are temporary testing files or libraries.
The experimental results are recorded in level_1_result.md.
The code and results for the remaining experiments are in the submodule myIL. It is written as Jupyter notebooks, so all the code and results are recorded together.
Illustration of class incremental learning. After we train a base model using all the available data at a certain time point (e.g., the classes seen so far), data from new classes arrive sequentially, and the model needs to learn the new classes without forgetting the previously learned ones.
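As a toy illustration of this protocol (our own example, not code from the repository), the following sketch splits a labeled dataset into incremental steps of a few classes each; at each step the learner sees only the new classes' data plus whatever support data it has retained:

```python
import numpy as np

def class_incremental_splits(labels, classes_per_step=2):
    """Yield index arrays, one group of new classes per incremental step."""
    labels = np.asarray(labels)
    for start in range(0, int(labels.max()) + 1, classes_per_step):
        new_classes = list(range(start, start + classes_per_step))
        yield np.where(np.isin(labels, new_classes))[0]
```

For MNIST this yields five steps: classes {0, 1}, then {2, 3}, and so on up to {8, 9}.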
Overview of our framework. The basic idea is to incrementally train a deep learning model efficiently using the new class data together with the support data of the old classes. We divide the deep learning model into two parts: the mapping function (all the layers before the last layer) and the softmax layer (the last layer). Using the learned representation produced by the mapping function, we train an SVM, from which we obtain the support vector indices and thus the support data of the old classes. To stabilize the learned representation of the old data, we apply two novel consolidation regularizers to the network.
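To make the regularization concrete, here is a hedged sketch of such a training loss in tf.keras (the repository itself uses TFLearn). The feature term pins the mapping function's output on the support data to what the previous model produced, and the weight term is an EWC-style penalty weighted by a diagonal estimate of the Fisher information; the attribute name `feature_extractor` and the coefficients `lambda_feat` and `lambda_ewc` are assumptions for illustration, and the paper's exact formulation may differ.

```python
import tensorflow as tf

def consolidation_loss(model, old_model, fisher, old_params,
                       x_batch, y_batch, x_support,
                       lambda_feat=1.0, lambda_ewc=1.0):
    # Cross-entropy on the mixed batch of new data and support data.
    ce = tf.keras.losses.sparse_categorical_crossentropy(
        y_batch, model(x_batch))
    loss = tf.reduce_mean(ce)

    # Feature regularizer: keep the current mapping function's output on
    # the support data close to the frozen old model's output.
    f_new = model.feature_extractor(x_support)  # hypothetical attribute
    f_old = tf.stop_gradient(old_model.feature_extractor(x_support))
    loss += lambda_feat * tf.reduce_mean(tf.square(f_new - f_old))

    # EWC-style regularizer: discourage changing weights that were
    # important for the old classes (high Fisher information).
    for w, w_old, f in zip(model.trainable_variables, old_params, fisher):
        loss += lambda_ewc * tf.reduce_sum(f * tf.square(w - w_old))
    return loss
```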
Main results. (A)-(E): Performance comparison between SupportNet and five competing methods on the five datasets, in terms of accuracy. For SupportNet and iCaRL, we set the support data (exemplar) size to 2000 for MNIST, CIFAR-10 and the enzyme data, 80 for the HeLa dataset, and 1600 for the breast tumor dataset. (F): The accuracy deviation of SupportNet from the "All Data" method with respect to the size of the support data. The x-axis shows the support data size; the y-axis shows the test accuracy deviation of SupportNet from the "All Data" method after incrementally learning all the classes of the HeLa subcellular structure dataset.
Notice that for the MNIST dataset, we reach almost the same performance as using all the data at each incremental learning iteration. That is, with our framework, we only need 2000 data points to reach the same performance level as using 50,000 data points on that dataset.