This code implements the subtask partitioning algorithm described in the HydraNets paper (Mullapudi et al.). It was originally designed to partition the ImageNet dataset.
To set up the virtual environment, run:
$ python3 -m venv <path/to/virtualenv>
$ source <path/to/virtualenv>/bin/activate
At a high level, the HydraNets algorithm to partition a superset of size C into n subsets is as follows:
- For each class, compute a feature representation by averaging the features from the final fully connected layer of an image classification network over several images of that class.
- Use k-means to find n centroids for n clusters.
- For each of the n centroids, assign the C / n closest classes to it (to handle the general case where C mod n != 0, just assign the remaining classes if fewer than C / n are left).
The paper covers the algorithm in its entirety in Section 3.1.
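The first and third steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in partition.py: the function names (mean_feature, assign_classes) are hypothetical, the centroids are taken as given (for example from any k-means implementation), centroids are processed in order and only pick from classes not yet assigned, and the group size is taken to be C / n rounded up so that the last centroid receives whatever remains.

```python
import math


def mean_feature(vectors):
    """Average a list of equal-length feature vectors element-wise."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_classes(class_features, centroids):
    """Give each centroid its closest still-unassigned classes.

    class_features: dict mapping class name -> averaged feature vector
    centroids: list of n centroid vectors (assumed to come from k-means)
    Returns a list of n lists of class names.
    """
    # Assumption: group size is ceil(C / n); the last group may be smaller.
    group_size = math.ceil(len(class_features) / len(centroids))
    unassigned = set(class_features)
    subsets = []
    for centroid in centroids:
        # Rank remaining classes by distance; the name breaks ties deterministically.
        ranked = sorted(
            unassigned,
            key=lambda name: (squared_distance(class_features[name], centroid), name),
        )
        chosen = ranked[:group_size]
        subsets.append(chosen)
        unassigned.difference_update(chosen)
    return subsets


# Toy example with C = 5 classes and n = 2 centroids (made-up 2-D features).
features = {
    "cat": mean_feature([[0.0, 0.1], [0.2, 0.1]]),
    "dog": [0.3, 0.0],
    "fox": [0.0, 0.5],
    "car": [5.0, 5.0],
    "truck": [5.2, 4.9],
}
centroids = [[0.0, 0.0], [5.0, 5.0]]
print(assign_classes(features, centroids))
# [['cat', 'dog', 'fox'], ['car', 'truck']]
```

Because the assignment is greedy, the order of the centroids can change the result when a class is close to more than one centroid.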
If your local repository already has a virtual environment, you can skip the creation step and just activate it.
After setting up your virtual environment, install the required dependencies by running:
$ pip3 install -r requirements.txt
Make sure your file structure follows this general format:
train
    class1
    class2
    ...
    classN
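As a quick sanity check of the layout above, a short script can list the class folders under train. This is only a sketch: the helper name list_class_dirs is hypothetical, and it assumes each class is a direct subdirectory of train. How partition.py itself reads the data is not shown here.

```python
from pathlib import Path


def list_class_dirs(train_root):
    """Return the sorted names of the subdirectories directly under train_root."""
    root = Path(train_root)
    if not root.is_dir():
        raise FileNotFoundError(f"expected a directory at {root}")
    # Plain files next to the class folders are ignored.
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())
```

Calling list_class_dirs("train") from the repository root should return one name per class folder.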
Then run the partitioning script:
$ python3 partition.py