Assembling Jigsaw Puzzles Efficiently
Reimplementing the 2016 paper "Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles" with newer and more efficient architectures.
The original implementation of the paper (with AlexNet) is based on this repository.
- Python 3
- PyTorch
- TensorFlow for logging and training visualization
Getting the Data
Note: ImageNet was unavailable for a while, so I used a torrent to download the 2012 Object Detection Training Dataset instead (the full ImageNet was too large to fit on my disk).
- Download the ILSVRC 2012 Object Detection Academic Torrent (1,281,168 images across 1000 ImageNet classes)
- Then download ILSVRC2012_img_train.tar using your favorite torrent client.
- Extract with
tar -xvf ILSVRC2012_img_train.tar
- Extract all sub tars into their own folders with
for f in n*.tar; do mkdir "${f%.tar}"; tar -xf "$f" -C "${f%.tar}"; done
- Move the new folders into {repository root}/imagenet/all and run
. Data is now ready!
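For systems without a Bash shell, the per-class extraction loop above can be approximated in Python. This is a minimal sketch using only the standard library; the function name `extract_class_archives` is hypothetical and not part of this repository.

```python
import tarfile
from pathlib import Path


def extract_class_archives(src_dir):
    """Extract every n*.tar in src_dir into a folder named after the archive.

    Hypothetical helper mirroring the shell loop: n01440764.tar is unpacked
    into n01440764/ next to the archive. Returns the created folders.
    """
    src = Path(src_dir)
    extracted = []
    for archive in sorted(src.glob("n*.tar")):
        target = src / archive.stem  # strip the .tar suffix
        target.mkdir(exist_ok=True)
        with tarfile.open(archive) as tar:
            tar.extractall(target)
        extracted.append(target)
    return extracted
```

Calling `extract_class_archives(".")` in the directory that holds the sub tars should leave one folder per class, which can then be moved as described above.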
Training the Network
Fill in the path information: IMAGENET_FOLD needs to point to the folder containing ImageNet.
./ [GPU_ID]
Or call the Python script directly:
python [*path_to_imagenet*] --checkpoint [*path_checkpoints_and_logs*] --gpu [*GPU_ID*] --batch [*batch_size*]
By default the network uses 1000 permutations with maximum Hamming distance selected using
To change the file name loaded for the permutations, open the file Dataset/ and change the permutation file in the method retrieve_permutations
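To illustrate what selecting permutations by Hamming distance can look like, here is a minimal sketch. It is not the selection script from this repository: it uses a greedy max-min (farthest-point) criterion, which may differ from the criterion the repository or the paper uses, and the function name `select_permutations` and its parameters are assumptions made for illustration.

```python
import itertools

import numpy as np


def select_permutations(num_tiles=9, num_perms=1000, seed=0):
    """Greedily pick permutations that are far apart in Hamming distance.

    Assumed criterion: each new permutation maximizes its distance to the
    closest permutation already chosen (max-min selection).
    """
    all_perms = np.array(
        list(itertools.permutations(range(num_tiles))), dtype=np.int8
    )
    if num_perms > len(all_perms):
        raise ValueError("num_perms exceeds the number of permutations")

    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(all_perms)))]
    # min_dist[i]: distance from candidate i to its nearest chosen permutation.
    min_dist = (all_perms != all_perms[chosen[0]]).sum(axis=1)

    while len(chosen) < num_perms:
        nxt = int(min_dist.argmax())
        chosen.append(nxt)
        dist = (all_perms != all_perms[nxt]).sum(axis=1)
        # Chosen candidates drop to distance 0, so they are never picked again.
        min_dist = np.minimum(min_dist, dist)

    return all_perms[chosen]
```

With the defaults this enumerates all 362,880 orderings of nine tiles, so it takes noticeably longer than a small call such as `select_permutations(num_tiles=4, num_perms=5)`.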
- The network input should be 64x64, but tiles are resized to 75x75; otherwise the output of conv5 is 2x2 instead of 3x3 as in the official architecture
- The jigsaw network is trained following the paper: SGD, LRN layers, 70 epochs
- Data augmentation is implemented to discourage learning shortcuts: spatial jittering, independent normalization of each patch, color jittering, and 30% of images converted to black and white
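The input-size note in the list above can be checked with simple arithmetic. The sketch below assumes AlexNet-style layer settings (an 11x11 conv1 with stride 2 and 3x3 max-pooling with stride 2 after conv1, conv2, and conv5); these values are assumptions, not read from this repository's code, and the reported size is taken after the final pooling layer.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding); assumed AlexNet-style settings.
LAYERS = [
    ("conv1", 11, 2, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]


def feature_map_size(tile_size):
    """Propagate a square tile through LAYERS and return the final side length."""
    size = tile_size
    for _name, kernel, stride, padding in LAYERS:
        size = out_size(size, kernel, stride, padding)
    return size


print(feature_map_size(64))  # 2
print(feature_map_size(75))  # 3
```

Under these assumptions a 64x64 tile ends at 2x2, while a 75x75 tile ends at the 3x3 map the fully connected layers expect.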
Results pending...