By Zexi Han.
National Laboratory of Pattern Recognition (NLPR) at the Chinese Academy of Sciences (CAS).
caffe-SSDH updates the Caffe version used by Caffe-DeepBinaryCode. It now supports experiments with batch normalization and Deep Residual Networks.
We tested the SSDH approach on the dataset from the Alibaba Large-scale Image Search Challenge, which contains about 5 million product images, but the performance was not as good as Kevin's experiments on the CIFAR-10 dataset, even though it should work better in theory. In future work, I will improve the original SSDH and update it here.
Introduction of the Alibaba Large-scale Image Search Challenge dataset:

Train set | Evaluation set | Query set
:------------:|:------------:|:------------:
1,950,998 | 3,195,334 | 1,417+3,567
The dataset is classified into 10 high-level concepts and 676 sub-concepts:
Tops | Skirts | Trousers and shorts | Bags | Shoes | Accessories | Snacks | Makeup | Drinks | Furnishing
Please feel free to leave suggestions or comments to Zexi Han (zexihan@me.com).
Supervised Learning of Semantics-Preserving Deep Hashing (SSDH)
Created by Kevin Lin, Huei-Fang Yang, and Chu-Song Chen at Academia Sinica, Taipei, Taiwan.
We present a simple yet effective supervised deep hashing approach that constructs binary hash codes from labeled data for large-scale image search. SSDH constructs hash functions as a latent layer in a deep network, and the binary codes are learned by minimizing an objective function defined over classification error and other desirable hash code properties. Compared to state-of-the-art results, SSDH achieves 26.30% (89.68% vs. 63.38%), 17.11% (89.00% vs. 71.89%) and 19.56% (31.28% vs. 11.72%) higher precision, averaged over different numbers of top returned images, for the CIFAR-10, NUS-WIDE, and SUN397 datasets, respectively.
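SSDH's latent layer uses sigmoid activations, so obtaining binary codes at test time amounts to thresholding the activations at 0.5. The following is a minimal sketch of that step, not the repository's code; the function name and toy activations are made up (the paper's experiments use 48 bits, not 4):

```python
import numpy as np

def binarize_latent(activations, threshold=0.5):
    """Map latent-layer sigmoid activations to binary hash codes.

    activations: (n_images, n_bits) array of values in [0, 1].
    Returns a (n_images, n_bits) array of 0/1 codes.
    """
    return (np.asarray(activations) >= threshold).astype(np.uint8)

# Toy example: two images, 4-bit codes.
acts = np.array([[0.9, 0.1, 0.7, 0.3],
                 [0.2, 0.8, 0.6, 0.4]])
codes = binarize_latent(acts)  # -> [[1, 0, 1, 0], [0, 1, 1, 0]]
```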
The details can be found in the following arXiv preprint. Presentation slides can be found here.
If you find our work useful in your research, please consider citing:
Supervised Learning of Semantics-Preserving Hashing via Deep Neural Networks for Large-Scale Image Search
Huei-Fang Yang, Kevin Lin, Chu-Song Chen
arXiv preprint arXiv:1507.00101
- MATLAB (tested with 2012b on 64-bit Linux)
- Caffe's prerequisites
Adjust Makefile.config and simply run the following commands:
$ make all -j8
$ make test -j8
$ make matcaffe
$ ./prepare.sh
For a faster build, compile in parallel by doing make all -j8, where 8 is the number of parallel threads for compilation (a good choice for the number of threads is the number of cores in your machine).
Launch matlab and run demo.m. This demo will generate 48-bit binary codes for each image using the proposed SSDH.
>> demo
Launch matlab and run run_cifar10.m to perform the evaluation of precision at k and mean average precision at k. We set k=1000 in the experiments. The bit length of binary codes is 48. This process takes around 12 minutes.
>> run_cifar10
Then, you will get the mAP result as follows.
>> MAP = 0.899731
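The two metrics computed by run_cifar10.m can be sketched as follows: rank the database by Hamming distance to the query's binary code, then measure the fraction of same-label items among the top k (precision at k) and the mean of the precision values at each relevant rank within the top k (average precision at k; mAP averages this over queries). This is a hedged numpy sketch, not the repository's MATLAB code; all names and the toy data are made up:

```python
import numpy as np

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")

def precision_at_k(query_label, db_labels, ranking, k):
    """Fraction of the top-k retrieved items sharing the query's label."""
    topk = db_labels[ranking[:k]]
    return float(np.mean(topk == query_label))

def average_precision_at_k(query_label, db_labels, ranking, k):
    """Mean of the precision values at each relevant rank within top k."""
    rel = db_labels[ranking[:k]] == query_label
    if not rel.any():
        return 0.0
    cum_hits = np.cumsum(rel)
    precisions = cum_hits[rel] / (np.flatnonzero(rel) + 1)
    return float(precisions.mean())

# Toy example: four database items with 4-bit codes, k=3.
db = np.array([[1, 0, 1, 0], [1, 0, 0, 0],
               [0, 1, 1, 1], [1, 0, 1, 1]], dtype=np.uint8)
labels = np.array([0, 1, 0, 0])
q_code = np.array([1, 0, 1, 0], dtype=np.uint8)
rank = hamming_rank(q_code, db)               # -> [0, 1, 3, 2]
p3 = precision_at_k(0, labels, rank, 3)       # -> 2/3
ap3 = average_precision_at_k(0, labels, rank, 3)  # -> 5/6
```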
Moreover, simply run the following commands to generate the precision-at-k curves:
$ cd analysis
$ gnuplot plot-p-at-k.gnuplot
You will reproduce the precision curves with respect to different numbers of top retrieved samples when the 48-bit hash codes are used in the evaluation.
Simply run the following command to train SSDH:
$ ./examples/SSDH/train.sh
After 50,000 iterations, the top-1 error is 9.7% on the test set of the CIFAR-10 dataset:
I0107 19:24:32.258903 23945 solver.cpp:326] Iteration 50000, loss = 0.0274982
I0107 19:24:32.259012 23945 solver.cpp:346] Iteration 50000, Testing net (#0)
I0107 19:24:36.696506 23945 solver.cpp:414] Test net output #0: accuracy = 0.903125
I0107 19:24:36.696543 23945 solver.cpp:414] Test net output #1: loss: 50%-fire-rate = 1.47562e-06 (* 1 = 1.47562e-06 loss)
I0107 19:24:36.696552 23945 solver.cpp:414] Test net output #2: loss: classfication-error = 0.332657 (* 1 = 0.332657 loss)
I0107 19:24:36.696559 23945 solver.cpp:414] Test net output #3: loss: forcing-binary = -0.00317774 (* 1 = -0.00317774 loss)
I0107 19:24:36.696565 23945 solver.cpp:331] Optimization Done.
I0107 19:24:36.696570 23945 caffe.cpp:214] Optimization Done.
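The log above reports SSDH's three objective terms: the classification error plus two regularizers on the latent layer, one pushing each activation toward 0 or 1 (forcing-binary, which is negative by construction, as in the log) and one keeping each bit's mean activation near 0.5 so every bit fires on about half of the inputs (50%-fire-rate). A rough numpy sketch of the two regularizers, following the paper's formulation as I understand it; the function names are made up:

```python
import numpy as np

def forcing_binary_loss(a):
    """Negative mean squared distance from 0.5: minimizing this pushes
    each latent activation toward 0 or 1."""
    a = np.asarray(a, dtype=float)
    return float(-np.mean((a - 0.5) ** 2))

def fifty_percent_fire_loss(a):
    """Sum of squared deviations of each bit's mean activation from 0.5:
    minimizing this makes every bit fire on about half of the inputs."""
    a = np.asarray(a, dtype=float)
    return float(np.sum((a.mean(axis=0) - 0.5) ** 2))

# Latent activations for a batch of 2 images with 3 bits.
batch = np.array([[0.9, 0.1, 0.5],
                  [0.1, 0.9, 0.5]])
fb = forcing_binary_loss(batch)      # negative: bits near 0/1 lower it
ff = fifty_percent_fire_loss(batch)  # zero here: each column averages 0.5
```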
The training process takes roughly 2~3 hours on a desktop with a Titan X GPU. You will finally get your model named SSDH48_iter_xxxxxx.caffemodel under the folder /examples/SSDH/.
To use the model, modify the model_file in demo.m to link to your model:
model_file = './YOUR/MODEL/PATH/filename.caffemodel';
Launch matlab, run demo.m and enjoy!
>> demo
It should be easy to train the model using another dataset as long as that dataset has label annotations.
- Convert your training/test set into leveldb/lmdb format using create_imagenet.sh.
- Modify the source in examples/SSDH/train_val.prototxt to link to your training/test set.
- Run ./examples/SSDH/train.sh, and start training on your dataset.
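Caffe's create_imagenet.sh reads plain-text listing files with one "image-path label" pair per line. A minimal Python sketch of writing such a listing file (the directory layout, file names, and labels below are made up):

```python
# Write the "image-path label" listing file that Caffe's
# create_imagenet.sh converts into leveldb/lmdb.
samples = [
    ("class0/img001.jpg", 0),  # hypothetical image paths
    ("class0/img002.jpg", 0),
    ("class1/img001.jpg", 1),
]

with open("train.txt", "w") as f:
    for path, label in samples:
        f.write(f"{path} {label}\n")
```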
Note: This documentation may contain links to third party websites, which are provided for your convenience only. Third party websites may be subject to the third party’s terms, conditions, and privacy statements.
If ./prepare.sh fails to download data, you may manually download the resources from:
Please feel free to leave suggestions or comments to Kevin Lin (kevinlin311.tw@iis.sinica.edu.tw), Huei-Fang Yang (hfyang@citi.sinica.edu.tw) or Chu-Song Chen (song@iis.sinica.edu.tw)