The SLIDE package contains the source code for reproducing the main experiments in this paper.
The dataset can be downloaded from Amazon-670K.
We suggest using the TensorFlow Docker image to install TensorFlow-GPU. For TensorFlow-CPU compiled with AVX2, we recommend using this precompiled build.
There is also a TensorFlow Docker image built specifically for CPUs with AVX-512 instructions. To get it, run:
docker pull clearlinux/stacks-dlrs_2-mkl
config.py controls the parameters of TensorFlow training, such as the learning rate.
example_full_softmax.py and example_sampled_softmax.py are example files for the Amazon-670K dataset with full softmax and sampled softmax, respectively.
Run
python python_examples/example_full_softmax.py
python python_examples/example_sampled_softmax.py
For simplicity, please refer to our Docker image with all environments installed. To replicate the experiment without setting up Hugepages, please download Amazon-670K to the path /home/code/HashingDeepLearning/dataset/Amazon
First, the CNPY package needs to be installed.
Additionally, Transparent Huge Pages must be enabled. SLIDE requires approximately 900 2MB pages and 10 1GB pages.
Please see the instructions for enabling Hugepages on Ubuntu.
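As a rough way to compare a machine against the page counts quoted here, the following Python sketch reads the kernel's per-size huge page counters. It assumes a Linux system that exposes them under /sys/kernel/mm/hugepages and that the requirement refers to pages reserved in the hugetlb pool. The helper names reserved_hugepages and missing_hugepages are illustrative and not part of SLIDE.

```python
from pathlib import Path

# Approximate page counts quoted in this README: page size in kB -> pages.
REQUIRED_PAGES = {2048: 900, 1048576: 10}


def reserved_hugepages(size_kb, sysfs_root="/sys/kernel/mm/hugepages"):
    """Return the number of reserved huge pages of this size (0 if unavailable)."""
    counter = Path(sysfs_root) / f"hugepages-{size_kb}kB" / "nr_hugepages"
    try:
        return int(counter.read_text().strip())
    except (OSError, ValueError):
        # Missing directory means this page size is not supported or not exposed.
        return 0


def missing_hugepages(sysfs_root="/sys/kernel/mm/hugepages"):
    """Map each page size (kB) to the number of pages still to be reserved."""
    shortfall = {}
    for size_kb, needed in REQUIRED_PAGES.items():
        have = reserved_hugepages(size_kb, sysfs_root)
        if have < needed:
            shortfall[size_kb] = needed - have
    return shortfall


if __name__ == "__main__":
    total_mib = sum(size_kb * n for size_kb, n in REQUIRED_PAGES.items()) / 1024
    print(f"Huge page memory implied by the README: about {total_mib:.0f} MiB")
    print("Shortfall per page size (kB):", missing_hugepages() or "none")
```

Taken together, the quoted counts correspond to about 12,040 MiB (900 x 2 MiB plus 10 x 1 GiB), so the machine needs at least that much memory free to reserve.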
Also, note that only Skylake or newer architectures support Hugepages. For older Haswell processors, the flag -mavx512f needs to be removed from the OPT_FLAGS line in the Makefile. You can also revert to commit 2d10d46b5f6f1eda5d19f27038a596446fc17cee to skip the HugePages optimization and still use SLIDE (which could lead to 30% slower performance).
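To decide whether the -mavx512f flag applies to a given machine, one option is to look at the CPU feature flags the kernel reports. The sketch below assumes an x86 Linux system where /proc/cpuinfo contains a "flags" line. The function names cpu_flags and suggested_action are hypothetical helpers, not part of SLIDE.

```python
def cpu_flags(cpuinfo_text):
    """Collect the feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def suggested_action(cpuinfo_text):
    """Say whether -mavx512f should stay in OPT_FLAGS for this CPU."""
    if "avx512f" in cpu_flags(cpuinfo_text):
        return "keep -mavx512f"
    return "remove -mavx512f from OPT_FLAGS"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(suggested_action(f.read()))
    except OSError:
        print("/proc/cpuinfo not available; check the CPU model manually")
```

The parsing is kept separate from the file read so the check can also be run on cpuinfo text copied from another machine.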
Run
make
./runme Config_amz.csv
Note that the Makefile needs to be modified based on the CNPY path. The trainData, testData, and logFile entries in Config_amz.csv also need to be changed accordingly.