CapsNet-Tensorflow
A TensorFlow implementation of CapsNet, based on Geoffrey Hinton's paper Dynamic Routing Between Capsules
Notes:
- The current version supports the MNIST and Fashion-MNIST datasets. The current test accuracy is 99.57% on MNIST and 90.60% on Fashion-MNIST; see the Results section for details.
- See dist_version for multi-GPU support.
- Here (Zhihu) is an article explaining my understanding of the paper. It may help in understanding the code.
Important:
If you need to apply the CapsNet model to your own datasets, or build a new model from the basic blocks of CapsNet, please follow my new project CapsLayer. It is an advanced library for capsule theory that aims to integrate capsule-relevant technologies, provide relevant analysis tools, develop related application examples, and promote the development of capsule theory. For example, you can easily use capsule layer blocks in your code through the APIs capsLayer.layers.fully_connected and capsLayer.layers.conv2d.
Requirements
- Python
- NumPy
- TensorFlow (I'm using 1.3.0; older versions have not been tested yet)
- tqdm (for displaying training progress info)
- scipy (for saving images)
Usage
Step 1. Download this repository with git or click the download ZIP button.
$ git clone https://github.com/naturomics/CapsNet-Tensorflow.git
$ cd CapsNet-Tensorflow
Step 2. Download the MNIST or Fashion-MNIST dataset. In this step, you have two choices:
- a) Automatic downloading with the download_data.py script
$ # for the MNIST dataset
$ python download_data.py
$ # for the Fashion-MNIST dataset
$ python download_data.py --dataset fashion-mnist --save_to data/fashion
- b) Manual downloading with wget or other tools: move and extract the dataset into the data/mnist or data/fashion-mnist directory, for example:
$ mkdir -p data/mnist
$ wget -c -P data/mnist http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
$ wget -c -P data/mnist http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
$ wget -c -P data/mnist http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
$ wget -c -P data/mnist http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
$ gunzip data/mnist/*.gz
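To check that the extracted files are intact before training, they can be parsed directly. The following is a minimal sketch, assuming the files follow the standard IDX layout (a 4-byte header, big-endian 32-bit dimension sizes, then unsigned bytes). The helper name read_idx is hypothetical and is not part of this repository.

```python
import gzip
import struct

import numpy as np


def read_idx(path):
    """Read an unsigned-byte IDX file (plain or .gz) into a NumPy array.

    Hypothetical helper for sanity-checking downloads; not the
    repository's own data loader.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as f:
        # Header: two zero bytes, a data-type code, and the number of dimensions.
        zero, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError("not an unsigned-byte IDX file: %s" % path)
        # Each dimension size is stored as a big-endian 32-bit unsigned integer.
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(shape)

# Example (assumes the files from the step above exist):
#   images = read_idx("data/mnist/train-images-idx3-ubyte")
#   labels = read_idx("data/mnist/train-labels-idx1-ubyte")
#   print(images.shape, labels.shape)
```

For the MNIST training files, the image array should have one 28x28 entry per label. A mismatch usually means a truncated download.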
Step 3. Start the training (using the MNIST dataset by default):
$ python main.py
$ # or training for fashion-mnist dataset
$ python main.py --dataset fashion-mnist
Step 4. Calculate test accuracy
$ python main.py --is_training=False
$ # for fashion-mnist dataset
$ python main.py --dataset fashion-mnist --is_training=False
Note: The default batch size is 128 and the default number of epochs is 50. You may need to modify the config.py file or use command-line parameters to suit your case, e.g. to set the batch size to 48 and write a test summary every 200 steps:
$ python main.py --test_sum_freq=200 --batch_size=48
Results
- training loss
- test accuracy (using reconstruction), reported below as test error in %
Routing iteration | 1 | 2 | 3 |
---|---|---|---|
Test error (%) | 0.43 | 0.44 | 0.49 |
Paper (%) | 0.29 | - | 0.25 |
My brief comments on capsules
- A new kind of neural unit (vector in, vector out, rather than scalar in, scalar out)
- The routing algorithm is similar to an attention mechanism
- Anyway, this is work with great potential, and there is a lot to be built upon it
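The "vector in, vector out" idea and the attention-like routing can be illustrated with a small NumPy sketch. This follows the squash nonlinearity and routing-by-agreement procedure described in the paper, but it is an illustrative assumption, not the TensorFlow code in this repository. The function names and the unbatched (num_in, num_out, dim_out) shape convention are chosen here for clarity.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-9):
    """Scale each vector's length into [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    # eps avoids division by zero for all-zero vectors.
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)


def dynamic_routing(u_hat, num_iters=3):
    """Route prediction vectors u_hat of shape (num_in, num_out, dim_out).

    Returns the output capsules v with shape (num_out, dim_out) and the
    coupling coefficients c with shape (num_in, num_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # Each input capsule distributes its output over all output capsules,
        # much like attention weights.
        c = softmax(b, axis=1)
        s = np.sum(c[:, :, None] * u_hat, axis=0)  # weighted sum per output capsule
        v = squash(s)
        # Agreement (dot product) between predictions and outputs raises the logits.
        b = b + np.sum(u_hat * v[None, :, :], axis=-1)
    return v, c

# Example with random predictions: 6 input capsules, 3 output capsules, 4-D outputs.
#   u_hat = np.random.RandomState(0).randn(6, 3, 4)
#   v, c = dynamic_routing(u_hat, num_iters=3)
#   print(np.linalg.norm(v, axis=-1))  # each length lies in [0, 1)
```

The num_iters argument corresponds to the "Routing iteration" column in the Results table. The length of each output vector can be read as the probability that the entity it represents is present.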
Reference
- XifengGuo/CapsNet-Keras: referred for some code optimizations