3D/Volumetric Convolutional Neural Networks with Theano+Lasagne.
voxnet
is based on Theano
and Lasagne.
You will also need path.py and scikit-learn. Scikit-learn is used purely for evaluation of accuracy and is an easily removable dependency.
You can do something like
git clone git@github.com:dimatura/voxnet.git
cd voxnet
pip install --editable .
In this example we will use the ModelNet 10 dataset, from the excellent 3D ShapeNet project.
To make our life easier we will use the voxelized version, which
is included in the source code distribution. Unfortunately,
it comes in evil .mat
files, so we will convert them to a
more python-friendly data format first.
scripts/download_shapenet10.sh
will try to download and convert the data
for you. This may take a while.
# scripts/download_shapenet10.sh
wget http://3dshapenets.cs.princeton.edu/3DShapeNetsCode.zip
unzip 3DShapeNetsCode
python convert_shapenet10.py 3DShapeNets
If you're curious, the data format is simply a tar
file
consisting of zlib-compressed .npy
files. Simple and effective!
We will be messy and do everything in the scripts/
directory.
cd scripts/
python train.py config/shapenet10.py shapenet10_train.tar
config/shapenet10.py
stores the model architecture
and hyperparameters related to training as Python code.
train.py
loads this code dynamically, compiles the Theano model,
and begins training with the data from shapenet10_train.tar
.
Note that compiling the Theano model might around a minute
for the first execution.
As soon as training begins, metrics will be printed to stdout and
learned weights are periodically saved to weights.npz
.
During training (which will take a few hours) you can monitor progress
visually by by running scripts/train_reports.py
. Note that this script has a
few dependencies, including seaborn
and pandas
. The script uses training
metrics stored in a file called metrics.jsonl
. This is simply a format with
one json
record per line (inspired by JSON Lines).
python test.py config/shapenet10.py shapenet10_test.tar --out-fname out.npz
test.py
uses the same model as train.py
, but only
for classifying instances from the test set. It performs
simple evaluation and optionally saves the predictions in an
.npz
file.
If you don't want to train your own, you can use the
--weights
option with the
an example result of running this script,
shapenet10_weights.npz
. This file was committed with
Git LFS, so you
can use that, or simply download the raw version
from github.
If you have CPU cycles to burn, try
python output_viz.py out.npz shapenet10_test.tar out.html
This will randomly select 10 instances from the test set,
render them with a very very inefficient renderer, and create
a small page called out.html
with the renders,
the ground truth and the predicted label (see example above).
Requires gizeh.
@inproceedings{maturana_iros_2015,
author = "Maturana, D. and Scherer, S.",
title = "{VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition}",
booktitle = "{IROS}",
year = "2015",
pdf = "/extra/voxnet_maturana_scherer_iros15.pdf",
}
- Cleaning up
- Testing
- More options
- Better visualization
- Use new cudnn 3D capabilities