This is a work-in-progress PyTorch implementation of PoseNet, built on the Pix2Pix code.
- Linux
- Python 3.5.2
- CPU or NVIDIA GPU + CUDA CuDNN
- Install PyTorch and dependencies from http://pytorch.org
- Install torchvision from source.
- Clone this repo:
git clone https://github.com/hazirbas/posenet-pytorch
cd posenet-pytorch
pip install -r requirements.txt
- Download a Cambridge Landmarks dataset (e.g. KingsCollege) into the datasets/ folder.
- Compute the image mean:
python util/compute_image_mean.py --dataroot datasets/KingsCollege --height 256 --width 455 --save_resized_imgs
- Train a model:
python train.py --dataroot ./datasets/KingsCollege --name posenet/KingsCollege/beta500 --beta 500 --gpu 0
- To view training errors and loss plots, set `--display_id 1`, run `python -m visdom.server`, and open the URL http://localhost:8097. Checkpoints are saved under `./checkpoints/posenet/KingsCollege/beta500/`.
- Test the model:
python test.py --dataroot ./datasets/KingsCollege --name posenet/KingsCollege/beta500 --gpu 0
The test errors will be saved to a text file under `./results/posenet/KingsCollege/beta500/`.
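
The `--beta` flag in the training command above sets how strongly the orientation error is weighted against the position error. As a rough illustration, the sketch below follows the loss form from the PoseNet paper: the Euclidean position error plus beta times the Euclidean distance between quaternions. The function name `posenet_loss` is hypothetical, and normalising both quaternions is an assumption of this sketch; the loss used in this repository's training code may differ in detail.

```python
import math


def _l2_distance(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _unit(q):
    """Scale a quaternion to unit length."""
    norm = math.sqrt(sum(x * x for x in q))
    return [x / norm for x in q]


def posenet_loss(pos_pred, pos_true, quat_pred, quat_true, beta):
    """Position error plus beta-weighted orientation error (hypothetical helper)."""
    pos_err = _l2_distance(pos_pred, pos_true)
    # Assumption: both quaternions are normalised before they are compared.
    ori_err = _l2_distance(_unit(quat_pred), _unit(quat_true))
    return pos_err + beta * ori_err
```

A larger beta makes orientation mistakes more expensive relative to position mistakes, which is why the value is tuned per scene.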
To initialize the model with pretrained weights, download the places-googlenet.pickle file into the pretrained_models/ folder:
wget https://vision.in.tum.de/webarchive/hazirbas/posenet-pytorch/places-googlenet.pickle
We use the training scheme defined in PoseLSTM. The best model for each dataset is selected by its median position error.
Dataset | beta | PoseNet (position, orientation) | Ours (position, orientation) | Model |
---|---|---|---|---|
King's College | 500 | 1.92 m, 5.40° | 1.34 m, 4.33° | epoch495 |
Old Hospital | 1500 | 2.31 m, 5.38° | 2.58 m, 5.77° | epoch455 |
Shop Façade | 100 | 1.46 m, 8.08° | 1.44 m, 8.26° | epoch470 |
St Mary's Church | 250 | 2.65 m, 8.48° | 2.40 m, 9.56° | epoch470 |
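
The table reports a position error in metres and an orientation error in degrees for each scene. The sketch below shows one common way to compute such errors and take their median over a test set. It assumes poses are given as (position, quaternion) pairs; the function names are hypothetical and this is not necessarily how `test.py` in this repository computes its numbers.

```python
import math
from statistics import median


def position_error(pos_pred, pos_true):
    """Euclidean distance between predicted and true positions (metres)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pos_pred, pos_true)))


def orientation_error_deg(quat_pred, quat_true):
    """Angle in degrees between two rotations given as quaternions."""
    norm_pred = math.sqrt(sum(q * q for q in quat_pred))
    norm_true = math.sqrt(sum(q * q for q in quat_true))
    dot = sum(p * t for p, t in zip(quat_pred, quat_true)) / (norm_pred * norm_true)
    # abs() treats q and -q as the same rotation; min() guards against rounding.
    cos_half_angle = min(1.0, abs(dot))
    return math.degrees(2.0 * math.acos(cos_half_angle))


def median_errors(predictions, ground_truth):
    """Return (median position error, median orientation error) over all poses."""
    pairs = list(zip(predictions, ground_truth))
    pos_errs = [position_error(p[0], g[0]) for p, g in pairs]
    ori_errs = [orientation_error_deg(p[1], g[1]) for p, g in pairs]
    return median(pos_errs), median(ori_errs)
```

Under this scheme, the checkpoint with the lowest median position error would be the one kept as the best model.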
@inproceedings{PoseNet15,
title={PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization},
author={Alex Kendall and Matthew Grimes and Roberto Cipolla},
booktitle={ICCV},
year={2015}
}
@inproceedings{PoseLSTM17,
author = {Florian Walch and Caner Hazirbas and Laura Leal-Taixé and Torsten Sattler and Sebastian Hilsenbeck and Daniel Cremers},
title = {Image-based localization using LSTMs for structured feature correlation},
month = {October},
year = {2017},
booktitle = {ICCV},
eprint = {1611.07890},
url = {https://github.com/NavVisResearch/NavVis-Indoor-Dataset},
}
Code is inspired by pytorch-CycleGAN-and-pix2pix.