PoseNet implementation in PyTorch

This is an ongoing PyTorch implementation of PoseNet, based on the Pix2Pix code.

Prerequisites

  • Linux
  • Python 3.5.2
  • CPU or NVIDIA GPU + CUDA CuDNN

Getting Started

Installation

  • Install PyTorch and dependencies from http://pytorch.org
  • Install torchvision from source.
  • Clone this repo:
git clone https://github.com/hazirbas/posenet-pytorch
cd posenet-pytorch
pip install -r requirements.txt

PoseNet train/test

  • Download a Cambridge Landmarks dataset (e.g. KingsCollege) into the datasets/ folder.
  • Compute the image mean:
python util/compute_image_mean.py --dataroot datasets/KingsCollege --height 256 --width 455 --save_resized_imgs
  • Train a model:
python train.py --dataroot ./datasets/KingsCollege --name posenet/KingsCollege/beta500 --beta 500 --gpu 0
  • To view training errors and loss plots, set --display_id 1, run python -m visdom.server and click the URL http://localhost:8097. Checkpoints are saved under ./checkpoints/posenet/KingsCollege/beta500/.
  • Test the model:
python test.py --dataroot ./datasets/KingsCollege --name posenet/KingsCollege/beta500 --gpu 0

The test errors will be saved to a text file under ./results/posenet/KingsCollege/beta500/.
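For readers who want to interpret the saved errors, the following is a minimal sketch of how per-image position and orientation errors, and their medians, are commonly computed for pose regression. It is not taken from test.py; the function names and the pose layout (a position triple plus a quaternion) are assumptions made for illustration.

```python
import math
from statistics import median


def position_error(p_true, p_pred):
    """Euclidean distance between two 3-D positions (same unit as the inputs)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_true, p_pred)))


def orientation_error_deg(q_true, q_pred):
    """Rotation angle in degrees between two quaternions.

    Both quaternions must use the same component order; they are
    normalized here, so the prediction need not be unit length.
    """
    norm_t = math.sqrt(sum(c * c for c in q_true))
    norm_p = math.sqrt(sum(c * c for c in q_pred))
    dot = sum(a * b for a, b in zip(q_true, q_pred)) / (norm_t * norm_p)
    # abs() treats q and -q as the same rotation; min() guards against
    # rounding pushing the value slightly above 1.
    dot = min(1.0, abs(dot))
    return math.degrees(2.0 * math.acos(dot))


def median_errors(poses_true, poses_pred):
    """Return (median position error, median orientation error in degrees).

    Each pose is assumed to be a (position, quaternion) pair.
    """
    pos_errs = [position_error(pt, pp)
                for (pt, _), (pp, _) in zip(poses_true, poses_pred)]
    ori_errs = [orientation_error_deg(qt, qp)
                for (_, qt), (_, qp) in zip(poses_true, poses_pred)]
    return median(pos_errs), median(ori_errs)
```

The median is used rather than the mean so that a few badly localized images do not dominate the reported number.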

Initialize the model with GoogLeNet pretrained on the Places dataset

If you would like to initialize the model with pretrained weights, download the places-googlenet.pickle file into the pretrained_models/ folder:

wget https://vision.in.tum.de/webarchive/hazirbas/posenet-pytorch/places-googlenet.pickle
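To inspect the downloaded file before training, a pickle can be opened with the standard library. The sketch below is illustrative only: the helper name is hypothetical, and the internal layout of places-googlenet.pickle (for example a dict of layer names to weight arrays) is an assumption, not something documented here. If the file contains NumPy arrays, NumPy must be installed for unpickling to succeed.

```python
import pickle


def load_pickled_weights(path):
    """Load a pickled object from `path` and return it unchanged."""
    with open(path, "rb") as f:
        # encoding="latin1" lets Python 3 read pickles written by Python 2;
        # it has no effect on pickles written by Python 3.
        return pickle.load(f, encoding="latin1")


def summarize(weights):
    """Return the sorted top-level keys if the object is a dict, else its type name."""
    if isinstance(weights, dict):
        return sorted(str(k) for k in weights)
    return type(weights).__name__
```

Calling summarize(load_pickled_weights("pretrained_models/places-googlenet.pickle")) would then list the top-level entries, assuming the file was saved to that path.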

Optimization scheme and loss weights

We use the training scheme defined in PoseLSTM. The best model for each dataset is selected by its median position error.

Dataset             beta    PoseNet           Ours              Model
King's College      500     1.92m, 5.40°      1.34m, 4.33°      epoch495
Old Hospital        1500    2.31m, 5.38°      2.58m, 5.77°      epoch455
Shop Façade         100     1.46m, 8.08°      1.44m, 8.26°      epoch470
St Mary's Church    250     2.65m, 8.48°      2.40m, 9.56°      epoch470
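The beta column appears to be the value passed to --beta during training. In the PoseNet paper, beta weights the orientation term against the position term in the loss. The sketch below shows that weighting in plain Python so it runs without PyTorch. It is an illustration of the paper's loss form, not this repository's criterion, which may differ (for example in the norm used). The function names are hypothetical, and normalizing both quaternions is an assumption.

```python
import math


def l2_norm(v):
    """Euclidean norm of a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))


def posenet_style_loss(x_pred, q_pred, x_true, q_true, beta):
    """Position error plus beta times orientation error for one sample.

    x_* are 3-D positions and q_* are quaternions with the same
    component order in prediction and ground truth.
    """
    pos_term = l2_norm([a - b for a, b in zip(x_pred, x_true)])

    # Normalize both quaternions so that only direction matters
    # (assumption: the ground truth may already be unit length).
    n_pred, n_true = l2_norm(q_pred), l2_norm(q_true)
    q_pred_unit = [c / n_pred for c in q_pred]
    q_true_unit = [c / n_true for c in q_true]
    ori_term = l2_norm([a - b for a, b in zip(q_pred_unit, q_true_unit)])

    return pos_term + beta * ori_term
```

Because unit quaternions differ by at most 2 in norm while position errors are measured in metres, a larger beta is typically needed for scenes with larger spatial extent so that orientation is not ignored.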

Citation

@inproceedings{PoseNet15,
  title={PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization},
  author={Alex Kendall and Matthew Grimes and Roberto Cipolla},
  booktitle={ICCV},
  year={2015}
}
@inproceedings{PoseLSTM17,
  author = {Florian Walch and Caner Hazirbas and Laura Leal-Taixé and Torsten Sattler and Sebastian Hilsenbeck and Daniel Cremers},
  title = {Image-based localization using LSTMs for structured feature correlation},
  month = {October},
  year = {2017},
  booktitle = {ICCV},
  eprint = {1611.07890},
  url = {https://github.com/NavVisResearch/NavVis-Indoor-Dataset},
}

Acknowledgments

Code is inspired by pytorch-CycleGAN-and-pix2pix.