
SNAP: Self-supervised Neural Maps for Visual Positioning and Semantic Understanding (NeurIPS 2023)

Primary language: Python · License: Apache License 2.0

SNAP!
Self-Supervised Neural Maps
for Visual Positioning and Semantic Understanding

Paul-Edouard Sarlin · Eduard Trulls
Marc Pollefeys · Jan Hosang · Simon Lynen

SNAP estimates 2D neural maps from multi-modal data like StreetView and aerial imagery.
Neural maps learn easily interpretable, high-level semantics through self-supervision alone
and can be used for geometric and semantic tasks.

This repository hosts the training and inference code for SNAP, a deep neural network that turns multi-modal imagery into rich 2D neural maps. SNAP was trained on a large dataset of 50M StreetView images with associated camera poses and aerial views. We release neither this dataset nor the trained models, so this code is provided solely as a reference and cannot be used as is to reproduce the results of the paper.

Usage

The project requires Python >= 3.10 and is based on JAX and Scenic. All dependencies are listed in requirements.txt.

  • The data is stored as a TensorFlow dataset and loaded in snap/data/loader.py.
  • Train SNAP with self-supervision:
python -m snap.train --config=snap/configs/train_localization.py \
    --config.batch_size=32 \
    --workdir=train_snap_sv+aerial
  • Evaluate SNAP for visual positioning:
python -m snap.evaluate --config=snap/configs/eval_localization.py \
    --config.workdir=train_snap_sv+aerial \
    --workdir=.  # unused
  • Fine-tune SNAP for semantic mapping:
python -m snap.train --config=snap/configs/train_semantics.py \
    --config.batch_size=32 \
    --config.model.bev_mapper.pretrained_path=train_snap_sv+aerial \
    --workdir=train_snap_sv+aerial_semantics
  • Evaluate the semantic mapping:
python -m snap.evaluate --config=snap/configs/eval_semantics.py \
    --config.workdir=train_snap_sv+aerial_semantics \
    --workdir=.  # unused
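
The commands above override config fields with dotted flags such as --config.batch_size=32. The sketch below illustrates how such overrides map onto a nested config, using plain Python dictionaries. It is not the mechanism the repository uses: the helper apply_overrides and the default values in base are hypothetical and chosen only for illustration. The field names are taken from the commands above.

```python
import ast
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with `--config.a.b=value` overrides applied.

    Hypothetical helper for illustration; not part of the SNAP code base.
    """
    result = copy.deepcopy(config)
    for flag in overrides:
        key, _, raw = flag.partition("=")
        path = key.removeprefix("--config.").split(".")
        # Parse numbers, booleans, etc.; fall back to the raw string.
        try:
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw
        # Walk down to the parent of the field being overridden.
        node = result
        for name in path[:-1]:
            node = node[name]
        if path[-1] not in node:
            raise KeyError(f"unknown config field: {'.'.join(path)}")
        node[path[-1]] = value
    return result


# Hypothetical defaults; the real values live in snap/configs/*.py.
base = {
    "batch_size": 64,
    "model": {"bev_mapper": {"pretrained_path": None}},
}

finetune = apply_overrides(
    base,
    [
        "--config.batch_size=32",
        "--config.model.bev_mapper.pretrained_path=train_snap_sv+aerial",
    ],
)
print(finetune)
```

Rejecting unknown fields, as this sketch does, catches typos in flag names before a long training run starts.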

BibTeX citation

If you use any ideas from the paper or code from this repo, please consider citing:

@inproceedings{sarlin2023snap,
  author    = {Paul-Edouard Sarlin and
               Eduard Trulls and
               Marc Pollefeys and
               Jan Hosang and
               Simon Lynen},
  title     = {{SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic Understanding}},
  booktitle = {NeurIPS},
  year      = {2023}
}

This is not an officially supported Google product.