
Implementation of CVPR'19 paper (oral) - ContextDesc: Local Descriptor Augmentation with Cross-Modality Context

ContextDesc implementation


TensorFlow implementation of ContextDesc for the CVPR'19 paper (oral) "ContextDesc: Local Descriptor Augmentation with Cross-Modality Context", by Zixin Luo, Tianwei Shen, Lei Zhou, Jiahui Zhang, Yao Yao, Shiwei Li, Tian Fang and Long Quan.

This paper focuses on augmenting off-the-shelf local feature descriptors with two types of context: visual context from a high-level image representation, and geometric context from the keypoint distribution. If you find this project useful, please cite:

  title={ContextDesc: Local Descriptor Augmentation with Cross-Modality Context},
  author={Luo, Zixin and Shen, Tianwei and Zhou, Lei and Zhang, Jiahui and Yao, Yao and Li, Shiwei and Fang, Tian and Quan, Long},
  journal={Computer Vision and Pattern Recognition (CVPR)},


Please use Python 3.6 and install NumPy, OpenCV (3.4.2), OpenCV-Contrib (3.4.2) and TensorFlow (1.14.0). Refer to requirements.txt for the remaining dependencies.
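As a quick sanity check of the environment, the sketch below compares the installed versions against the ones listed above. It assumes each package exposes a `__version__` attribute (NumPy, OpenCV as `cv2`, and TensorFlow do); the helper names are illustrative and not part of this repository.

```python
import importlib

# Versions named above. NumPy has no pinned version here, so it is only
# checked for presence (None means "any version").
REQUIRED = {"numpy": None, "cv2": "3.4.2", "tensorflow": "1.14.0"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_matches(installed, required):
    """True if `installed` begins with the dotted components of `required`."""
    if installed is None:
        return False
    if required is None:
        return True
    want = required.split(".")
    return installed.split(".")[:len(want)] == want


def check_environment(required=REQUIRED):
    """Map each module name to a tuple (installed_version, ok)."""
    report = {}
    for name, wanted in required.items():
        found = installed_version(name)
        report[name] = (found, version_matches(found, wanted))
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_environment().items():
        print("{:<12} {:<12} {}".format(name, str(found), "ok" if ok else "MISMATCH"))
```

Comparing only the leading components means a build suffix after the pinned version is still accepted, while a different patch release is reported as a mismatch.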

Pre-trained model

We provide both the Protobuf files (for a quick start) and checkpoint files (for research purposes) for restoring the pre-trained weights, including

        ├── loc.pb (weights of the local feature model and matchability predictor)
        ├── aug.pb (weights of the augmentation model) 
        └── model.ckpt-* (checkpoint files for both the local feature model and augmentation model)

Several variants of ContextDesc as in the paper are provided for study.

| Name | Downloads | Descriptions |
|------|-----------|--------------|
| Retrieval model | Link | (Regional feature) An image retrieval model trained on Google-Landmarks Dataset that provides high-level image representation to enrich visual context. More details can be found in the supplementary material. |
| ContextDesc | Link | (Base) Use GeoDesc [1] (ECCV'18) as the local feature model, and train only the augmentation model. |
| ContextDesc+ | Link | (Better) Train the local feature model and augmentation model separately with the proposed scale-aware N-pair loss. |
| ContextDesc++ | Link | (Best) End-to-end train both the local feature and augmentation models. |
| Dense-ContextDesc | Link | Densely extract features from the entire input image (instead of image patch). Details can be found here. |

The TensorFlow network definition can be found here. A usage example is provided along with the image matching example.

Training data

Part of the training data is released in GL3D. Please also cite MIRorR [2] if you find this dataset useful for your research.

Example scripts

1. Test image matching

To get started, clone the repo and download the pretrained model (take ContextDesc++ as an example):

cd /local && \
git clone https://github.com/lzx551402/contextdesc.git && \
cd /local/contextdesc/pretrained && \
wget https://research.altizure.com/data/contextdesc_models/contextdesc_pp.tar && \
tar -xvf contextdesc_pp.tar

Then run:

cd /local/contextdesc && python image_matching.py

The matching results from SIFT features (top), raw local features (middle) and augmented features (bottom) will be displayed.

  • To test the performance of a dense model, call the script with --dense_desc.
  • To use the TensorFlow checkpoint file for parameter restoring, call the script with --type ckpt.
  • Type python image_matching.py --h to view more options and test on your own images.

2. (TODO) Evaluation on HPatches Sequences

Download HPSequences (full image sequences of HPatches [3] and their corresponding homographies).
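Since this evaluation is not yet provided, the following is only a sketch of the geometric step such an evaluation typically relies on: projecting a keypoint from the reference image into a target image with the ground-truth 3x3 homography, then measuring how far a matched keypoint lies from that projection. The function names and the row-major nested-list layout are illustrative assumptions, not part of this repository or of the HPatches tooling.

```python
def warp_point(H, x, y):
    """Project (x, y) with a 3x3 homography H given as row-major nested lists."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return u / w, v / w


def reprojection_error(H, pt_ref, pt_tgt):
    """Euclidean distance between the warped reference point and the target point."""
    px, py = warp_point(H, pt_ref[0], pt_ref[1])
    return ((px - pt_tgt[0]) ** 2 + (py - pt_tgt[1]) ** 2) ** 0.5
```

A match would then usually be counted as correct when this error falls below a pixel threshold; the threshold itself is a choice of the evaluation protocol and is not specified here.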

3. Benchmark on image matching

ContextDesc, together with a learned matcher [4], won both the stereo and multi-view image matching tracks at IMW2019. We provide a script that prepares the ContextDesc features and formats the submission files for this challenge.

To get started, follow the challenge instructions to download the test data.

Next, configure the data paths (data_root, dump_root and submission_root) in configs/imw2019_eval.yaml.
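Before launching a long extraction run, it can help to verify that the three paths are filled in. The sketch below assumes the values have already been read into a flat Python dictionary (for example with a YAML parser); the real layout of configs/imw2019_eval.yaml may nest them differently, and `missing_paths` is a hypothetical helper. It also assumes that only data_root must exist beforehand, treating dump_root and submission_root as output locations.

```python
import os

REQUIRED_KEYS = ("data_root", "dump_root", "submission_root")


def missing_paths(config, keys=REQUIRED_KEYS):
    """Return the keys that are absent, empty, or (for data_root) not an existing directory."""
    problems = []
    for key in keys:
        value = config.get(key)
        if not value:
            problems.append(key)
        elif key == "data_root" and not os.path.isdir(value):
            # Assumption: input data must already be on disk, while the
            # output directories may be created later.
            problems.append(key)
    return problems
```

An empty list means the configuration passes this basic check.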

Then call the evaluation script by:

cd /local/contextdesc && python evaluations.py --config configs/imw2019_eval.yaml

You may then compress and submit the results to the challenge website.

4. Benchmark on visual localization

ContextDesc also achieved competitive results on the visual localization benchmark. Please download the Aachen Day-Night dataset and follow the evaluation instructions to prepare the evaluation data.

Next, configure the data paths (data_root and dump_root) in configs/aachen_eval.yaml.

Then extract the features by:

cd /local/contextdesc && python evaluations.py --config configs/aachen_eval.yaml

The extracted features will be saved alongside their corresponding images. For example, the features for image /local/Aachen_Day-Night/images/images_upright/db/1000.jpg will be written to /local/Aachen_Day-Night/images/images_upright/db/1000.jpg.contextdesc10k_upright (the method name here is contextdesc10k_upright).
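The naming rule above amounts to appending the method name to the image path as an extra extension. A minimal sketch, where `feature_path` and `images_without_features` are illustrative helpers rather than functions from this repository, and the `.jpg` filter is an assumption about the image format:

```python
import os


def feature_path(image_path, method_name="contextdesc10k_upright"):
    """Feature file sits next to the image: <image path>.<method name>."""
    return "{}.{}".format(image_path, method_name)


def images_without_features(image_root, method_name="contextdesc10k_upright",
                            exts=(".jpg",)):
    """List images under image_root whose feature file does not exist yet."""
    missing = []
    for dirpath, _, filenames in os.walk(image_root):
        for name in filenames:
            # Feature files end with the method name, so they are skipped here.
            if name.lower().endswith(exts):
                path = os.path.join(dirpath, name)
                if not os.path.isfile(feature_path(path, method_name)):
                    missing.append(path)
    return sorted(missing)
```

Running `images_without_features` on the image directory after extraction gives a simple way to spot images that were skipped.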

Finally, refer to the evaluation script to generate and submit the results to the challenge website.


  • Add TensorFlow network definition.


  • A major code refactoring.
  • Add evaluation instructions on image matching and visual localization benchmark.
  • Add experimental Dense-ContextDesc model.