image-matching-benchmark-baselines

Baselines for the Image Matching Benchmark and Challenge

Primary language: Jupyter Notebook. License: Apache License 2.0.

Summary

This repository contains utilities to extract local features for the Image Matching Benchmark and its associated challenge. For details please refer to the website.

Data

Data can be downloaded here: you may want to download the images for validation and testing. Most of the scripts assume that the images are in ../imw-2020, as follows:

$ ~/image-matching-benchmark-baselines $ ls ../imw-2020/
british_museum           lincoln_memorial_statue  milan_cathedral  piazza_san_marco  sacre_coeur      st_pauls_cathedral  united_states_capitol
florence_cathedral_side  london_bridge            mount_rushmore   reichstag         sagrada_familia  st_peters_square

$ ~/image-matching-benchmark-baselines $ ls ../imw-2020/british_museum/
00350405_2611802704.jpg  26237164_4796395587.jpg  45839934_4117745134.jpg  [...]

You may need to format the validation set in this way.
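
A quick way to confirm the layout before running the scripts is to count the images per scene. The following is a minimal sketch, not part of the repository: the function name count_images_per_scene is hypothetical, and it assumes every scene is a direct subfolder of ../imw-2020 holding .jpg files, as in the listing above.

```python
from pathlib import Path


def count_images_per_scene(root):
    """Return {scene_name: number of .jpg files} for each subfolder of root."""
    counts = {}
    for scene in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        # Compare the extension case-insensitively so .JPG files are counted too.
        counts[scene.name] = sum(
            1 for f in scene.iterdir() if f.suffix.lower() == ".jpg"
        )
    return counts


if __name__ == "__main__":
    data_root = Path("../imw-2020")  # default location assumed by most scripts
    if data_root.is_dir():
        for scene, n in count_images_per_scene(data_root).items():
            print("{}: {} images".format(scene, n))
    else:
        print("{} not found".format(data_root))
```

A scene that reports zero images probably still needs to be rearranged into this layout.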

Installation

Initialize the submodules by running the following:

git submodule update --init

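We provide support for the following methods, each described in the sections below: HardNet, SOSNet, L2Net, the Log-Polar Descriptor, Geodesc, Superpoint, D2-Net, ContextDesc, DELF, LF-Net, R2D2, and VLFeat features (via Matlab).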

We have pre-packaged conda environments: see below for details. You can install miniconda following these instructions (we have had problems with the latest version -- consider an older one). You can install an environment with:

conda env create -f system/<environment>.yml

And switch between them with:

conda deactivate
conda activate <environment>

Patch-based descriptors

Pre-extracting patches for patch-based descriptors

Many learned descriptors require pre-generated patches. This functionality is useful by itself, so we moved it to a separate package. You can install it with pip install extract_patches (please note that this requires Python 3.6, as the package is generated via nbdev). You may do this with the system/r2d2-python3.6.yml environment (which also requires 3.6 due to formatted string literals) or create a different environment.

To extract patches with the default configuration to ../benchmark-patches-8k, run:

python detect_sift_keypoints_and_extract_patches.py

This will create the following HDF5 files:

$ stat -c "%s %n" ../benchmark-patches-8k/british_museum/*
6414352 ../benchmark-patches-8k/british_museum/angles.h5
12789024 ../benchmark-patches-8k/british_museum/keypoints.h5
2447913728 ../benchmark-patches-8k/british_museum/patches.h5
6414352 ../benchmark-patches-8k/british_museum/scales.h5
6414352 ../benchmark-patches-8k/british_museum/scores.h5
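
To check that extraction finished for every scene, you can look for the five files listed above. This is a small sketch rather than a repository utility: missing_patch_files is a hypothetical helper, and it assumes one subfolder per scene under the output folder, as shown for british_museum.

```python
from pathlib import Path

# File names taken from the listing above.
EXPECTED_FILES = ("angles.h5", "keypoints.h5", "patches.h5", "scales.h5", "scores.h5")


def missing_patch_files(root):
    """Return {scene_name: [missing file names]} for scenes with incomplete output."""
    missing = {}
    for scene in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        absent = [name for name in EXPECTED_FILES if not (scene / name).is_file()]
        if absent:
            missing[scene.name] = absent
    return missing
```

Under these assumptions, missing_patch_files("../benchmark-patches-8k") returns an empty dictionary when every scene is complete.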

You can also extract patches with a fixed orientation with the flag --force_upright=no-dups-more-points: this option will filter out duplicate orientations and add more points until it reaches the keypoint budget (if possible).

python detect_sift_keypoints_and_extract_patches.py --force_upright=no-dups-more-points --folder_outp=../benchmark-patches-8k-upright-no-dups
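
The idea behind no-dups-more-points can be illustrated in a few lines. The sketch below is an assumption about the behaviour described above, not the code of extract_patches. SIFT may report the same location and scale several times with different orientations, and these become duplicates once the orientation is fixed. The sketch keeps the highest-scoring entry per location and scale, then continues down the ranked list until the budget is filled. The tuple layout (x, y, scale, angle, score) and the function name are hypothetical.

```python
def upright_no_dups(keypoints, budget):
    """Select up to `budget` upright keypoints without duplicates.

    keypoints: iterable of (x, y, scale, angle, score) tuples.
    Returns a list of (x, y, scale, 0.0, score) tuples, best score first.
    """
    seen = set()
    selected = []
    # Visit candidates from the highest score to the lowest.
    ranked = sorted(keypoints, key=lambda k: k[4], reverse=True)
    for x, y, scale, _angle, score in ranked:
        if len(selected) >= budget:
            break  # keypoint budget reached
        key = (x, y, scale)
        if key in seen:
            continue  # same point with another orientation: a duplicate once upright
        seen.add(key)
        selected.append((x, y, scale, 0.0, score))  # orientation forced to zero
    return selected
```

Skipping a duplicate does not consume the budget, which is why lower-ranked points can take its place.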

These settings generate up to about 8000 features per image, which requires lowering the SIFT detection threshold. If you want fewer features (~2k), you may want to use the default detection threshold, as the results are typically slightly better:

python detect_sift_keypoints_and_extract_patches.py --n_keypoints 2048 --folder_outp=../benchmark-patches-default --lower_sift_threshold=False
python detect_sift_keypoints_and_extract_patches.py --n_keypoints 2048 --force_upright=no-dups-more-points --folder_outp=../benchmark-patches-default-upright-no-dups --lower_sift_threshold=False

After this you can extract features with run_<method>.sh, or following the instructions below. The shell scripts use reasonable defaults: please refer to each individual wrapper for further settings (upright patches, different NMS, etc).

Extracting descriptors from pre-generated patches

For HardNet (environment hardnet):

python extract_descriptors_hardnet.py

For SOSNet (environment hardnet):

python extract_descriptors_sosnet.py

For L2Net (environment hardnet):

python extract_descriptors_l2net.py

The Log-Polar Descriptor (environment hardnet) requires access to the original images. For the log-polar models, use:

python extract_descriptors_logpolar.py --config_file=third_party/log_polar_descriptors/configs/init_one_example_ptn_96.yml --method_name=sift8k_8000_logpolar96

and for the cartesian models, use:

python extract_descriptors_logpolar.py --config_file=third_party/log_polar_descriptors/configs/init_one_example_stn_16.yml --method_name=sift8k_8000_cartesian16

For Geodesc (environment geodesc):

wget http://home.cse.ust.hk/~zluoag/data/geodesc.pb -O third_party/geodesc/model/geodesc.pb
python extract_descriptors_geodesc.py

Check the files for more options.

End-to-end methods

Superpoint

Use environment hardnet. Keypoints are sorted by score and only the top num_kp are kept. You can extract features with default parameters with the following:

python third_party/superpoint_forked/superpoint.py --cuda --num_kp=2048 --method_name=superpoint_default_2048
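
The selection rule described above (sort by score, keep the top num_kp) amounts to the following. This is an illustrative sketch in plain Python, not the code in superpoint.py; the function name and the (x, y) tuple layout are assumptions.

```python
def keep_top_keypoints(keypoints, scores, num_kp):
    """Return the num_kp keypoints with the highest scores, best first."""
    if len(keypoints) != len(scores):
        raise ValueError("keypoints and scores must have the same length")
    # Indices ordered by decreasing score; sorted() is stable, so ties keep input order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [keypoints[i] for i in order[:num_kp]]
```

If an image yields fewer than num_kp detections, all of them are returned.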

You can also lower the detection threshold to extract more features, and resize the images to a fixed size (on the largest dimension), e.g.:

python third_party/superpoint_forked/superpoint.py --cuda --num_kp=8000 --conf_thresh=0.0001 --nms_dist=2 --resize_image_to=1024 --method_name=superpoint_8k_resize1024_nms2
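
Resizing on the largest dimension means that the longer side is set to the target and the shorter side is scaled by the same factor. A minimal sketch of that arithmetic follows; the helper name is hypothetical and the rounding used by the actual script may differ.

```python
def resize_to_largest_side(width, height, target):
    """Scale (width, height) so that the longer side equals target."""
    scale = float(target) / max(width, height)
    # Round to whole pixels and never let a side collapse to zero.
    new_width = max(1, int(round(width * scale)))
    new_height = max(1, int(round(height * scale)))
    return new_width, new_height
```

For example, a 1600x1200 image resized with a target of 1024 would become 1024x768 under this rule.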

D2-Net

Use environment hardnet. Following D2-Net's settings, you can generate text lists of the images with:

python generate_image_lists.py

Download the weights (use this set, as the default has some overlap with our test subset):

mkdir third_party/d2net/models
wget https://dsmn.ml/files/d2-net/d2_tf_no_phototourism.pth -O third_party/d2net/models/d2_tf_no_phototourism.pth

You can then extract single-scale D2-Net features with:

python extract_d2net.py --num_kp=8000 --method_name=d2net-default_8000

and multi-scale D2-Net features (add the --cpu flag if your GPU runs out of memory) with:

python extract_d2net.py --num_kp=8000 --multiscale --method_name=d2net-multiscale_8000

(If the multi-scale variant crashes, please check this.)

ContextDesc

Use environment hardnet and download the model weights:

mkdir third_party/contextdesc/pretrained
wget https://research.altizure.com/data/contextdesc_models/contextdesc_pp.tar -O third_party/contextdesc/pretrained/contextdesc_pp.tar
wget https://research.altizure.com/data/contextdesc_models/retrieval_model.tar -O third_party/contextdesc/pretrained/retrieval_model.tar
wget https://research.altizure.com/data/contextdesc_models/contextdesc_pp_upright.tar -O third_party/contextdesc/pretrained/contextdesc_pp_upright.tar
tar -C third_party/contextdesc/pretrained/ -xf third_party/contextdesc/pretrained/contextdesc_pp.tar
tar -C third_party/contextdesc/pretrained/ -xf third_party/contextdesc/pretrained/contextdesc_pp_upright.tar
tar -C third_party/contextdesc/pretrained/ -xf third_party/contextdesc/pretrained/retrieval_model.tar
rm third_party/contextdesc/pretrained/contextdesc_pp.tar
rm third_party/contextdesc/pretrained/contextdesc_pp_upright.tar
rm third_party/contextdesc/pretrained/retrieval_model.tar
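
The same download, extract, and delete steps can also be scripted. The sketch below uses only the Python standard library and is an alternative to the shell commands above, not something shipped with the repository. The function names are hypothetical, and the URLs and folder are copied from the commands above.

```python
import os
import tarfile
import urllib.request

BASE_URL = "https://research.altizure.com/data/contextdesc_models/"
ARCHIVES = ("contextdesc_pp.tar", "retrieval_model.tar", "contextdesc_pp_upright.tar")
DEST_DIR = "third_party/contextdesc/pretrained"


def unpack_and_remove(archive_path, dest_dir):
    """Extract a tar archive into dest_dir, then delete the archive."""
    # Only unpack archives from sources you trust.
    with tarfile.open(archive_path) as tar:
        tar.extractall(dest_dir)
    os.remove(archive_path)


def fetch_and_unpack(url, dest_dir):
    """Download url into dest_dir, unpack it there and remove the archive."""
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, archive_path)
    unpack_and_remove(archive_path, dest_dir)


def download_contextdesc_models():
    """Fetch the three ContextDesc archives listed in the commands above."""
    for name in ARCHIVES:
        fetch_and_unpack(BASE_URL + name, DEST_DIR)
```

Calling download_contextdesc_models() from the repository root would mirror the shell sequence.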

Generate the .yaml file for ContextDesc:

python generate_yaml.py --num_keypoints=8000

Extract ContextDesc:

python third_party/contextdesc/evaluations.py --config yaml/imw-2020.yaml

You may delete the tmp folder after extracting the features:

rm -rf ../benchmark-features/tmp_contextdesc

DELF

You can install DELF from the tensorflow models repository, following these instructions.

You have to download the model:

mkdir third_party/tensorflow_models/research/delf/delf/python/examples/parameters/
wget http://storage.googleapis.com/delf/delf_gld_20190411.tar.gz -O third_party/tensorflow_models/research/delf/delf/python/examples/parameters/delf_gld_20190411.tar.gz
tar -C third_party/tensorflow_models/research/delf/delf/python/examples/parameters/ -xvf third_party/tensorflow_models/research/delf/delf/python/examples/parameters/delf_gld_20190411.tar.gz

and add the folder third_party/tensorflow_models/research to $PYTHONPATH. See run_delf.py for usage.

LF-Net

Use environment lfnet and download the model weights:

mkdir third_party/lfnet/release
wget https://cs.ubc.ca/research/kmyi_data/files/2018/lf-net/lfnet-norotaug.tar.gz -O third_party/lfnet/release/lfnet-norotaug.tar.gz
tar -C third_party/lfnet/release/ -xf third_party/lfnet/release/lfnet-norotaug.tar.gz

Refer to extract_lfnet.py for more options. To extract LF-Net features with the default 2K keypoints and without resizing the images, run:

python extract_lfnet.py --out_dir=../benchmark-features/lfnet

R2D2

Use the environment r2d2-python-3.6 (requires 3.6 for f-strings). For options, please see the script. The authors provide three pre-trained models which can be used with:

python extract_r2d2.py --model=third_party/r2d2/models/r2d2_WAF_N16.pt --num_keypoints=8000 --save_path=../benchmark-features/r2d2-waf-n16-8k
python extract_r2d2.py --model=third_party/r2d2/models/r2d2_WASF_N16.pt --num_keypoints=8000 --save_path=../benchmark-features/r2d2-wasf-n16-8k
python extract_r2d2.py --model=third_party/r2d2/models/r2d2_WASF_N8_big.pt --num_keypoints=8000 --save_path=../benchmark-features/r2d2-wasf-n8-big-8k
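
The three commands differ only in the model file and the output folder, so they can be generated in a loop. The sketch below only builds the argument lists and does not run anything. The flags and paths are copied from the commands above. The helper name is hypothetical, and the rule for deriving the output folder (lower-case model name with underscores replaced by hyphens, plus an -8k suffix) is an assumption inferred from those three examples.

```python
MODELS = ("r2d2_WAF_N16", "r2d2_WASF_N16", "r2d2_WASF_N8_big")


def r2d2_command(model_name):
    """Build the extract_r2d2.py argument list for one pre-trained model."""
    # Assumed naming rule: r2d2_WASF_N8_big -> r2d2-wasf-n8-big-8k
    tag = model_name.lower().replace("_", "-")
    return [
        "python",
        "extract_r2d2.py",
        "--model=third_party/r2d2/models/{}.pt".format(model_name),
        "--num_keypoints=8000",
        "--save_path=../benchmark-features/{}-8k".format(tag),
    ]


if __name__ == "__main__":
    for name in MODELS:
        print(" ".join(r2d2_command(name)))
```

Each list can be passed to subprocess.run(cmd, check=True) from the repository root with the R2D2 environment active.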

VLFeat features (via Matlab)

Matlab-based features are in a separate repository. You can run:

./run_vlfeat_alone.sh
./run_vlfeat_with_affnet_and_hardnet.sh