FDANET

A Deep Feature Aggregation Network for Accurate Indoor Camera Localization

This is the PyTorch implementation of our RA-L 2022 / ICRA 2022 paper "A Deep Feature Aggregation Network for Accurate Indoor Camera Localization".

Installation

  • To run our model, set up a Python 3 environment and install the dependencies from requirement.txt:
pip3 install -r requirement.txt 
  • To compute poses with the RANSAC-based PnP algorithm, build the Cython module (a minimal usage sketch follows the build commands):
cd ./pnpransac
rm -rf build
python setup.py build_ext --inplace
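
For reference, the sketch below shows what the RANSAC-based PnP step looks like, using OpenCV's cv2.solvePnPRansac as a stand-in for the repo's compiled pnpransac module; the correspondences, intrinsics, and thresholds are placeholder assumptions, not values taken from main.py.

# RANSAC-PnP sketch with OpenCV standing in for the compiled pnpransac module.
# The 2D-3D correspondences here are synthetic placeholders.
import numpy as np
import cv2

pts_3d = np.random.rand(100, 3)              # predicted scene coordinates (m)
pts_2d = np.random.rand(100, 2) * 480.0      # matching pixel locations

# Nominal 7-Scenes Kinect intrinsics (focal 585, principal point (320, 240)).
K = np.array([[585.0, 0.0, 320.0],
              [0.0, 585.0, 240.0],
              [0.0, 0.0, 1.0]])

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    pts_3d, pts_2d, K, distCoeffs=None,
    iterationsCount=256, reprojectionError=8.0)
if ok:
    R, _ = cv2.Rodrigues(rvec)               # world-to-camera rotation
    print("inliers:", len(inliers), "t:", tvec.ravel())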

Datasets

  • 7-Scenes: Download the dataset from the website (a minimal frame-loading sketch follows this list).
  • 12-Scenes: Download the dataset from the website.
  • Our actual environment: Besides the experiments on the public datasets, we also verify the effectiveness of the proposed method in a real indoor environment. The dataset, which includes three training sequences and one testing sequence, was recorded with a hand-held Kinect RGB-D camera; ground-truth poses were obtained with an implementation of the KinectFusion system. Download the dataset from here.
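
As a reference for the data format, here is a minimal sketch of reading one 7-Scenes frame (frame-XXXXXX.color.png / .depth.png / .pose.txt, where pose.txt holds a 4x4 camera-to-world matrix); the directory layout under --data_path is an assumption.

# Minimal 7-Scenes frame reader; the seq_dir layout below is an assumption.
import numpy as np
from PIL import Image

def load_frame(seq_dir, idx):
    stem = f"{seq_dir}/frame-{idx:06d}"
    color = np.asarray(Image.open(stem + ".color.png"))  # H x W x 3 RGB
    depth = np.asarray(Image.open(stem + ".depth.png"))  # H x W depth in mm (65535 = invalid)
    pose = np.loadtxt(stem + ".pose.txt")                # 4 x 4 camera-to-world
    return color, depth, pose

color, depth, pose = load_frame("./data/7S/chess/seq-01", 0)
print(pose[:3, 3])  # camera position in world coordinates (meters)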

Training and Evaluation

  • Training on the 7-Scenes dataset:
CUDA_VISIBLE_DEVICES=gpu_id python main.py --model fdanet --dataset 7S --scene chess --data_path ./data/ --flag train
  • Training on the 12-Scenes dataset:
CUDA_VISIBLE_DEVICES=gpu_id python main.py --model fdanet --dataset 12S --scene office2/5b --data_path ./data/ --flag train
  • Evaluating on the 7-Scenes dataset:
CUDA_VISIBLE_DEVICES=gpu_id python main.py --model fdanet --dataset 7S --scene chess --data_path ./data/ --flag test --resume model_path
  • Evaluating on the 12-Scenes dataset:
CUDA_VISIBLE_DEVICES=gpu_id python main.py --model fdanet --dataset 12S --scene office2/5b --data_path ./data/ --flag test --resume model_path

Examples:

CUDA_VISIBLE_DEVICES=1 python main.py --model fdanet --dataset 7S --scene chess --data_path ./data/ --flag test --resume /mnt/sda2/xt/7scenes/lap/chess_supervise_PFFM.pth

Pretrained Models

We provide pre-trained models for 7-Scenes (link), 12-Scenes (link), and our actual environment (link). Download them and move them to the corresponding folders. For example, move chess_supervise_PFFM.pth to train_log/7-Scenes/, then pass --resume train_log/7-Scenes/chess_supervise_PFFM.pth.
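
If you want to sanity-check a downloaded checkpoint before passing it to --resume, a minimal PyTorch sketch follows; whether the file is a plain state dict or a dict wrapping one under a "state_dict" key is an assumption about the checkpoint layout.

# Peek inside a downloaded .pth file; the "state_dict" wrapping is an assumption.
import torch

state = torch.load("train_log/7-Scenes/chess_supervise_PFFM.pth", map_location="cpu")
sd = state.get("state_dict", state) if isinstance(state, dict) else state
print(len(sd), "tensors; first key:", next(iter(sd)))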

Results

  • Evaluation results on the 7-Scenes dataset:

[Figure: evaluation results on the 7-Scenes dataset]

  • Evaluation results on the 12-Scenes dataset:

[Figure: evaluation results on the 12-Scenes dataset]

  • Evaluation results in our actual environment: our proposed method achieves an average positional error of 0.032 m and an average rotational error of 1.69°, which is acceptable for camera localization applications (a sketch of how these pose errors are computed follows).

[Figure: evaluation results in our actual environment]
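
For reference, these are the standard camera-pose metrics; below is a minimal sketch of computing them from predicted and ground-truth 4x4 poses (the function and variable names are ours, not the repo's).

# Standard pose-error metrics: Euclidean distance between camera centers and
# geodesic angle between rotations. Names are illustrative, not from the repo.
import numpy as np

def pose_errors(T_pred, T_gt):
    t_err = np.linalg.norm(T_pred[:3, 3] - T_gt[:3, 3])       # meters
    R_rel = T_pred[:3, :3].T @ T_gt[:3, :3]
    cos = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    r_err = np.degrees(np.arccos(cos))                        # degrees
    return t_err, r_err

# Example: identity ground truth vs. a prediction offset by 3 cm.
T_gt = np.eye(4)
T_pred = np.eye(4)
T_pred[0, 3] = 0.03
print(pose_errors(T_pred, T_gt))  # -> (0.03, 0.0)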