Zhexi Peng, Tianjia Shao, Liu Yong, Jingke Zhou, Yin Yang, Jingdong Wang, Kun Zhou
This repository contains the official authors implementation associated with the paper "RTG-SLAM: Real-time 3D Reconstruction at Scale Using Gaussian Splatting", which can be found here.
Abstract: We present Real-time Gaussian SLAM (RTG-SLAM), a real-time 3D reconstruction system with an RGBD camera for large-scale environments using Gaussian splatting. The system features a compact Gaussian representation and a highly efficient on-the-fly Gaussian optimization scheme. We force each Gaussian to be either opaque or nearly transparent, with the opaque ones fitting the surface and dominant colors, and transparent ones fitting residual colors. By rendering depth in a different way from color rendering, we let a single opaque Gaussian well fit a local surface region without the need of multiple overlapping Gaussians, hence largely reducing the memory and computation cost. For on-the-fly Gaussian optimization, we explicitly add Gaussians for three types of pixels per frame: newly observed, with large color errors, and with large depth errors. We also categorize all Gaussians into stable and unstable ones, where the stable Gaussians are expected to well fit previously observed RGBD images and otherwise unstable. We only optimize the unstable Gaussians and only render the pixels occupied by unstable Gaussians. In this way, both the number of Gaussians to be optimized and pixels to be rendered are largely reduced, and the optimization can be done in real time. We show real-time reconstructions of a variety of large scenes. Compared with the state-of-the-art NeRF-based RGBD SLAM, our system achieves comparable high-quality reconstruction but with around twice the speed and half the memory cost, and shows superior performance in the realism of novel view synthesis and camera tracking accuracy.
git clone --recursive https://github.com/MisEty/RTG-SLAM.git
RTG-SLAM has been tested on python 3.9, CUDA=11.7, pytorch=1.13.1. The simplest way to install all dependences is to use anaconda and pip in the following steps:
conda env create -f environment.yaml
We have made some changes on ORB-SLAM2 to work with our ICP front-end and you can run this script to install pangolin, opencv, orbslam and boost-python binding.
bash build_orb.sh
If you encounted the problem during install pangolin:
xxx/Pangolin/src/video/drivers/ffmpeg.cpp: In function ‘std::__cxx11::string pangolin::FfmpegFmtToString(AVPixelFormat)’:
xxx/Pangolin/src/video/drivers/ffmpeg.cpp:41:41: error: ‘AV_PIX_FMT_XVMC_MPEG2_MC’ was not declared in this scope
You can follow this solution.
For real data, backend optimization based on ORB-SLAM2 is crucial. Therefore, you need to install the python binding for ORB-SLAM2 according to the steps. We have modified some code based on ORB_SLAM2-PythonBindings . If you encounter any problem related to compilation, you can refer to ORB_SLAM2-PythonBindings to find solutions. Our ICP front-end works well when the depth is accurate so if you only want to test on synthetic dataset like Replica, you don't need to install ORB-SLAM2 python binding.
cd thirdParty/pybind/examples
python orbslam_rgbd_tum.py # please set voc_path, association_path ...
python eval_ate.py path_to_groundtruth.txt trajectory.txt --plot PLOT --verbose
If the code runs without any error and the trajetory is corret, you can move on to the next step.
bash scripts/download_replica.sh
bash scripts/download_tum.sh
Please follow ScanNet++ to download dataset. And run
python scripts/parse_scannetpp.py --data_path scannetpp_path/download/data/8b5caf3398 --output_path data/ScanNetpp/8b5caf3398
You can donwload our dataset from google cloud.
|-- data
|-- Ours
|-- hotel
|-- Replica
|-- office0
|-- results
|-- office0.ply
|-- traj.txt
|-- cam_params.json
|-- TUM_RGBD
|-- rgbd_dataset_freiburg1_desk
|-- ScanNetpp
|-- 8b5caf3398
|-- color
|-- depth
|-- intrinsic
|-- pose
|-- mesh_aligned_cull.ply
# Single Process: Recommended, More Stable
python slam.py --config ./configs/replica/office0.yaml
# Multi Process:
python slam_mp.py --config ./configs/replica/office0.yaml
# Single Process: Recommended, More Stable
python slam.py --config ./configs/tum/fr1_desk.yaml
# Multi Process:
python slam_mp.py --config ./configs/tum/fr1_desk.yaml
# Single Process: Recommended, More Stable
python slam.py --config ./configs/scannetpp/8b5caf3398.yaml
# Multi Process:
python slam_mp.py --config ./configs/scannetpp/8b5caf3398.yaml
# Single Process: Recommended, More Stable
python slam.py --config ./configs/ours/hotel.yaml
# Multi Process:
python slam_mp.py --config ./configs/ours/hotel.yaml
You can run metric.py to evaluate the rendering quality on Replica, ScanNet++ and Ours dataset and calculate geometry accuracy on Replica and ScanNet++. There will be a csv result file in model path. The tracking accuracy is estimated right after running slam.py. The ate result is in model_path/save_traj.
The script selects all images when computing psnr, lpips and ssim. Our method adds Gaussian according to the depth so the performance may decrease in the presence of significant depth noise or invalid depth (such as transparent materials, highly reflective materials, etc.). For fairness, when evaluating novel view synthesis on ScanNet++ in the paper, we manually removed images with large invalid depth areas.
python metric.py --config config_path \
# eval the first k frames
----load_frame k \
# save pictures
--save_pic
python metric.py --config ./configs/replica/office0.yaml
python metric.py --config ./configs/tum/fr1_desk.yaml
python metric.py --config ./configs/scannetpp/8b5caf3398.yaml # all novel view images
python metric.py --config ./configs/ours/hotel.yaml
@article{peng2024rtgslam,
author = {Zhexi Peng and Tianjia Shao and Liu Yong and Jingke Zhou and Yin Yang and Jingdong Wang and Kun Zhou},
title = {RTG-SLAM: Real-time 3D Reconstruction at Scale using Gaussian Splatting},
booktitle = {ACM SIGGRAPH Conference Proceedings, Denver, CO, United States, July 28 - August 1, 2024},
year = {2024},
}
This project is built upon 3DGS. The ORB-SLAM backend is based on ORB-SLAM2. The ORB-SLAM2 Python Binding is based on ORB_SLAM2-PythonBindings. The evaluation script is adopted from NICE-SLAM and Point-SLAM. We thank all the authors for their great work.