Author: Zhixiang Min
Contact: zmin1@stevens.edu
VOLDOR-SLAM is a real-time dense-indirect SLAM system takes dense optical flows as input that supports monocular, stereo and RGB-D video sequence. The system logic is written in native python, which is friendly to modify and get started.
VOLDOR-SLAM: For the times when feature-based or direct methods are not good enough
Zhixiang Min, Enrique Dunn
ICRA 2021 [paper] [video]
VOLDOR: Visual Odometry from Log-logistic Dense Optical flow Residuals
Zhixiang Min, Yiding Yang, Enrique Dunn
CVPR 2020 [paper] [video]
Our system is built with cuda, cython and python. We will support the compatibility under the following configurations:
Windows 10 + Visual Studio 2017 / Ubuntu 18.04
CUDA >= 9.0
Python 3.6.X
OpenCV 3.4.X
Ceres 2.0 (Optional for mapping)
pyDBoW3 (Optional for loop closure)
- Install the python package dependencies via pip.
cd slam_py/install pip install -r .\requirements.txt
- For windows users, we provide prebuilt binaries. Download and copy the binaries files into the
demo
folder. You may also need install VC++ runtime library.
-
OpenCV (Required)
- Download and setup OpenCV 3.4.X.
- Copy
opencv_worldxxx.dll
from the opencv workspace to thedemo
folder.
-
PyOpenGL (Optional for viewer)
- PyOpenGL installed from pip requires additional
dll
file to be configured inPATH
. If you meetOpenGL.error.NullFunctionError
while launching the viewer, try install PyOpenGL from wheels PyOpenGL-wheels.
- PyOpenGL installed from pip requires additional
-
Ceres (Optional for mapping)
- Download and follow the instructions to build Ceres with SuiteSparse. (Painful)
- Copy
ceres.dll
,glog.dll
,libblas.dll
,libgcc_s_sjlj-1.dll
,libgfortran-3.dll
,liblapack.dll
,libquadmath-0.dll
to thedemo
folder. They are all ceres dependencies. You can find them under Ceres and SuiteSparse workspace.
-
pyDBoW3 (Optional for loop closure. Prerequisite building full SLAM)
- Download and follow the instructions to build pyDBoW3. The library depends on boost and opencv. You will need to provide/change the library path in the build script.
- You will get
pyDBoW3.pyd
when you succeeded. Copy it to thedemo
folder. - The vocabulary file
ORBvoc.bin
already included in thedemo
folder.
-
Build VO Only:
- Open and edit
slam_py/setup_win_vo.py
. You will need to provide/changeopencv_include_dir
,opencv_lib_dir
andopencv_lib_name
according to your opencv installation. - Run the following code to build.
cd slam_py/install python setup_win_vo.py build_ext -i
- If succeed,
gpu-kernels.dll
andpyvoldor_vo.xxx.pyd
will appear in theinstall
folder. Copy them to thedemo
folder.
- Open and edit
-
Build Full SLAM Pipeline:
- Open and edit
slam_py/setup_win_full.py
. You will need to provide/changeopencv_include_dir
,opencv_lib_dir
,opencv_lib_name
,ceres_include_dirs
andceres_lib_dirs
according to your opencv and ceres installations. - Run the following code to build. (No need building VO module first)
cd slam_py/install python setup_win_full.py build_ext -i
- If succeed,
gpu-kernels.dll
andpyvoldor_full.xxx.pyd
will appear in theinstall
folder. Copy them to thedemo
folder.
- Open and edit
-
OpenCV (Required)
- If using
sudo apt install libopencv-dev
, be aware of its version that we only tested under 3.4.X, though we found most versions work fine. - If build from source, remember running
sudo make install
.
- If using
-
PyOpenGL (Optional for viewer)
sudo apt install python-opengl
-
Ceres (Optional for mapping)
- Build Ceres with SuiteSparse follow the instructions. Remember running
sudo make install
.
- Build Ceres with SuiteSparse follow the instructions. Remember running
-
pyDBoW3 (Optional for loop closure. Prerequisite building full SLAM)
- Download and follow the instructions to build pyDBoW3.
- You will get
pyDBoW3.so
when you succeed. Copy it to thedemo
folder. - The vocabulary file
ORBvoc.bin
already included in thedemo
folder.
-
Build VO Only:
- Run the following code to build.
cd slam_py/install python setup_linux_vo.py build_ext -i
- If succeed,
libgpu-kernels.so
andpyvoldor_vo.xxx.so
will appear in theinstall
folder. Copy them to thedemo
folder.
- Run the following code to build.
-
Build Full SLAM Pipeline:
- Run the following code to build. (No need building VO module first)
cd slam_py/install python setup_linux_full.py build_ext -i
- If succeed,
libgpu-kernels.so
andpyvoldor_full.xxx.so
will appear in theinstall
folder. Copy them to thedemo
folder.
- Run the following code to build. (No need building VO module first)
Our method takes three inputs. You may download demo data to run and play.
Optical flows are REQUIRED for visual odometry. We recommend MaskFlowNet or PWC-Net as the optical flow estimator.
Supported format:.flo
Disparity maps are OPTIONAL input only present when stereo pairs or depth sensors are available. For stereo pairs, we recommend using the same optical flow estimator running from left to right. For depth sensors, you need to set a virtual baseline value to convert the depth to disparity as input.
Supported format:.flo
,.png (x256 gray-16bit)
RGB images are OPTIONAL input for adding photo-consistency to enhance the frame alignment.
Supported format:all image formats supported by opencv
cd demo
python demo.py --help
python demo.py
--fx 320 --fy 320 --cx 320 --cy 240 --bf 80
--flow_dir 'path_to_flow'
--img_dir 'path_to_img'
--disp_dir 'path_to_disp'
--mode stereo
--enable_mapping
--enable_loop_closure './ORBvoc.bin'
--save_poses './poses.txt'
--save_depths './depths'
More details can be found in the comments of the demo
script. In demo data, we also provided bash commands in run_demo.txt
under each data instance folder.
Viewer Control Manual
> q -> Exit > m -> Save point cloud to .ply > r -> Reset view > h -> Hide cameras / links > f -> Follow the current camera > w/s -> Increase/Decrease scene pixel size > d/a -> Increase/Decrease scene pixel density > x/z -> Increase/Decrease scene pixel depth range > Mouse scroll up/down -> Zoom in/out > Mouse drag with left down -> Scene rotation > Mouse drag with right down -> Scene translation
Viewer Range. When running with monocular capture, if the world scale is unfortunately initialized too large, the viewer may not display many pixels, which can be adjusted using x/z keys.
How to pick basefocal. basefocal
refers to baseline (meter)
times focal (px)
. It decides the scene scale. If the input disparity maps are from rectified stereo, the baseline
of stereo pairs should be used. In the case of depth camera or monocular, basefocal
can be used as a hyper-parameter tuning the confidence on the input depth. A proper basefocal
value should let the displayed keyframe depth map (tmpkf_depth
) shows a porper variance (neither pure white or black).
Input Frame Rate. Our framework favors a modest baseline between frames to let the optical flows being informative. For high frame rate videos, consider downsampling the frame rate based on appearance changing. (E.g. KITTI/TartanAir @ 10Hz, TUM-RGBD @ 3Hz.)
I/O BottleNeck. If you observe choppy GPU load or the latest map on the viewer colored gray, it probably means the disk I/O is bottlenecking the performance. This will affect the accuracy by blocking the local mapping while VO keeps running. Consider move your data to SSD disk or you may restart the software a few times that your disk will cache more data.
Parameter Tuning. Usually default parameters work for all. You may try loosing the loop closure thresholds to detect more loop closures. For very challenging dataset, if the full pipeline is not stable, try disable local mapping. For tuning VO parameters, pass parameters using slam.voldor_user_config
. All available parameters can be found in voldor/config.h
with descriptions. For SLAM parameters, check the __init__
method of VOLDOR_SLAM
.
What is mono-scaled mode? Mono-scaled mode is to use the disparity map only for correcting the world scale.
The software is released under the attached LICENSE
file.
Parts of the software rely on the following open source projects:
LambdaTwist, OpenCV, pyDBoW3, DBoW3, Ceres, SuiteSparse, Eigen, svd3, PyOpenGL