A Simple Stereo SLAM System

Chinese version

This is a simple stereo SLAM system with a deep-learning-based loop closure module (Seq-CALC). As a SLAM beginner, I built this system mainly to practice my coding and engineering skills by writing a full SLAM system myself.

I chose stereo cameras because they are easier to work with: there is no complicated initialization and no unknown scale to deal with. The structure of the system is simple and clear, and I did not apply much detailed optimization, so its performance is not outstanding. However, I believe such a simple structure is friendly for a SLAM beginner who wants to study the skeleton of a full SLAM system. Studying, for example, ORB-SLAM2 is definitely tough for a beginner: it is a complex system with more than 10 thousand lines of code and a lot of tricks to improve its performance.

I would be truly glad if this project helps you.

More details and performance results of the system can be found here.

Related References

Chapter 13, Visual SLAM: From Theory to Practice

(https://github.com/gaoxiang12/slambook-en): I use the basic framework of the stereo VO from this chapter, along with the main methods in its frontend and backend threads.

ORB-SLAM2

(https://github.com/raulmur/ORB_SLAM2): I use some modified code from ORB-SLAM2 (mainly from ORBextractor.cpp, for ORB feature extraction).

Dependencies

The platform I use is Ubuntu 18.04.

C++11
Boost filesystem
Google Logging Library (glog)
OpenCV: Download and install instructions can be found here. I am using OpenCV 3.4.8.
Eigen: Download and install instructions can be found here. On Ubuntu, you can simply install it with sudo apt-get install libeigen3-dev.
Sophus: Download and install instructions can be found here.
g2o: Download and install instructions can be found here.
Caffe: Download and install instructions can be found here. It is a dependency of the Seq-CALC library. Please make sure Caffe is installed in ~/caffe; otherwise, change its path in CMakeLists.txt.
CUDA: Highly recommended for faster loop detection, but the CPU-only version of Caffe also works.
Pangolin: Download and install instructions can be found here.
Seq-CALC: Seq-CALC is a loop detection library based on CALC, extended with sequence matching. It is maintained as a submodule in the Thirdparty/SeqCALC folder.

TIPS: There may be some dependencies I missed. Please open an issue if you run into any problems.

Build

Clone the repository:

git clone https://github.com/Mingrui-Yu/A-Simple-Stereo-SLAM-System.git

Clone the Thirdparty submodule (run these commands from inside the repository):

cd A-Simple-Stereo-SLAM-System
git submodule init
git submodule update

Build:

chmod +x build.sh
./build.sh

This will create libmyslam.so in the /lib folder and the executables in the /bin folder.

Run the example

So far I have only written an example that runs on KITTI Stereo. The main file is at /app/run_kitti_stereo.

First, download a sequence from the KITTI database, or from the Baidu Netdisk link shared by Pao Pao Robot in China.

To run the system on KITTI Stereo sequence 00, run the following from the repository root:

./bin/run_kitti_stereo  config/stereo/gray/KITTI00-02.yaml  PATH_TO_DATASET_FOLDER/dataset/sequences/00

where KITTI00-02.yaml is the corresponding configuration file (camera parameters and other parameters). It follows the style of ORB-SLAM2.

In addition, some parameters in the configuration file are for viewing:

Camera.fps: controls the frame rate of the system
LoopClosing.bShowResult: whether to show the match result and reprojection result in Loop Closing
Viewer.bShow: whether to show the frame and map in real time while the system is running

Here is the resulting keyframe trajectory on KITTI 00.

The system can run at around 50 frames per second when the viewer is closed. If the images do not need to be undistorted (as in the KITTI database), it can reach around 100 frames per second. (Measured on a laptop with an i5-8265U (1.60GHz × 8) and no GPU.)

Brief Introduction

The system contains three threads:

Frontend thread
Backend thread
LoopClosing thread

The Frontend tracks the camera motion based on feature points and LK optical flow. If the number of tracked keypoints falls below a threshold, it detects new features and creates a keyframe. Mappoints are created by triangulating the matched feature points in the left and right images.
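To illustrate the triangulation step, here is a minimal sketch for the special case of a rectified stereo pair, where depth follows directly from the disparity (z = fx * baseline / disparity). This is a simplification and not necessarily how the project triangulates; the names StereoCamera, TriangulateRectified, and min_disparity are hypothetical and do not come from the code base.

```cpp
#include <array>
#include <cassert>

// Hypothetical intrinsics of a rectified stereo pair (not the project's types).
struct StereoCamera {
    double fx, fy, cx, cy;  // pinhole intrinsics in pixels
    double baseline;        // distance between the two cameras in meters
};

// Triangulates one left/right match into a 3D point in the left camera frame.
// Assumes rectified images: the match lies on the same image row, and the
// disparity is u_left - u_right. Returns false when the disparity is too
// small for a reliable depth; in that case *point is left untouched.
bool TriangulateRectified(const StereoCamera& cam, double u_left, double v_left,
                          double u_right, std::array<double, 3>* point,
                          double min_disparity = 0.5) {
    assert(point != nullptr);
    const double disparity = u_left - u_right;
    if (disparity < min_disparity) return false;

    const double z = cam.fx * cam.baseline / disparity;  // depth from disparity
    (*point)[0] = (u_left - cam.cx) * z / cam.fx;        // back-project x
    (*point)[1] = (v_left - cam.cy) * z / cam.fy;        // back-project y
    (*point)[2] = z;
    return true;
}
```

Rejecting small disparities keeps very distant points, whose depth is poorly constrained, out of the map. The cutoff value here is an arbitrary illustration.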

The Backend maintains a global map and a local active map. The active map works like a sliding window, containing a fixed number of keyframes and the mappoints they observe. The active map is optimized in the Backend.
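The sliding-window idea can be sketched as follows. This is only an illustration under stated assumptions: keyframes are reduced to their ids, mappoints are omitted, and the oldest keyframe is evicted when the window is full. The project may pick the keyframe to remove by a different criterion. SlidingWindowMap and its methods are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical, simplified map that stores keyframe ids only.
class SlidingWindowMap {
public:
    explicit SlidingWindowMap(std::size_t max_active) : max_active_(max_active) {
        assert(max_active > 0);
    }

    // Adds a keyframe to both the global map and the active window.
    // When the window is full, the oldest active keyframe is dropped from the
    // window (assumed policy); it still remains in the global map.
    void InsertKeyFrame(unsigned long id) {
        global_.push_back(id);
        active_.push_back(id);
        if (active_.size() > max_active_) active_.pop_front();
    }

    const std::deque<unsigned long>& active() const { return active_; }
    std::size_t global_size() const { return global_.size(); }

private:
    std::size_t max_active_;            // fixed size of the active window
    std::deque<unsigned long> active_;  // keyframes used for local optimization
    std::vector<unsigned long> global_; // every keyframe ever inserted
};
```

Keeping the window at a fixed size bounds the cost of each local optimization, while the global map keeps the full history that loop closing needs.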

The LoopClosing thread first tries to detect a candidate loop keyframe for the current keyframe using DeepLCD. If it succeeds, it matches the keypoints of the candidate keyframe and the current keyframe, and uses the matches to compute the corrected pose of the current keyframe with PnP and g2o optimization. If the number of inliers is higher than a threshold, the loop detection is considered successful and loop correction is applied: first, the keyframe poses and mappoint positions in the active map are corrected; second, a pose graph optimization of the global map is performed.

There must be some mistakes in the project, as I am a newcomer to visual SLAM. Please open an issue if you find any problems; I will be deeply grateful for your corrections and advice.
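The inlier check that decides whether a loop is accepted can be sketched like this. It is a minimal illustration, assuming the matched mappoints have already been transformed into the current keyframe's camera frame with the estimated pose, and that a match counts as an inlier when its squared reprojection error is below a cutoff. Pinhole, CountInliers, AcceptLoop, and both thresholds are hypothetical, not the project's API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point3 { double x, y, z; };          // mappoint in the camera frame
struct Pixel { double u, v; };              // observed keypoint position
struct Pinhole { double fx, fy, cx, cy; };  // intrinsics in pixels

// Projects a point given in the camera frame to pixel coordinates.
Pixel Project(const Pinhole& cam, const Point3& p) {
    return {cam.fx * p.x / p.z + cam.cx, cam.fy * p.y / p.z + cam.cy};
}

// Counts matches whose squared reprojection error (pixels^2) is below
// max_error_sq. Points at or behind the camera plane are never inliers.
std::size_t CountInliers(const Pinhole& cam, const std::vector<Point3>& points,
                         const std::vector<Pixel>& observations,
                         double max_error_sq) {
    assert(points.size() == observations.size());
    std::size_t inliers = 0;
    for (std::size_t i = 0; i < points.size(); ++i) {
        if (points[i].z <= 0.0) continue;
        const Pixel proj = Project(cam, points[i]);
        const double du = proj.u - observations[i].u;
        const double dv = proj.v - observations[i].v;
        if (du * du + dv * dv < max_error_sq) ++inliers;
    }
    return inliers;
}

// The loop is accepted only if the inlier count is higher than the threshold.
bool AcceptLoop(std::size_t inliers, std::size_t min_inliers) {
    return inliers > min_inliers;
}
```

Requiring enough geometric inliers guards against false positives from the appearance-based detector, since a wrong candidate rarely yields many matches that agree with a single pose.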
