Stereo matching based on convolutional neural networks has greatly improved in accuracy, but most methods still cannot meet real-time requirements. We propose a real-time stereo matching method that works in a coarse-to-fine fashion: it initializes the disparity map at a low-resolution level and gradually restores its spatial resolution. The algorithm uses a lightweight backbone network to extract multi-scale features and inversely fuses them to improve their robustness while preserving real-time performance. A multi-branch fusion (MBF) module is proposed to progressively refine the disparity map. The different modes of different regions are automatically clustered and processed separately, and the final result is then combined according to the cluster weights, so that regions with different characteristics are handled better.
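
The coarse-to-fine refinement and the cluster-weighted fusion described above can be illustrated with a small sketch in plain Python. This is not the MBFnet implementation: the function names (`upsample2x_nearest`, `fuse_branches`, `refine`) are hypothetical, and the sketch assumes that each branch predicts a residual disparity, that cluster weights come from a per-pixel softmax over branch scores, and that upsampling is nearest-neighbour. In the real network these quantities would be produced by learned convolutional layers.

```python
import math


def upsample2x_nearest(disp):
    """Nearest-neighbour 2x upsampling of a 2D disparity map (list of rows).

    Disparity values are doubled because pixel offsets scale with resolution.
    """
    out = []
    for row in disp:
        wide = []
        for d in row:
            wide.extend([2.0 * d, 2.0 * d])
        out.append(wide)
        out.append(list(wide))
    return out


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(branch_residuals, branch_scores):
    """Combine per-branch residuals with per-pixel soft cluster weights.

    Both arguments are indexed as [branch][y][x]. The softmax weighting is an
    assumption made for this sketch.
    """
    num_branches = len(branch_residuals)
    height = len(branch_residuals[0])
    width = len(branch_residuals[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weights = softmax([branch_scores[b][y][x] for b in range(num_branches)])
            fused[y][x] = sum(
                weights[b] * branch_residuals[b][y][x] for b in range(num_branches)
            )
    return fused


def refine(coarse_disp, branch_residuals, branch_scores):
    """One refinement step: upsample the coarse map, then add the fused residual."""
    up = upsample2x_nearest(coarse_disp)
    res = fuse_branches(branch_residuals, branch_scores)
    return [[up[y][x] + res[y][x] for x in range(len(up[0]))] for y in range(len(up))]
```

Calling `refine` once per pyramid level, from the lowest resolution upward, mirrors the gradual restoration of spatial resolution. Soft weights let a pixel near a region boundary blend the outputs of several branches instead of committing to one.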
For usage of the KITTI and SceneFlow datasets, see stereo/dataloader/README.md.
For training, refer to demos/train_sfk_all.sh and demos/train_sfk.sh.
For generating submission results, refer to demos/submission_all.sh and demos/submission.sh.
Results on the KITTI 2015 leaderboard
Method | D1-all (All) | D1-all (Noc) | Runtime (s) | Environment |
---|---|---|---|---|
DeepPruner(fast) | 2.59 % | 2.35 % | 0.06 | Titan XP (Caffe) |
MBFnet (ours) | 2.96 % | 2.54 % | 0.05 | 2070 (PyTorch) |
DispNetC | 4.32 % | 4.05 % | 0.06 | Titan XP (Caffe) |
MADnet | 4.66 % | 4.27 % | 0.02 | 1080Ti (TensorFlow) |
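
For readers unfamiliar with the D1 columns: on KITTI 2015 a pixel counts as an outlier when its disparity error exceeds both 3 px and 5 % of the ground-truth disparity, and D1 is the percentage of such pixels among those with valid ground truth. "All" evaluates every valid pixel, while "Noc" restricts evaluation to non-occluded pixels. The sketch below is a minimal illustration of that rule, not the official evaluation code; the function name `d1_error` is hypothetical, and it assumes flat sequences in which a ground-truth value of 0 or less marks an invalid pixel.

```python
def d1_error(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Return the percentage of valid pixels that are disparity outliers.

    pred, gt: flat sequences of predicted and ground-truth disparities.
    Pixels with gt <= 0 are treated as invalid and skipped (assumption).
    """
    bad = 0
    valid = 0
    for p, g in zip(pred, gt):
        if g <= 0:
            continue  # no ground truth at this pixel
        valid += 1
        err = abs(p - g)
        # Outlier only if the error is large in absolute AND relative terms.
        if err > abs_thresh and err > rel_thresh * g:
            bad += 1
    return 100.0 * bad / valid if valid else 0.0
```

To approximate the "Noc" variant with this sketch, set the ground truth of occluded pixels to 0 before calling the function so that they are skipped.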
Discussions and questions are welcome!