This demo captures frames from the main camera and estimates a depth map for each frame with the MiDaS neural network [1, 2].
Press the keys 1 and 2 to switch between the live depth map view and an interactive point cloud view of the last captured frame.
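As a rough illustration of how a depth map can be turned into a point cloud, the sketch below back-projects each pixel through a pinhole camera model using NumPy. It is not taken from the demo itself: the function name `depth_to_point_cloud`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the tiny 2x2 depth map are all assumptions made for illustration. Note also that MiDaS predicts relative inverse depth, so its output would first have to be converted to depth, and the resulting point cloud is only meaningful up to an unknown scale and shift.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) array of XYZ points.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels. These names are illustrative only.
    """
    h, w = depth.shape
    # Pixel coordinate grids: u indexes columns, v indexes rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    # Invert the pinhole projection u = fx * x / z + cx (and likewise for v).
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Hypothetical 2x2 depth map and intrinsics, just to exercise the function.
depth = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
points = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(points.shape)  # one XYZ point per pixel
```

Each row of `points` corresponds to one pixel in row-major order, so per-pixel colors from the captured frame can be attached by reshaping the image the same way.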
[1] René Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, & Vladlen Koltun (2022). Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(3).
[2] René Ranftl, Alexey Bochkovskiy, & Vladlen Koltun (2021). Vision Transformers for Dense Prediction. ICCV.