Python scripts for performing stereo depth estimation using the HITNET model in ONNX.
Stereo depth estimation on the cones images from the Middlebury dataset (https://vision.middlebury.edu/stereo/data/scenes2003/)
- OpenCV, imread-from-url, onnx and onnxruntime. pafy and youtube-dl are also required for YouTube video inference, and the depthai library is needed when using OAK-D boards (https://docs.luxonis.com/projects/api/en/latest/install/)
pip install -r requirements.txt
pip install pafy youtube-dl
The original models were converted to different formats (including .onnx) by PINTO0309. Download the models from his repository and save them into the models folder.
The TensorFlow pretrained model was taken from the original repository.
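Before running the scripts, it can help to confirm that the downloaded files ended up in the models folder and to see which inputs each model declares. The following is a minimal sketch and not part of this repository: `list_onnx_models` and `describe_model` are hypothetical helper names, and the default `models` path is assumed to be relative to the working directory. The second helper relies only on `onnxruntime.InferenceSession` and its `get_inputs()` method.

```python
from pathlib import Path


def list_onnx_models(models_dir="models"):
    """Return every .onnx file under models_dir (searched recursively), sorted by path."""
    root = Path(models_dir)
    if not root.is_dir():
        # Nothing to list if the folder has not been created yet.
        return []
    return sorted(root.rglob("*.onnx"))


def describe_model(model_path):
    """Return a (name, shape) tuple for each input the ONNX model declares."""
    # Imported lazily so that listing files works even without onnxruntime installed.
    import onnxruntime as ort

    session = ort.InferenceSession(
        str(model_path), providers=["CPUExecutionProvider"]
    )
    return [(inp.name, inp.shape) for inp in session.get_inputs()]
```

Calling `describe_model(path)` for each path returned by `list_onnx_models()` prints nothing by itself; wrap it in `print(...)` to inspect the expected input resolution of each downloaded variant.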
- Depthai OAK-D series inference on the host:
python depthai_host_depth_estimation.py
- Image inference:
python image_depth_estimation.py
- Video inference:
python video_depth_estimation.py
- DrivingStereo dataset inference:
python driving_stereo_test.py
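Stereo networks such as HITNET predict a disparity map in pixels rather than metric depth. For a rectified stereo pair, depth follows from the standard relation depth = focal_length * baseline / disparity. The sketch below illustrates that conversion with NumPy. It is an assumption-laden example, not the code used by the scripts above: `disparity_to_depth` is a hypothetical helper, and the focal length and baseline must come from your own camera calibration.

```python
import numpy as np


def disparity_to_depth(disparity, focal_length_px, baseline_m, min_disparity=1e-6):
    """Convert a disparity map (pixels) to depth (meters) for a rectified stereo pair.

    Pixels whose disparity is not above min_disparity are set to 0 (no valid depth).
    """
    disparity = np.asarray(disparity, dtype=np.float32)
    depth = np.zeros_like(disparity)
    # Zero or negative disparity has no finite depth, so only convert valid pixels.
    valid = disparity > min_disparity
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth
```

For example, with a focal length of 1000 px and a baseline of 0.1 m, a disparity of 10 px corresponds to a depth of 10 m, and halving the disparity doubles the depth.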
To run the inference in TensorFlow, see my other repository, HITNET Stereo Depth estimation.
To run the inference in TFLite, see my other repository, TFLite HITNET Stereo Depth estimation.
- Hitnet model: https://github.com/google-research/google-research/tree/master/hitnet
- PINTO0309's model zoo: https://github.com/PINTO0309/PINTO_model_zoo
- PINTO0309's model conversion tool: https://github.com/PINTO0309/openvino2tensorflow
- DrivingStereo dataset: https://drivingstereo-dataset.github.io/
- Original paper: https://arxiv.org/abs/2007.12140
- Depthai Python library: https://github.com/luxonis/depthai-python