
Programs written for my Großer Beleg


Programs written for my Beleg "Learning Depth Estimation from RGB and Infrared Input"

This repository contains all programs I wrote for my "Großer Beleg" at the Technische Universität Dresden. The languages used are Python (3.6) and C++. This documentation lists all important programs together with their usage and requirements/dependencies. In general, the programs are commented well enough to be understood from the source. The command line arguments of any Python program can be displayed with python Program.py --help, where Program.py stands for the respective script.


Description: This Python script contains the network architecture of the trained model. Input images are loaded with a custom data generator. The script trains the model and creates a plot of the learning rate schedule. It also saves the training history as a pickle file and the final model as an .h5 file.


Requirements: Numpy, Keras, Matplotlib, argparse and OpenCV

Command Line Arguments:

Argument Description
-t, --train Path to folder that contains the training and validation examples
-x, --output Path to folder where all output is saved to
-b, --batch_size Batch size to train the network with
-e, --epochs Number of epochs to train the network for
-o, --optimizer The optimizer to utilize for training. Supported are SGD, Adam and RMSprop
-l, --loss Loss function to utilize. Either MMAE, MMAE_simple, MRMSE or MRMSE_simple. Defaults to MMAE_simple
-p, --periods Number of epochs after which to save the current model (and its weights). 1 means every epoch. Currently disabled due to a bug that freezes the training process on Taurus.
-d, --decay Reduce learning rate after every x epochs. Defaults to 10
-f, --factor_decay Factor to reduce the learning rate. Defaults to 0.5
--default_optimizers Enable all keras optimizers, not only SGD, Adam and RMSprop. This will deactivate learning rate decay
--omit_batchnorm Don't add batch normalization layers after convolutions
-m, --momentum Momentum used in batch normalization layers. Defaults to 0.99. If validation loss oscillates, try lowering it (e.g. to 0.6)
--skip_0 Functionality of S0 skip connections. One of the following: 'add', 'concat', 'concat+' or 'disable'. Defaults to 'add'. 'concat+' adds convolutions after concatenating
--sgd_momentum Only works when using the SGD optimizer. Not specified/'None': no momentum. 'normal': momentum with the value from --sgd_momentum_value. 'nesterov': Nesterov momentum with the value from --sgd_momentum_value
--sgd_momentum_value Only works when using the SGD optimizer: Momentum value for the SGD optimizer. Enable by using --sgd_momentum. Defaults to 0.9
--output_sigmoid_activation Adds a sigmoid activation function to the output layer. The provided argument defines whether the ground truth is scaled to the output interval of the sigmoid, [0,1] ('scale_input'), or whether the predictions are scaled in the loss function ('scale_output'). Defaults to '' (no sigmoid activation function added)
--no_shuffle Disables shuffling of batches for each epoch
--no_scale Disables scaling of input images to the range of [0,1]
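
The interplay of --decay and --factor_decay can be illustrated with a step schedule. The following is only a minimal sketch, assuming that the learning rate is multiplied by the factor once every `decay` epochs. The function name `step_decay` and the initial learning rate of 0.01 are hypothetical and not taken from the script.

```python
def step_decay(epoch, initial_lr=0.01, decay=10, factor=0.5):
    """Return the learning rate for a zero-based epoch index.

    Assumption: the rate is multiplied by `factor` once every `decay` epochs.
    The defaults mirror the documented defaults of --decay and --factor_decay;
    `initial_lr` is a made-up example value.
    """
    return initial_lr * factor ** (epoch // decay)


if __name__ == "__main__":
    # Print the rate at a few epochs to show the step shape of the schedule.
    for epoch in (0, 9, 10, 25):
        print(epoch, step_decay(epoch))
```

A function of this shape could be passed to the Keras LearningRateScheduler callback. Whether the script wires its schedule up this way is an assumption.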

Parameters for the proposed final model: The proposed final model can be trained with the following command, where <path_to_training_images> is the path to the training data and <output_directory> is the folder that the output of the network is saved to:

    python Model_VGG_Style.py -t <path_to_training_images> -x <output_directory> -b 8 -e 100 -o sgd -l mmae_simple -d 10 -f 0.5 --sgd_momentum nesterov --sgd_momentum_value 0.8

Learning/batch_script_training contains an example batch script that was used to start the training process on Taurus.


Description: Predicts a depth image for the provided images (see command line arguments). Outputs:

  • Histograms of inputs and predictions (at the moment quite ugly)
  • Colorization attempts for the predicted depth image (at the moment quite ugly)
  • Difference Image, indicating if the prediction deviates more than a specified threshold from the ground truth (see command line arguments)
  • Unprocessed* predicted depth image (* only clipped to [0,65535])


Requirements: Numpy, Keras, Matplotlib, argparse and OpenCV

Command Line Arguments:

Argument Description
-f, --folder If multiple depth images should be predicted, the color and infrared images, along with optional ground truth depth images, should be placed in a folder with subfolders 'Color', 'Infrared' and optionally 'Depth'. This is the path to this folder
-c, --color If only a single depth image should be predicted, this is the corresponding color image
-i, --infrared If only a single depth image should be predicted, this is the corresponding infrared image.
-g, --ground_truth If only a single depth image should be predicted, this is the corresponding ground truth depth image. Can be ignored if no ground truth is available
-b, --batch_size When multiple depth images should be predicted, this is the batch size that should be utilized while predicting
-m, --model Path to the model that should be utilized for predicting depth images
-o, --output Path to where the predicted images should be saved to
--no_scaling Don't scale the input images to the range [0,1]
--default_loss Use the default mean absolute error loss function. Should not be utilized
-t, --threshold_offset Offset for depth image normalization. Defaults to 2000. Only utilized if ground truth is given
--old_model For old models, the loss function was called binary mean absolut error. Activate this if an 'Unknown loss function' error is thrown. Should not be utilized
-d, --difference_threshold Utilized for difference visualization of ground truth and prediction. Maximum difference between ground truth and prediction in meters which is considered ok. Defaults to 0.05
--depth_scale_text Text file containing depth scale of the utilized depth camera. Alternatively, use --depth_scale_value to directly provide a float
--depth_scale_value Depth scale of the utilized depth camera. Alternatively, provide text file containing this scale with --depth_scale_text
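
The difference image described above combines --difference_threshold with the depth scale of the camera. The following is a minimal sketch of that idea, not the code of the script itself. It assumes that the depth scale converts raw 16-bit depth units to meters; the function name `difference_mask` and the example scale of 0.001 are hypothetical.

```python
import numpy as np


def difference_mask(prediction, ground_truth, depth_scale, threshold_m=0.05):
    """Return a boolean mask of pixels that deviate by more than threshold_m meters.

    Assumption: `depth_scale` is the number of meters per raw depth unit.
    """
    # Convert to float first so that subtracting uint16 images cannot wrap around.
    pred_m = prediction.astype(np.float64) * depth_scale
    gt_m = ground_truth.astype(np.float64) * depth_scale
    return np.abs(pred_m - gt_m) > threshold_m


if __name__ == "__main__":
    pred = np.array([[1000, 1100]], dtype=np.uint16)
    gt = np.array([[1020, 1000]], dtype=np.uint16)
    # 0.001 m per unit is an example value, not read from any camera.
    print(difference_mask(pred, gt, depth_scale=0.001))
```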


Description: Used to predict depth images from a streaming RealSense depth camera or a prerecorded rosbag file. Additionally, the predicted depth image is colorized (currently with poor results). Note that the currently trained models are not fast enough to predict images in real time with a reasonable frame rate. The current implementation was not tested with a live streaming camera.


Requirements: Numpy, Keras, pyrealsense2, argparse and OpenCV

Command Line Arguments:

Argument Description
-p, --playback_path Path to a recorded sequence that should be predicted. Defaults to None (i.e. streaming configuration)
--no_realtime Disables real time mode. Currently does nothing
-m, --model Path to model that should be utilized for predictions
--scale_output Scale the output of the network
--no_clip Don't clip the output predictions to [0,65535]. Should not be utilized, since values outside of [0,65535] will under/overflow
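
The reason for clipping before saving can be shown with a few lines of NumPy. This is a minimal sketch under the assumption that predictions are floating point arrays that are stored as 16-bit unsigned integers; the function name `to_uint16` is hypothetical.

```python
import numpy as np


def to_uint16(prediction, clip=True):
    """Convert a float prediction to a 16-bit depth image.

    With clip=True, values are limited to [0, 65535] first. Without clipping,
    out-of-range values do not fit into uint16 and the result of the cast is
    not meaningful.
    """
    if clip:
        prediction = np.clip(prediction, 0, 65535)
    # The cast truncates the fractional part.
    return prediction.astype(np.uint16)


if __name__ == "__main__":
    print(to_uint16(np.array([-5.0, 100.4, 70000.0])))
```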


Description: Jupyter notebook file that explains how the proposed model can be used to remove artifacts from ground truth depth images.


Description: This script demonstrates how to read frames from a recorded rosbag file in Python. To actually execute the script, uncomment the respective lines in the script.


Requirements: Numpy, pyrealsense2, argparse and OpenCV

Command Line Arguments:

Argument Description
-i, --input Path to the rosbag file that should be read in
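
Reading frames from a recorded rosbag file with pyrealsense2 roughly follows the pattern below. This is a hedged sketch and not a copy of the script: the function name `read_bag_frames` and the `max_frames` parameter are hypothetical, and it assumes that the recording contains both a depth and a color stream.

```python
import numpy as np


def read_bag_frames(bag_path, max_frames=10):
    """Yield (depth, color) NumPy arrays from a recorded rosbag file."""
    import pyrealsense2 as rs  # imported lazily so the module loads without it

    pipeline = rs.pipeline()
    config = rs.config()
    # Play the recording once instead of looping it.
    config.enable_device_from_file(str(bag_path), repeat_playback=False)
    profile = pipeline.start(config)
    # Disable real-time playback so that slow processing does not drop frames.
    profile.get_device().as_playback().set_real_time(False)
    try:
        for _ in range(max_frames):
            try:
                frames = pipeline.wait_for_frames()
            except RuntimeError:
                break  # no more frames arrived, e.g. the end of the recording
            depth = frames.get_depth_frame()
            color = frames.get_color_frame()
            if not depth or not color:
                continue
            yield np.asanyarray(depth.get_data()), np.asanyarray(color.get_data())
    finally:
        pipeline.stop()
```

A caller would iterate over `read_bag_frames("<path to bag file>")` and process each pair of arrays, for example by displaying them with OpenCV.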


Description: IPython Notebook that demonstrates how a 16-bit depth image can be colorized using OpenCV.


Requirements: Numpy, Matplotlib and OpenCV
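
One common way to colorize a 16-bit depth image is to rescale it to 8 bits and then apply an OpenCV color map. The sketch below illustrates this approach. It is an assumption that the notebook works similarly; the function names and the choice of the JET color map are hypothetical.

```python
import numpy as np


def depth_to_8bit(depth16):
    """Linearly rescale a 16-bit depth image to the range [0, 255]."""
    depth = depth16.astype(np.float64)
    d_min, d_max = depth.min(), depth.max()
    if d_max == d_min:
        # A constant image carries no contrast; return all zeros.
        return np.zeros(depth16.shape, dtype=np.uint8)
    scaled = (depth - d_min) / (d_max - d_min) * 255.0
    return np.round(scaled).astype(np.uint8)


def colorize(depth16):
    """Map a 16-bit depth image to a BGR color image with OpenCV."""
    import cv2  # imported lazily; only needed for the color map

    return cv2.applyColorMap(depth_to_8bit(depth16), cv2.COLORMAP_JET)
```

Rescaling by the minimum and maximum of each image maximizes contrast, but it also means that colors are not comparable between images. A fixed depth range would avoid that.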


Description: This script is used to read frames from recorded rosbag files, apply preprocessing to them and save them in a folder structure that can be used for training the network. Execution of this script on a large file can take quite some time. There is also a C++ script for this task, which is marginally faster.


Requirements: Numpy, pyrealsense2, argparse, OpenCV, Tqdm and pathlib

Command Line Arguments:

Argument Description
-i, --input Path to main folder that contains subfolders 'Outdoor_Lighting' and 'Indoor_Lighting' which contain the recorded bag files
-o, --output Path where the included images should be saved to
-d, --decimation Decimation filter magnitude. Values of 2 and 3 perform median downsampling, values greater than 3 perform mean downsampling. Use 1 to disable decimation
-s, --skip Whether scenes that already exist in the output location should be skipped. Default is False. If activated, _Part_2.bag files will not be considered. This can also be used to process only those .bag files that have not been processed yet
-v, --verbose Verbosity settings. 0 - no extra messages. 1 - basic debug messages. 2 - additional runtime messages
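
The difference between median and mean downsampling can be illustrated with plain NumPy. The sketch below only approximates the behavior described for --decimation; the script itself presumably relies on the decimation filter of librealsense, which may treat invalid (zero) pixels differently. The function name `decimate` is hypothetical.

```python
import numpy as np


def decimate(depth, magnitude):
    """Downsample a 2D image by `magnitude` using block-wise median or mean.

    Magnitudes 2 and 3 use the median of each block, larger magnitudes use
    the mean, and a magnitude of 1 returns an unchanged copy.
    """
    if magnitude <= 1:
        return depth.copy()
    # Crop so that height and width are multiples of the magnitude.
    h = depth.shape[0] // magnitude * magnitude
    w = depth.shape[1] // magnitude * magnitude
    blocks = depth[:h, :w].reshape(h // magnitude, magnitude, w // magnitude, magnitude)
    # Gather the pixels of each block along the last axis.
    blocks = blocks.transpose(0, 2, 1, 3).reshape(h // magnitude, w // magnitude, -1)
    if magnitude in (2, 3):
        reduced = np.median(blocks, axis=-1)
    else:
        reduced = np.mean(blocks, axis=-1)
    # The cast back to the input type truncates fractional values.
    return reduced.astype(depth.dtype)
```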


Description: Slightly modified version of Visualization/ImagePreprocessing.py. This script takes into account that only every x-th frame should be sampled from the rosbag files. It is also possible to define two files that should not be sampled from (for validation and test set).


Requirements: Numpy, pyrealsense2, argparse, OpenCV, Tqdm and pathlib

Command Line Arguments:

Argument Description
-i, --input Path to main folder that contains subfolders 'Outdoor_Lighting' and 'Indoor_Lighting' which contain the recorded bag files
-o, --output Path where the included images should be saved to
-s, --subsample The subsampling rate
-v, --verbose Verbosity settings. 0 - no extra messages. 1 - basic debug messages. 2 - additional runtime messages
-1, --skip_test The first scene that should not be sampled from. This scene will be used as test data
-2, --skip_validation The second scene that should not be sampled from. This scene will be used as validation data
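
The sampling logic behind these arguments can be sketched independently of the rosbag handling. The example below assumes that each scene is identified by a name and has a known number of frames. The function name `select_frames` and the dictionary layout are hypothetical.

```python
def select_frames(frames_per_scene, subsample, skip_test=None, skip_validation=None):
    """Pick every `subsample`-th frame index from all scenes except the held-out ones.

    `frames_per_scene` maps a scene name to its number of frames. Scenes named
    by `skip_test` or `skip_validation` are left out entirely.
    """
    held_out = {skip_test, skip_validation}
    selected = {}
    for scene, frame_count in frames_per_scene.items():
        if scene in held_out:
            continue
        selected[scene] = list(range(0, frame_count, subsample))
    return selected


if __name__ == "__main__":
    scenes = {"A": 10, "B": 7, "C": 5}
    print(select_frames(scenes, 3, skip_test="B", skip_validation="C"))
```

Holding out whole scenes rather than individual frames keeps near-duplicate neighboring frames from leaking between the training, validation and test data.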


Description: Utilized to subsample images that are already extracted from rosbag files and lie inside a folder structure with subfolders Color, Infrared and Depth. Despite the name, this script is not exclusive to validation subsampling.


Requirements: argparse and pathlib

Command Line Arguments:

Argument Description
-f, --folder_input The original folder whose images should be subsampled
-o, --output The location where the subsampled files should be placed
-s, --subsample The subsampling factor. Defaults to 10
-c, --color Name of the folder that contains the color images. Defaults to 'Color'
-i, --infrared Name of the folder that contains the infrared images. Defaults to 'Infrared'
-d, --depth Name of the folder that contains the depth images. Defaults to 'Depth'
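
Subsampling an extracted folder structure can be sketched with pathlib as follows. This is an illustration under assumptions, not the script itself: it assumes that the files in the three subfolders sort into the same order, so that taking every n-th file keeps matching color, infrared and depth images together. The function name `subsample_folder` is hypothetical.

```python
import shutil
from pathlib import Path


def subsample_folder(folder_input, output, subsample=10,
                     subfolders=("Color", "Infrared", "Depth")):
    """Copy every `subsample`-th file of each subfolder to the output folder.

    Returns the total number of copied files.
    """
    copied = 0
    for name in subfolders:
        src = Path(folder_input) / name
        dst = Path(output) / name
        dst.mkdir(parents=True, exist_ok=True)
        # Sort by name so that the same indices are picked in every subfolder.
        files = sorted(p for p in src.iterdir() if p.is_file())
        for path in files[::subsample]:
            shutil.copy2(path, dst / path.name)
            copied += 1
    return copied
```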


Description: C++ program that demonstrates how to extract frames from a rosbag file and colorize the depth frame. Optionally, the depth frame can be aligned to the color image.

Requirements: librealsense2 and OpenCV

Command Line Arguments: [path to bag file] [Align to depth]


Description: C++ program that polls frames from a recorded rosbag file. It applies preprocessing to the frames and counts the artifacts that remain in the filtered images.

Requirements: librealsense2 and OpenCV

Command Line Arguments: [bag file] [filter option]


Description: C++ program that polls frames from a recorded rosbag file. Applies preprocessing to the frames.

Requirements: librealsense2 and OpenCV

Command Line Arguments: [bag file] [decimation filter magnitude] [hole filling filter magnitude] [show non visualized depth images]


Description: C++ program that demonstrates polling of frames from a streaming RealSense device. Colorizes the depth map and optionally saves the sampled frames.

Requirements: librealsense2 and OpenCV

Command Line Arguments: [save directory] [should images be saved]