This repository contains all programs I wrote for my "Großer Beleg" at the Technische Universität Dresden. The languages used are Python (3.6) and C++. This documentation lists all important programs together with their usage and requirements/dependencies. In general, the programs should be commented in a way that allows understanding them. Command line arguments can always be viewed with `python Program.py --help`.
Description: This Python script defines the network architecture and trains the model. Input images are loaded with a custom data generator. The script creates a plot of the learning rate schedule and saves the training history as a pickle file and the final model as an .h5 file.
Requirements:
Package | Link |
---|---|
Numpy | link |
Keras | link |
Matplotlib | link |
argparse | link |
OpenCV | link |
Command Line Arguments:
Argument | Description |
---|---|
-t, --train | Path to folder that contains the training and validation examples |
-x, --output | Path to folder where all output is saved to |
-b, --batch_size | Batch size to train the network with |
-e, --epochs | Number of epochs to train the network on |
-o, --optimizer | The optimizer to utilize for training. Supported are SGD, Adam and RMSprop |
-l, --loss | Loss function to utilize. Either MMAE, MMAE_simple, MRMSE or MRMSE_simple. Defaults to MMAE_simple |
-p, --periods | Number of epochs after which to save the current model (and its weights). 1 means every epoch. Currently disabled due to a bug that freezes the training process on Taurus. |
-d, --decay | Reduce learning rate after every x epochs. Defaults to 10 |
-f, --factor_decay | Factor to reduce the learning rate. Defaults to 0.5 |
--default_optimizers | Enable all keras optimizers, not only SGD, Adam and RMSprop. This will deactivate learning rate decay |
--omit_batchnorm | Don't add batch normalization layers after convolutions |
-m, --momentum | Momentum used in batch normalization layers. Defaults to 0.99. If validation loss oscillates, try lowering it (e.g. to 0.6) |
--skip_0 | Functionality of S0 skip connections. One of the following: 'add', 'concat', 'concat+' or 'disable'. Defaults to 'add'. 'Concat+' adds convolutions after concatenating |
--sgd_momentum | Only works when using SGD optimizer: Not specified/'None': no momentum, 'normal': momentum with value from --sgd_momentum_value, 'nesterov': Use nesterov momentum with value from --sgd_momentum_value |
--sgd_momentum_value | Only works when using SGD optimizer: Momentum value for SGD optimizer. Enable by using --sgd_momentum. Defaults to 0.9 |
--output_sigmoid_activation | Adds a sigmoid activation function to the output layer. The provided argument defines whether the ground truth is scaled to fit the output interval of the sigmoid ('scale_input') or whether the predictions are scaled in the loss function ('scale_output'). Defaults to '' (no sigmoid activation function added) |
--no_shuffle | Disables shuffling of batches for each epoch |
--no_scale | Disables scaling of input images to the range of [0,1] |
Parameters for the proposed final model: The proposed final model can be trained with the following command line arguments, where <path_to_training_images> is the path to the training data and <output_directory> is the path where the output of the network should be saved: `python Model_VGG_Style.py -t <path_to_training_images> -x <output_directory> -b 8 -e 100 -o sgd -l mmae_simple -d 10 -f 0.5 --sgd_momentum nesterov --sgd_momentum_value 0.8`. Learning/batch_script_training shows an example batch script that was used to start the training process on Taurus.
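The interplay of --decay and --factor_decay can be illustrated with a small step decay function. This is only a sketch of the general technique, not the code from the script: the function name `step_decay`, the initial learning rate of 0.01 and the 0-based epoch counting are assumptions made for illustration.

```python
def step_decay(epoch, initial_lr=0.01, decay_every=10, factor=0.5):
    """Return the learning rate for a 0-based epoch index under step decay.

    initial_lr is a hypothetical starting value; decay_every and factor
    mirror the defaults documented for --decay and --factor_decay.
    """
    # Integer division counts how many decay steps have already happened.
    return initial_lr * factor ** (epoch // decay_every)


# Learning rates for the first 30 epochs: constant within each block of 10.
schedule = [step_decay(epoch) for epoch in range(30)]
```

A function of this shape could be handed to Keras' `LearningRateScheduler` callback, which calls it with the current epoch index at the start of every epoch.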
Description: Predicts a depth image for the provided images (see command line arguments). Outputs:
- Histograms of inputs and predictions (at the moment quite ugly)
- Colorization attempts for the predicted depth image (at the moment quite ugly)
- Difference Image, indicating if the prediction deviates more than a specified threshold from the ground truth (see command line arguments)
- Unprocessed* predicted depth image (* only clipped to [0,65535])
Requirements:
Package | Link |
---|---|
Numpy | link |
Keras | link |
Matplotlib | link |
argparse | link |
OpenCV | link |
Command Line Arguments:
Argument | Description |
---|---|
-f, --folder | If multiple depth images should be predicted, the color and infrared images, along with optional ground truth depth images, should be placed in a folder with subfolders 'Color', 'Infrared' and optionally 'Depth'. This is the path to this folder |
-c, --color | If only a single depth image should be predicted, this is the corresponding color image |
-i, --infrared | If only a single depth image should be predicted, this is the corresponding infrared image. |
-g, --ground_truth | If only a single depth image should be predicted, this is the corresponding ground truth depth image. Can be ignored if no ground truth is available |
-b, --batch_size | When multiple depth images should be predicted, this is the batch size that should be utilized while predicting |
-m, --model | Path to the model that should be utilized for predicting depth images |
-o, --output | Path to where the predicted images should be saved to |
--no_scaling | Don't scale the input images to the range [0,1] |
--default_loss | Use the default mean absolute error loss function. Should not be utilized |
-t, --threshold_offset | Offset for depth image normalization. Defaults to 2000. Only utilized if ground truth is given |
--old_model | For old models, the loss function was called binary mean absolut error. Activate this if an 'Unknown loss function' error is thrown. Should not be utilized |
-d, --difference_threshold | Utilized for difference visualization of ground truth and prediction. Maximum difference between ground truth and prediction in meters which is considered ok. Defaults to 0.05 |
--depth_scale_text | Text file containing depth scale of the utilized depth camera. Alternatively, use --depth_scale_value to directly provide a float |
--depth_scale_value | Depth scale of the utilized depth camera. Alternatively, provide text file containing this scale with --depth_scale_text |
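How the depth scale and --difference_threshold could work together is sketched below. The sketch assumes that the depth scale converts raw 16-bit depth units to meters and that the difference image marks pixels whose absolute error exceeds the threshold. The function name `difference_mask` and the depth scale of 0.001 are hypothetical and not taken from the script.

```python
import numpy as np


def difference_mask(ground_truth, prediction, depth_scale, threshold_m=0.05):
    """Return a boolean mask that is True where the prediction deviates
    from the ground truth by more than threshold_m meters."""
    # Convert raw depth units to meters before comparing (assumed meaning
    # of the depth scale); float64 avoids unsigned integer wrap-around.
    gt_m = ground_truth.astype(np.float64) * depth_scale
    pred_m = prediction.astype(np.float64) * depth_scale
    return np.abs(gt_m - pred_m) > threshold_m


# Tiny 2x2 example with a hypothetical depth scale of 1 mm per unit.
gt = np.array([[1000, 2000], [3000, 4000]], dtype=np.uint16)
pred = np.array([[1010, 2100], [3000, 3900]], dtype=np.uint16)
mask = difference_mask(gt, pred, depth_scale=0.001)
```

In this example, only the two pixels that are off by 100 units (0.1 m) are flagged, while the pixel that is off by 10 units (0.01 m) stays below the default threshold.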
Description: Used to predict depth images from a streaming RealSense depth camera or a prerecorded rosbag file. Additionally, a colorization of the predicted depth image is done (at the moment poorly). Note that the currently trained models are not fast enough to predict images in real time with a reasonable frame rate. The current implementation was not tested with a live streaming camera.
Requirements:
Package | Link |
---|---|
Numpy | link |
Keras | link |
pyrealsense2 | link |
argparse | link |
OpenCV | link |
Command Line Arguments:
Argument | Description |
---|---|
-p, --playback_path | Path to a recorded sequence that should be predicted. Defaults to None (i.e. streaming configuration) |
--no_realtime | Disables real time mode. Currently does nothing |
-m, --model | Path to model that should be utilized for predictions |
--scale_output | Scale the output of the network |
--no_clip | Don't clip the output predictions to [0,65535]. Should not be utilized, since values outside of [0,65535] will under/overflow |
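The reason for clipping can be sketched as follows: a network outputs floating point values, while a 16-bit depth image can only hold integers in [0,65535]. The helper below is a minimal sketch under that assumption. The name `to_uint16_depth` and the rounding step are illustrative additions and not taken from the script.

```python
import numpy as np


def to_uint16_depth(prediction):
    """Clip a float prediction to the uint16 range and convert it."""
    # Rounding to the nearest integer is an assumption made for this
    # sketch; the clipping keeps every value representable as uint16.
    clipped = np.clip(np.rint(prediction), 0, 65535)
    return clipped.astype(np.uint16)


# Values below 0 and above 65535 are mapped to the range limits.
raw = np.array([-12.4, 0.0, 1234.6, 70000.0])
depth = to_uint16_depth(raw)
```

Without the clipping step, the conversion of out-of-range values to uint16 is not well defined, which is why --no_clip should be avoided.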
Description: Jupyter notebook file that explains how the proposed model can be used to remove artifacts from ground truth depth images.
Description: This script demonstrates how to read frames from a recorded rosbag file in Python. To actually execute the script, uncomment the respective lines in the script.
Requirements:
Package | Link |
---|---|
Numpy | link |
pyrealsense2 | link |
argparse | link |
OpenCV | link |
Command Line Arguments:
Argument | Description |
---|---|
-i, --input | Path to the rosbag file that should be read in |
Description: IPython Notebook that demonstrates how a 16-bit depth image can be colorized using OpenCV.
Requirements:
Package | Link |
---|---|
Numpy | link |
Matplotlib | link |
OpenCV | link |
Description: This script is used to read frames from recorded rosbag files, apply preprocessing to them and save them in a folder structure that can be used for training the network. Execution of this script on a large file can take quite some time. There is also a C++ program for this task, which is marginally faster.
Requirements:
Package | Link |
---|---|
Numpy | link |
pyrealsense2 | link |
argparse | link |
OpenCV | link |
Tqdm | link |
pathlib | link |
Command Line Arguments:
Argument | Description |
---|---|
-i, --input | Path to main folder that contains subfolders 'Outdoor_Lighting' and 'Indoor_Lighting' which contain the recorded bag files |
-o, --output | Path where the included images should be saved to |
-d, --decimation | Decimation filter magnitude. Values of 2 and 3 perform median downsampling, values greater than 3 perform mean downsampling. Should be 1 if no decimation operation should be done |
-s, --skip | Whether scenes that already exist in the output location should be skipped. Default is False. If activated, _Part_2.bag files will not be considered. This can also be used to process only those .bag files that have not been processed yet |
-v, --verbose | Verbosity settings. 0 - no extra messages. 1 - basic debug messages. 2 - additional runtime messages |
Description: Slightly modified version of Visualization/ImagePreprocessing.py. This script takes into account that only every x-th frame should be sampled from the rosbag files. It is also possible to define two files that should not be sampled from (for validation and test set).
Requirements:
Package | Link |
---|---|
Numpy | link |
pyrealsense2 | link |
argparse | link |
OpenCV | link |
Tqdm | link |
pathlib | link |
Command Line Arguments:
Argument | Description |
---|---|
-i, --input | Path to main folder that contains subfolders 'Outdoor_Lighting' and 'Indoor_Lighting' which contain the recorded bag files |
-o, --output | Path where the included images should be saved to |
-s, --subsample | The subsampling rate |
-v, --verbose | Verbosity settings. 0 - no extra messages. 1 - basic debug messages. 2 - additional runtime messages |
-1, --skip_test | The first scene that should not be sampled from. This scene will be used as test data |
-2, --skip_validation | The second scene that should not be sampled from. This scene will be used as validation data |
Description: Utilized to subsample images that are already extracted from rosbag files and lie inside a folder structure with subfolders Color, Infrared and Depth. Despite the name, this script is not exclusive to validation subsampling.
Requirements:
Package | Link |
---|---|
argparse | link |
pathlib | link |
Command Line Arguments:
Argument | Description |
---|---|
-f, --folder_input | The original folder whose images should be subsampled |
-o, --output | The location where the subsampled files should be placed |
-s, --subsample | The subsampling factor. Defaults to 10 |
-c, --color | Name of the folder that contains the color images. Defaults to 'Color' |
-i, --infrared | Name of the folder that contains the infrared images. Defaults to 'Infrared' |
-d, --depth | Name of the folder that contains the depth images. Defaults to 'Depth' |
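The subsampling described above can be sketched with pathlib alone. The function below copies every n-th file from each subfolder. It assumes that corresponding color, infrared and depth images share the same file name, so that sorting keeps the triples aligned. The name `subsample_folder` and that naming assumption are illustrative and not taken from the script.

```python
import shutil
from pathlib import Path


def subsample_folder(src, dst, factor=10,
                     subfolders=("Color", "Infrared", "Depth")):
    """Copy every factor-th file (sorted by name) from each subfolder
    of src into the subfolder of the same name below dst."""
    src, dst = Path(src), Path(dst)
    copied = []
    for name in subfolders:
        # Sorting by name keeps the selection consistent across subfolders,
        # assuming matching file names in all of them.
        files = sorted(p for p in (src / name).iterdir() if p.is_file())
        target = dst / name
        target.mkdir(parents=True, exist_ok=True)
        for path in files[::factor]:
            shutil.copy2(path, target / path.name)
            copied.append(target / path.name)
    return copied
```

With 25 images per subfolder and a factor of 10, this keeps the images at positions 0, 10 and 20 of each sorted subfolder.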
Description: C++ program that demonstrates how to extract frames from a rosbag file and colorize the depth frame. Optionally, the depth frame can be aligned to the color image.
Requirements: librealsense2 and OpenCV
Command Line Arguments: [path to bag file] [Align to depth]
Description: C++ program that polls frames from a recorded rosbag file. It applies preprocessing to the frames and counts the artifacts that remain in the filtered images.
Requirements: librealsense2 and OpenCV
Command Line Arguments: [bag file] [filter option]
Description: C++ program that polls frames from a recorded rosbag file and applies preprocessing to them.
Requirements: librealsense2 and OpenCV
Command Line Arguments: [bag file] [decimation filter magnitude] [hole filling filter magnitude] [show non visualized depth images]
Description: C++ program that demonstrates polling of frames from a streaming RealSense device. It colorizes the depth map and optionally saves the sampled frames.
Requirements: librealsense2 and OpenCV
Command Line Arguments: [save directory] [should images be saved]