/deep-orientation

Fast and robust upper body orientation estimation for mobile robotic applications

Primary LanguagePythonOtherNOASSERTION

Deep Orientation

Fast and Robust Upper Body Orientation Estimation for Mobile Robotic Applications using Tensorflow (Python 3.6).

Installation

  1. Clone repository:

    git clone https://github.com/tui-nicr/deep-orientation.git
  2. Set up environment and install dependencies:

    # create Python 3.6 environment
    conda create --name env_deep_orientation python=3.6
    conda activate env_deep_orientation
    
    # install dependencies
    # GPU version:
    pip install -r /path/to/this/repository/requirements_gpu.txt [--user]
    # CPU version:
    pip install -r /path/to/this/repository/requirements_cpu.txt [--user]
    
    # opt. dependencies to plot models, see src/plot_models.py
    conda install graphviz pydot
  3. For network training: Download NICR RGB-D Orientation Data Set and install dataset package

Apply already trained network for orientation estimation

  • Change directory to src

    cd /path/to/this/repository/src
  • Apply best performing modified beyer architecture network (depth input, biternion output, mean absolute error of 5.28° on test set)

    # single prediction (without dropout sampling)
    python inference.py \
        beyer_mod_relu \
        ../trained_networks/beyer_mod_relu__depth__126x48__biternion__0_030000__1/weights_valid_0268.hdf5 \
        ../nicr_rgb_d_orientation_data_set_examples/small_patches \
        --input_type depth \
        --input_preprocessing standardize \
        --input_height 126 \
        --input_width 48 \
        --output_type biternion \
        --n_samples 1
        [--cpu]
    

    beyer_mod_relu_depth_without_sampling

    # multiple predictions to estimate uncertainty (with dropout sampling)
    python inference.py \
        beyer_mod_relu \
        ../trained_networks/beyer_mod_relu__depth__126x48__biternion__0_030000__1/weights_valid_0268.hdf5 \
        ../nicr_rgb_d_orientation_data_set_examples/small_patches \
        --input_type depth \
        --input_preprocessing standardize \
        --input_height 126 \
        --input_width 48 \
        --output_type biternion \
        --n_samples 25
        [--cpu]      

    beyer_mod_relu_depth_with_sampling

  • Apply best performing MobileNet v2 architecture network (depth input, biternion output, mean absolute error of 5.17° on test set)

    # single prediction (without dropout sampling)
    python inference.py \
        mobilenet_v2 \
        ../trained_networks/mobilenet_v2_1_00__depth__96x96__biternion__0_001000__2/weights_valid_0268.hdf5 \
        ../nicr_rgb_d_orientation_data_set_examples/small_patches \
        --input_type depth \
        --input_preprocessing scale01 \
        --input_height 96 \
        --input_width 96 \
        --output_type biternion \
        --mobilenet_v2_alpha 1.00 \
        --n_samples 1
        [--cpu]      

    mobilenet_v2_depth_without_sampling

  • Apply best performing RGB MobileNet v2 architecture network (rgb input, biternion output, mean absolute error of 7.98° on test set)

    # single prediction (without dropout sampling)
    python inference.py \
        mobilenet_v2 \
        ../trained_networks/mobilenet_v2_1_00__rgb__96x96__biternion__0_001000__0/weights_valid_0134.hdf5 \
        ../nicr_rgb_d_orientation_data_set_examples/small_patches \
        --input_type rgb \
        --input_preprocessing scale01 \
        --input_height 96 \
        --input_width 96 \
        --output_type biternion \
        --mobilenet_v2_alpha 1.00 \
        --n_samples 1
        [--cpu]     

    mobilenet_v2_rgb_without_sampling

Train neural network for orientation estimation

  1. Change directory to src

    cd /path/to/this/repository/src
  2. Extract patches

    python extract_patches.py --dataset_basepath /path/to/nicr_rgb_d_orientation_data_set
  3. Train (GPU only) multiple neural networks with same hyperparameters

    # best performing configuration and hyperparameters from paper
    python train.py \
        beyer_mod_relu \
        --dataset_basepath /path/to/nicr_rgb_d_orientation_data_set \
        --output_basepath /path/where/to/store/training/output/files \
        --input_type depth \
        --input_preprocessing standardize \
        --input_height 126 \
        --input_width 48 \
        --output_type biternion \
        --learning_rate 0.03 \
        --run_id 0
    python train.py \
        beyer_mod_relu \
        --dataset_basepath /path/to/nicr_rgb_d_orientation_data_set \
        --output_basepath /path/where/to/store/training/output/files \
        --input_type depth \
        --input_preprocessing standardize \
        --input_height 126 \
        --input_width 48 \
        --output_type biternion \
        --learning_rate 0.03 \
        --run_id 1
    python train.py \
        beyer_mod_relu \
        --dataset_basepath /path/to/nicr_rgb_d_orientation_data_set \
        --output_basepath /path/where/to/store/training/output/files \
        --input_type depth \
        --input_preprocessing standardize \
        --input_height 126 \
        --input_width 48 \
        --output_type biternion \
        --learning_rate 0.03 \
        --run_id 2

    For further details and parameters, see:

    python train.py --help
    usage: train.py [-h] [-o OUTPUT_BASEPATH] [-db DATASET_BASEPATH]
                    [-ds {small,large}] [-ts TRAINING_SET]
                    [-vs VALIDATION_SETS [VALIDATION_SETS ...]]
                    [-it {depth,rgb,depth_and_rgb}] [-iw INPUT_WIDTH]
                    [-ih INPUT_HEIGHT] [-ip {standardize,scale01,none}]
                    [-ot {regression,classification,biternion}]
                    [-lr LEARNING_RATE] [-lrd {poly}] [-m MOMENTUM] [-ne N_EPOCHS]
                    [-es EARLY_STOPPING] [-b BATCH_SIZE]
                    [-vb VALIDATION_BATCH_SIZE] [-nc N_CLASSES] [-k KAPPA]
                    [-opt {sgd,adam,rmsprop}] [-naug] [-rid RUN_ID]
                    [--mobilenet_v2_alpha {0.35,0.5,0.75,1.0}] [-d DEVICES] [-v]
                    {beyer,beyer_mod,beyer_mod_relu,mobilenet_v2,beyer_mod_relu_sep}
    
    Train neural network for orientation estimation
    
    positional arguments:
    {beyer,beyer_mod_relu,mobilenet_v2}
                          Model to use: beyer, beyer_mod_relu or mobilenet_v2
    optional arguments:
    -h, --help            show this help message and exit
    -o OUTPUT_BASEPATH, --output_basepath OUTPUT_BASEPATH
                            Path where to store output files, default: '/results/rotator'
    -db DATASET_BASEPATH, --dataset_basepath DATASET_BASEPATH
                            Path to downloaded dataset (default: '/datasets/rotator')
    -ds {small,large}, --dataset_size {small,large}
                            Dataset image size to use. One of :('small', 'large'), default: small
    -ts TRAINING_SET, --training_set TRAINING_SET
                            Set to use for training, default: training
    -vs VALIDATION_SETS [VALIDATION_SETS ...], --validation_sets VALIDATION_SETS [VALIDATION_SETS ...]
                            Sets to use for validation, default: [validation, test]
    -it {depth,rgb,depth_and_rgb}, --input_type {depth,rgb,depth_and_rgb}
                            Input type. One of ('depth', 'rgb', 'depth_and_rgb'), default: depth
    -iw INPUT_WIDTH, --input_width INPUT_WIDTH
                            Patch width to use, default: 96
    -ih INPUT_HEIGHT, --input_height INPUT_HEIGHT
                            Patch height to use, default: 96
    -ip {standardize,scale01,none}, --input_preprocessing {standardize,scale01,none}
                            Preprocessing to apply. One of [standardize, scale01, none], default: standardize
    -ot {regression,classification,biternion}, --output_type {regression,classification,biternion}
                            Output type. One of ('regression', 'classification', 'biternion'), default: biternion)
    -lr LEARNING_RATE, --learning_rate LEARNING_RATE
                            (Base) learning rate, default: 0.01
    -lrd {poly}, --learning_rate_decay {poly}
                            Learning rate decay to use, default: poly
    -m MOMENTUM, --momentum MOMENTUM
                            Momentum to use, default: 0.9
    -ne N_EPOCHS, --n_epochs N_EPOCHS
                            Number of epochs to train, default: 800
    -es EARLY_STOPPING, --early_stopping EARLY_STOPPING
                            Number of epochs with no improvement after which training will be stopped, default: 100.To disable early stopping use -1.
    -b BATCH_SIZE, --batch_size BATCH_SIZE
                            Batch size to use, default: 128
    -vb VALIDATION_BATCH_SIZE, --validation_batch_size VALIDATION_BATCH_SIZE
                            Batch size to use for validation, default: 512
    -nc N_CLASSES, --n_classes N_CLASSES
                            Number of classes when output_type is classification, default: 8
    -k KAPPA, --kappa KAPPA
                            Kappa to use when output_type is biternion or regression, default: biternion: 1.0, regression: 0.5
    -opt {sgd,adam,rmsprop}, --optimizer {sgd,adam,rmsprop}
                            Optimizer to use, default: sgd
    -naug, --no_augmentation
                            Disable augmentation
    -rid RUN_ID, --run_id RUN_ID
                            Run ID (default: 0)
    -ma {0.35,0.5,0.75,1.0}, --mobilenet_v2_alpha {0.35,0.5,0.75,1.0}
                            Alpha value for MobileNet v2 (default: 1.0)
    -d DEVICES, --devices DEVICES
                            GPU device id(s) to train on. (default: 0)
    -v, --verbose         Enable verbose output
  4. Evaluate trained networks

    # this creates a json file containing a deeper analysis of the trained networks
    python eval.py \
        biternion \
        --dataset_basepath /path/to/nicr_rgb_d_orientation_data_set \
        --set test \
        --output_path /path/where/to/store/evaluation/output/files \
        --training_basepath /path/where/to/store/training/output/files

    For further details and parameters, see:

    python eval.py --help
    usage: eval.py [-h] [-o OUTPUT_PATH] [-tb TRAINING_BASEPATH]
                [-s {validation,test}] [-ss {training,validation,test}]
                [-db DATASET_BASEPATH] [-ds {small,large}] [-v]
                {regression,classification,biternion}
    
    Evaluate trained neural networks for orientation estimation
    
    positional arguments:
    {regression,classification,biternion}
                            Output type. One of ('regression', 'classification', 'biternion') (default: biternion)
    
    optional arguments:
    -h, --help            show this help message and exit
    -o OUTPUT_PATH, --output_path OUTPUT_PATH
                            Path where to store created output files, default: '../eval_outputs/' relative to the location of this script
    -tb TRAINING_BASEPATH, --training_basepath TRAINING_BASEPATH
                            Path to training outputs (default: '/results/rotator')
    -s {validation,test}, --set {validation,test}
                            Set to use for evaluation, default: test
    -ss {training,validation,test}, --selection_set {training,validation,test}
                            Set to use for deriving the best epoch, default: validation
    -db DATASET_BASEPATH, --dataset_basepath DATASET_BASEPATH
                            Path to downloaded dataset (default: '/datasets/rotator')
    -ds {small,large}, --dataset_size {small,large}
                            Dataset image size to use. One of :('small', 'large'), default: small
    -v, --verbose         Enable verbose output

License and Citations

The source code is published under BSD 3-Clause license, see license file for details.

If you use the source code or the network weights, please cite the following paper:

Lewandowski, B., Seichter, D., Wengefeld, T., Pfennig, L., Drumm, H., Gross, H.-M. Deep Orientation: Fast and Robust Upper Body Orientation Estimation for Mobile Robotic Applications. in: IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), Macau, pp. 441-448, IEEE 2019

@InProceedings{Lewandowski-IROS-2019,
  author    = {Lewandowski, Benjamin and Seichter, Daniel and Wengefeld, Tim and Pfennig, Lennard and Drumm, Helge and Gross, Horst-Michael},
  title     = {Deep Orientation: Fast and Robust Upper Body Orientation Estimation for Mobile Robotic Applications},
  booktitle = {IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), Macau},
  year      = {2019},
  pages     = {441--448},
  publisher = {IEEE},
}