We propose a pipeline to select the pre-shape of the Hannes prosthesis using visual input. We collected a real dataset and used it to train and test our models. The test data are organized into 5 test sets of increasing complexity, each representing a condition that does not appear in the real training set.
Our main contribution is a synthetic data generation pipeline designed for vision-based prosthetic grasping. We compare a model trained on real data with the same model trained on the proposed synthetic data. As shown in the table below, the synthetically-trained model achieves a comparable average video accuracy with a lower standard deviation, demonstrating the robustness of our method.
Our work has been accepted at IROS 2022.
| Test set | Real training, video acc. (%) | Synthetic training, video acc. (%) |
|---|---|---|
| Same person | 98.9 ± 0.8 | 80.2 ± 0.9 |
| Different velocity | 81.7 ± 0.9 | 79.7 ± 0.8 |
| From ground | 76.2 ± 1.0 | 76.0 ± 0.9 |
| Seated | 63.9 ± 1.0 | 68.1 ± 1.0 |
| Different background | 56.2 ± 1.7 | 76.4 ± 2.0 |
| Average over test sets | 75.4 ± 14.8 | 76.1 ± 4.3 |
This repository contains the PyTorch code to reproduce the results presented in our work.
The code is developed with Python 3.8, PyTorch 1.10.1 and CUDA 10.2.
Clone project and install dependencies:
# clone project
git clone https://github.com/hsp-iit/prosthetic-grasping-experiments
# create virtual environment and install dependencies
cd prosthetic-grasping-experiments
python3.8 -m venv pge-venv
source pge-venv/bin/activate
pip install -r requirements.txt
# the torch installation above may fail; if so, try the command below:
pip install torch==1.10.1+cu102 torchvision==0.11.2+cu102 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu102/torch_stable.html
# ..or go through the website: https://pytorch.org/get-started/previous-versions/
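As a quick sanity check (not part of the repository), you can verify from within the virtual environment that the expected versions are picked up:

```python
# Optional sanity check: print the installed PyTorch / torchvision versions
# and whether CUDA is visible. Expected values follow the setup above.
import torch
import torchvision

print(torch.__version__)          # expected: 1.10.1+cu102
print(torchvision.__version__)    # expected: 0.11.2+cu102
print(torch.cuda.is_available())  # True if the CUDA 10.2 setup is working
```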
We provide a script to automatically download the real and synthetic datasets (specify your preferred folder path with `--out_dataset_folder`; in the example below, the datasets are saved in a `datasets` folder located in the parent folder of this repository). The script also arranges the datasets according to the format required by our dataloaders.
python download_dataset.py --out_dataset_folder ../datasets --remove_zips
The `datasets` folder contains all the datasets: both the real one (i.e., `iHannesDataset`) and the synthetic one (i.e., `ycb_synthetic_dataset`). For each dataset, both the frames and the features pre-extracted with mobilenet_v2 (pre-trained on ImageNet) are available.
The `datasets` folder has the following macro-structure (i.e., path to the specific dataset folder):
datasets/
├── real/
│   ├── frames/
│   │   └── iHannesDataset/
│   └── features/
│       └── mobilenet_v2/
│           └── iHannesDataset/
└── synthetic/
    ├── frames/
    │   └── ycb_synthetic_dataset/
    └── features/
        └── mobilenet_v2/
            └── ycb_synthetic_dataset/
Each dataset (i.e., `iHannesDataset`, `ycb_synthetic_dataset`) has the following path to its frames/features:
DATASET_BASE_FOLDER/CATEGORY_NAME/OBJECT_NAME/PRESHAPE_NAME/Wrist_d435/rgb*
If you want to use our dataloaders, make sure that the above arrangement (both the macro-structure and the path to frames/features) is maintained.
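For illustration, here is a minimal sketch (not part of the repository) of how the video folders can be enumerated under this layout; the base path below is hypothetical and should be adapted to your setup:

```python
# Minimal sketch: walk the documented layout
# DATASET_BASE_FOLDER/CATEGORY_NAME/OBJECT_NAME/PRESHAPE_NAME/Wrist_d435/rgb*
from pathlib import Path

dataset_base = Path("../datasets/real/frames/iHannesDataset")  # hypothetical location
for rgb_dir in sorted(dataset_base.glob("*/*/*/Wrist_d435/rgb*")):
    # parts[-5:-2] correspond to CATEGORY_NAME / OBJECT_NAME / PRESHAPE_NAME
    category, obj, preshape = rgb_dir.parts[-5], rgb_dir.parts[-4], rgb_dir.parts[-3]
    print(category, obj, preshape, "->", rgb_dir)
```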
Create softlinks:
cd prosthetic-grasping-experiments/data
ln -s /YOUR_PATH_TO_DATASETS_FOLDER/real
ln -s /YOUR_PATH_TO_DATASETS_FOLDER/synthetic
and the resulting structure is:
prosthetic-grasping-experiments/
└── data/
    ├── real/
    │   └── ...
    └── synthetic/
        └── ...
Pre-extracted features are already provided with the datasets downloaded above. However, to extract features on your own, you can use:
cd prosthetic-grasping-experiments
python src/tools/cnn/extract_features.py \
--batch_size 1 --source Wrist_d435 \
--input rgb --model cnn --dataset_type SingleSourceImage \
--feature_extractor mobilenet_v2 --pretrain imagenet \
--dataset_name iHannesDataset
For each video, a `features.npy` file is generated. The file has shape `(num_frames_in_video, feature_vector_dim)` and is located according to the path defined above.
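As a quick check (a minimal sketch, not the repository's dataloader), a `features.npy` file can be inspected with NumPy; the path below is a placeholder for an actual video folder:

```python
# Load a pre-extracted features file and print its shape.
# Replace the placeholder with the path to an actual video folder.
import numpy as np

features = np.load("PATH_TO_A_VIDEO_FOLDER/features.npy")
print(features.shape)  # (num_frames_in_video, feature_vector_dim)
```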
All runnable files are located under the `src/tools` folder. At the beginning of each file you can find some example run commands with different arguments.
When the training starts, a folder is created under `prosthetic-grasping-experiments/runs` (you can specify the folder name with the `--log_dir` argument). This folder stores the metrics and the best model checkpoint.
Example 1: train the fully-connected classifier of mobilenet_v2 on the real dataset, starting from pre-extracted features:
cd prosthetic-grasping-experiments
python src/tools/cnn/train.py --epochs 5 \
--batch_size 32 --source Wrist_d435 --dataset_type SingleSourceImage \
--split random --input rgb --output preshape --model cnn \
--feature_extractor mobilenet_v2 --pretrain imagenet --freeze_all_conv_layers \
--from_features --dataset_name iHannesDataset \
--log_dir train_from_features
Example 2: same as above, but training on synthetic data (remember to add the `--synthetic` argument, otherwise a wrong path to the dataset is constructed):
cd prosthetic-grasping-experiments
python src/tools/cnn/train.py --epochs 5 \
--batch_size 64 --source Wrist_d435 --dataset_type SingleSourceImage \
--split random --input rgb --output preshape --model cnn \
--feature_extractor mobilenet_v2 --pretrain imagenet --freeze_all_conv_layers \
--from_features --dataset_name ycb_synthetic_dataset --synthetic
Example 3: train the LSTM on the real dataset, starting from pre-extracted features:
cd prosthetic-grasping-experiments
python src/tools/cnn_rnn/train.py --epochs 10 \
--batch_size 32 --source Wrist_d435 --dataset_type SingleSourceVideo \
--split random --input rgb --output preshape --model cnn_rnn --rnn_type lstm \
--rnn_hidden_size 256 --feature_extractor mobilenet_v2 --pretrain imagenet \
--freeze_all_conv_layers --from_features --dataset_name iHannesDataset
Example 4: fine-tune the whole network (i.e., use RGB frames instead of pre-extracted features) starting from the ImageNet weights:
cd prosthetic-grasping-experiments
python src/tools/cnn/train.py --epochs 10 \
--batch_size 64 --source Wrist_d435 --dataset_type SingleSourceImage \
--split random --input rgb --output preshape --model cnn \
--feature_extractor mobilenet_v2 --pretrain imagenet \
--lr 0.0001 --dataset_name ycb_synthetic_dataset --synthetic
To test a model, copy and paste the command used for its training, substituting the `train.py` script with `eval.py`. Moreover, you have to specify the path to the model checkpoint with the `--checkpoint` argument and the test set with the `--test_type` argument.
Example 1: test the model on the Same person test set:
cd prosthetic-grasping-experiments
python src/tools/cnn/eval.py --epochs 5 \
--batch_size 32 --source Wrist_d435 --dataset_type SingleSourceImage \
--split random --input rgb --output preshape --model cnn \
--feature_extractor mobilenet_v2 --pretrain imagenet --freeze_all_conv_layers \
--from_features --dataset_name iHannesDataset \
--log_dir train_from_features \
--checkpoint runs/train_from_features/best_model.pth --test_type test_same_person
Some confusion matrices will be displayed on screen; you can simply close them and visualize them later on TensorBoard. Many different metrics, at both per-frame and per-video granularity, are printed to the shell. In our work, results are reported as video accuracy (obtained from per-frame predictions through majority voting, excluding the background class). This value is printed to the shell as follows:
.
.
.
=== VIDEO METRICS ===
ACCURACY W BACKGR: xx.xx%
ACCURACY W/O BACKGR: xx.xx% <==
.
.
.
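For reference, the majority voting used to obtain the video-level prediction from per-frame predictions can be sketched as follows (a minimal illustration with hypothetical class indices, not the repository's evaluation code):

```python
# Minimal sketch: video-level prediction by majority voting over per-frame
# predictions, excluding the background class (background id 0 is an assumption).
from collections import Counter

def video_prediction(per_frame_preds, background_id=0):
    non_bg = [p for p in per_frame_preds if p != background_id]
    if not non_bg:
        return background_id  # every frame was predicted as background
    # The video label is the most frequent non-background per-frame prediction.
    return Counter(non_bg).most_common(1)[0][0]

print(video_prediction([0, 0, 2, 2, 3, 2]))  # -> 2
```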
You can visualize both the training and evaluation metrics on tensorboard with:
cd prosthetic-grasping-experiments
tensorboard --logdir runs/train_from_features
@inproceedings{vasile2022,
author = {F. Vasile and E. Maiettini and G. Pasquale and A. Florio and N. Boccardo and L. Natale},
booktitle = {2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
title = {Grasp Pre-shape Selection by Synthetic Training: Eye-in-hand Shared Control on the Hannes Prosthesis},
year = {2022},
month = {Oct},
}
This repository is maintained by @FedericoVasile1.
- For further details about our synthetic data generation pipeline, please refer to our paper (specifically SEC. IV) and feel free to contact me: federico.vasile@iit.it
- A demonstration video of our model trained on the synthetic data and tested on the Hannes prosthesis is available here
- A presentation video summarizing our work is available here