- This is part of my summer research project at Monash University, supervised by Dr Akansel Cosgun.
- The entire project was run on a single Nvidia GeForce GTX 1060 graphics card.
- A custom dataset was created at Monash University to train the multi-layer perceptron (MLP).
- A total of 25 videos were recorded in a lab setting, containing 2506 images in total.
- The paper was accepted to the AVHRC 2020 workshop.
- Link to arXiv: https://arxiv.org/abs/2007.09945.
- detectron2 (follow INSTALL.md to install detectron2; the remaining packages can be installed with pip, see the sketch after this list)
- pytorch
- tensorflow
- mtcnn
- yacs
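Apart from detectron2, which should be installed by following its INSTALL.md, the dependencies above are available on PyPI. A minimal install sketch, with versions left unpinned as an assumption (pick releases compatible with your CUDA and detectron2 build):

```bash
# install detectron2 first per its INSTALL.md, then the remaining packages:
pip install torch tensorflow mtcnn yacs
```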
Download the pretrained models for object detection, head pose estimation and the MLP, and place them in ./pretrained-weights.
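A sketch of the expected folder layout; the object-detection filename matches the commands below, while the head-pose and MLP filenames are placeholders:

```bash
mkdir -p pretrained-weights
# pretrained-weights/Apple_Faster_RCNN_R_101_FPN_3x.pth   # object detector (used by main.py below)
# pretrained-weights/<head_pose_weights>                  # head pose estimator (placeholder name)
# pretrained-weights/<mlp_weights>                        # gesture-classification MLP (placeholder name)
```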
The system diagram is shown below for illustration purposes.
python3 main.py \
--cfg-keypoint ./configs/keypoint_rcnn_R_101_FPN_3x.yaml \
--cfg-object ./configs/object_faster_rcnn_R_101_FPN_3x.yaml \
--obj-weights ./pretrained-weights/Apple_Faster_RCNN_R_101_FPN_3x.pth \
--video-input [VIDEO_INPUT] \
--output [OUTPUT] \
--out-json [JSON_FILE] \
--train
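With `--train` and `--out-json` set, this command saves the detection results for each video to a JSON file, which feeds the training steps below. For example, on one recorded lab video (all input and output filenames here are hypothetical):

```bash
python3 main.py \
  --cfg-keypoint ./configs/keypoint_rcnn_R_101_FPN_3x.yaml \
  --cfg-object ./configs/object_faster_rcnn_R_101_FPN_3x.yaml \
  --obj-weights ./pretrained-weights/Apple_Faster_RCNN_R_101_FPN_3x.pth \
  --video-input ./videos/handover_01.mp4 \
  --output ./output/handover_01.mp4 \
  --out-json ./json_output/handover_01.json \
  --train
```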
python3 utils/json_utils.py --json-path [JSON_FOLDER] --csv-path [classes.csv] \
--output-json post_processing.json
python3 training/train_MLP_localize.py --json-path [JSON_FILE] --weights-path [PATH_TO_WEIGHTS]
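Continuing the hypothetical example above, the per-video JSON files are first post-processed with the class list and then used to train the MLP (the classes.csv path and the MLP weight filename/location are assumptions):

```bash
python3 utils/json_utils.py --json-path ./json_output --csv-path ./classes.csv \
    --output-json post_processing.json
python3 training/train_MLP_localize.py --json-path post_processing.json \
    --weights-path ./pretrained-weights/mlp_gesture.h5
```

The command below runs the full pipeline on a new video without the `--train` flag, writing the visualised result to `--output`.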
python3 main.py \
--cfg-keypoint ./configs/keypoint_rcnn_R_101_FPN_3x.yaml \
--cfg-object ./configs/object_faster_rcnn_R_101_FPN_3x.yaml \
--obj-weights ./pretrained-weights/Apple_Faster_RCNN_R_101_FPN_3x.pth \
--video-input [VIDEO_INPUT] \
--output [OUTPUT]
@article{kwan2020handover,
  title={Gesture Recognition for Initiating Human-to-Robot Handovers},
  author={Kwan, Jun and Tan, Chinkye and Cosgun, Akansel},
  journal={arXiv preprint arXiv:2007.09945},
  year={2020}
}