
🚁 Robust proof-of-concept of a gesture-controlled drone: augmenting an ArduPilot flight controller with a Jetson Nano!


Vision-Based Gesture-Controlled Drone


This project leverages the computational power of the Jetson Nano to augment a drone with computer vision capabilities and enable gesture control. The deep learning model deployed here is part of a larger project, the Pose Classification Kit, which focuses on pose estimation and classification applications for new human-machine interfaces.

Demonstration & Processing pipeline description

Demonstration video

[Processing pipeline diagram]

Getting Started

Step 1 - Install Dependencies

  1. Install PyTorch and Torchvision - see PyTorch for Jetson Nano.
  2. Install TensorFlow - see Installing TensorFlow For Jetson Platform. Note that TensorFlow is already installed on JetPack.
  3. Install torch2trt
    git clone https://github.com/NVIDIA-AI-IOT/torch2trt
    cd torch2trt
    sudo python3 setup.py install --plugins
    
  4. Install other miscellaneous packages
    sudo pip3 install numpy Pillow pymavlink dronekit 
    
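To make sure everything is in place before moving on, a quick sanity check can be run from a Python shell. This is only a minimal sketch, not part of the project code:

    # Sanity check of the Step 1 dependencies
    import torch, torchvision, torch2trt     # PyTorch stack and the TensorRT converter
    import tensorflow as tf
    import numpy, PIL, pymavlink, dronekit   # miscellaneous packages

    print("PyTorch", torch.__version__, "- CUDA available:", torch.cuda.is_available())
    print("TensorFlow", tf.__version__)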

Step 2 - Install the gesture control pipeline

  1. Clone this repository on your Jetson Nano
    git clone https://github.com/ArthurFDLR/drone-gesture-control
    cd drone-gesture-control
    
  2. Download the TensorRT-Pose pre-trained model resnet18_baseline_att_224x224_A_epoch_249.pth and place it in the folder ./drone-gesture-control/models
  3. Run the installation procedure. This operation can take a while.
    sudo python3 install.py
    
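Once the installation procedure completes, the optimized model should load through torch2trt's TRTModule. The snippet below is only a sketch: the exact name of the file written by install.py into ./drone-gesture-control/models is an assumption to verify on your machine.

    # Sketch: load a torch2trt-optimized pose model (the output filename is an assumption)
    import torch
    from torch2trt import TRTModule

    OPTIMIZED_MODEL = "models/resnet18_baseline_att_224x224_A_epoch_249_trt.pth"

    model_trt = TRTModule()
    model_trt.load_state_dict(torch.load(OPTIMIZED_MODEL))

    # The resnet18_baseline_att model expects 224x224 RGB frames and returns
    # confidence maps and part-affinity fields used for pose estimation
    x = torch.zeros((1, 3, 224, 224)).cuda()
    with torch.no_grad():
        cmap, paf = model_trt(x)
    print(cmap.shape, paf.shape)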

Step 3 - Hardware setup

  1. Wire the UART pins D15 (RX) and D14 (TX) on the J41 expansion header of the Jetson Nano carrier board to a MAVLink-enabled serial port on your flight controller. See below a setup example using the Pixhawk 4 flight controller. The default baud rate is 57600.

[Setup example: Jetson Nano wired to a Pixhawk 4 flight controller]

  2. Disable the Serial Console trigger on the serial connection - see Jetson Nano – UART.

    sudo systemctl stop nvgetty
    sudo systemctl disable nvgetty
    sudo udevadm trigger
    
  3. Connect your camera to the CSI port 0. You might have to adapt the GStreamer pipeline for your camera - see gstream_pipeline in ./drone-gesture-control/__main__.py and the sketch below. The camera used for development is an Arducam IMX477 mini.
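For reference, a typical nvarguscamerasrc capture pipeline for a CSI camera on the Jetson Nano looks like the sketch below. The resolution and frame rate are placeholders to adapt to your sensor; the string actually used by the project is the gstream_pipeline variable mentioned above.

    # Sketch: typical CSI camera pipeline on the Jetson Nano (values are placeholders)
    import cv2

    gstream_pipeline = (
        "nvarguscamerasrc sensor-id=0 ! "
        "video/x-raw(memory:NVMM), width=1280, height=720, framerate=30/1 ! "
        "nvvidconv flip-method=0 ! "
        "video/x-raw, format=BGRx ! videoconvert ! "
        "video/x-raw, format=BGR ! appsink"
    )

    cap = cv2.VideoCapture(gstream_pipeline, cv2.CAP_GSTREAMER)
    ok, frame = cap.read()
    print("Camera capture OK:", ok)
    cap.release()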

Step 4 - Launch the system

python3 drone-gesture-control
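If the drone does not react to gestures once the system is running, a quick way to verify the serial link from Step 3 is to wait for a MAVLink heartbeat with pymavlink. This is only a sketch: /dev/ttyTHS1 is the usual device for the J41 UART on the Jetson Nano but should be confirmed on your image, and the gesture control system must not be using the port at the same time.

    # Sketch: check that the flight controller is reachable over the Step 3 UART
    from pymavlink import mavutil

    link = mavutil.mavlink_connection("/dev/ttyTHS1", baud=57600)  # device path is an assumption
    heartbeat = link.wait_heartbeat(timeout=10)                    # returns None if nothing is received
    if heartbeat is None:
        print("No heartbeat - check the wiring, the baud rate, and that nvgetty is disabled.")
    else:
        print("Heartbeat from system", link.target_system, "component", link.target_component)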

Usage

The gesture control system currently supports only basic, yet essential, commands:

  • T: Arm the drone if it is disarmed and landed; Disarm the drone if it is armed and landed.
  • Traffic_AllStop: Take-off at an altitude of 1.8m if the drone is armed and landed; Land if the drone is in flight.
  • Traffic_RightTurn: Move 4m to the right if the drone is armed.
  • Traffic_LeftTurn: Move 4m to the left if the drone is armed.
  • Traffic_BackFrontStop: Move 2m backward if the drone is armed.
  • Traffic_FrontStop: Move 2m forward if the drone is armed.
  • Yoga_UpwardSalute: Return to Launch (RTL).

Body classes

For safety purposes, the system only transmits commands to the flight controller when it is in GUIDED mode. For ease of use, we recommend binding a switch on your radio controller to select this mode.
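The current mode can be checked over the same link with DroneKit before gesturing. The snippet below is a sketch under assumptions (device path, and the exact calls issued by the project code):

    # Sketch: confirm the flight controller is in GUIDED mode before sending gestures
    from dronekit import connect

    vehicle = connect("/dev/ttyTHS1", baud=57600, wait_ready=True)  # device path is an assumption
    print("Mode:", vehicle.mode.name, "- Armed:", vehicle.armed)
    if vehicle.mode.name != "GUIDED":
        print("Switch to GUIDED (e.g., with the bound radio switch); commands are ignored otherwise.")
    # A gesture command would then translate into a call such as:
    # vehicle.simple_takeoff(1.8)  # roughly what Traffic_AllStop triggers (exact call is an assumption)
    vehicle.close()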

System performance

The classification model used in this project is the best-performing model of the Pose Classification Kit (PCK). It yields great results both in terms of accuracy and inference time. During flights, it is common for the onboard camera to capture only a person's upper body, so the system has to remain highly reliable on partial inputs. To ensure this property, the model is tested on two datasets: the original PCK dataset and the same samples with missing keypoints. The testing accuracies on these datasets reach 98.3% and 95.1%, respectively. As shown in the confusion matrices below (left: original dataset - right: partial inputs), even poses that are hardly distinguishable by a human looking only at the upper body are almost perfectly classified by the model. After optimization (see ./install.py), the whole processing pipeline - from image capture to drone control - consistently runs at 9.5 Hz to 13 Hz.

Confusion matrices
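To give an idea of what the partial-input test set represents, a missing keypoint can be thought of as a zeroed-out coordinate before classification. The sketch below is purely illustrative: the array shape, joint indices, and zero-filling convention are assumptions; the actual preprocessing is defined in the Pose Classification Kit.

    # Illustrative only: simulate an upper-body-only detection by masking keypoints
    import numpy as np

    keypoints = np.random.rand(18, 2)   # hypothetical (x, y) keypoints of a full body
    lower_body = np.arange(10, 18)      # hypothetical indices of the lower-body joints
    partial = keypoints.copy()
    partial[lower_body] = 0.0           # undetected keypoints become zeros

    print("Zeroed joints:", len(lower_body), "out of", len(keypoints))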

Additional resources

Citation

@inproceedings{9813802,
    author    = {Findelair, Arthur and Yu, Xinrui and Saniie, Jafar},
    booktitle = {2022 IEEE International Conference on Electro Information Technology (eIT)},
    title     = {Design Flow and Implementation of a Vision-Based Gesture-Controlled Drone},
    year      = {2022},
    pages     = {320-324},
    doi       = {10.1109/eIT53891.2022.9813802}
}

License

Distributed under the MIT License. See LICENSE for more information.

Acknowledgments

Many thanks to the ECASP Laboratory at the Illinois Institute of Technology, which provided all the hardware necessary to develop this project.