Updates

[2021.09.17] Code for the Flying Guide Dog prototype and the Pedestrian and Vehicle Traffic Lights (PVTL) dataset have been released.

Flying Guide Dog

Official implementation of the paper "Flying Guide Dog: Walkable Path Discovery for the Visually Impaired Utilizing Drones and Transformer-based Semantic Segmentation".

Overview

config/: Config
- experiment/: Config yaml files for different experiments
- default.py: Default config
drone/: Drone initialization and control
models/: Deep Learning models
- segmentation/: Segmentation models
- traffic_light_classification/: Traffic light classification models
utils/: Helper functions and scripts

Drone

The drone used in this project is the DJI Tello.

Requirements

Python 3.7 or later with all requirements.txt dependencies installed, including torch>=1.7.

To install, run:

pip install -r requirements.txt
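
After installing, it can help to confirm that the interpreter and PyTorch meet the minimums stated above (Python 3.7, torch 1.7). The following is a minimal sketch, not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are made up for illustration.

```python
import re
import sys


def parse_version(text):
    """Extract the leading numeric components, e.g. '1.7.0+cu110' -> (1, 7, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    have, need = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so (1, 7) compares equal to (1, 7, 0).
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_environment(torch_version):
    """Check Python >= 3.7 and torch >= 1.7 (hypothetical helper)."""
    python_ok = sys.version_info[:2] >= (3, 7)
    torch_ok = meets_minimum(torch_version, "1.7")
    return python_ok and torch_ok
```

With PyTorch installed, this could be called as `check_environment(torch.__version__)` after `import torch`.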

SegFormer

Install mmcv-full

To use SegFormer, you need to install mmcv-full==1.2.7. For example, to install mmcv-full==1.2.7 with CUDA 11 and PyTorch 1.7.0, use the following command:
```
pip install mmcv-full==1.2.7 -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7.0/index.html
```
To install mmcv-full with a different version of PyTorch and CUDA, please see MMCV Installation.
Use submodule SegFormer
- Initialize the submodule(s):
```
git submodule init
```
- Run the update to pull down the files:
```
git submodule update
```

Install the dependencies of SegFormer:

pip install -e models/segmentation/SegFormer/ --user

Copy config file to SegFormer/

cp models/segmentation/segformer.b0.768x768.mapillary.160k.py models/segmentation/SegFormer/local_configs/segformer/B0

Models

Two types of models are used: one for street-view semantic segmentation and one for traffic light classification.

Street view semantic segmentation

We adopt SegFormer-B0 (trained on Mapillary Vistas for 160K iterations) to run street-view semantic segmentation on each frame captured by the drone.

Traffic lights classification

We create a custom traffic light dataset named the Pedestrian and Vehicle Traffic Lights (PVTL) Dataset, using traffic light images cropped from Cityscapes, Mapillary Vistas, and PedestrianLights. The PVTL dataset can be downloaded from Google Drive.

It contains 5 classes: Others, Pedestrian-red, Pedestrian-green, Vehicle-red, and Vehicle-green. Each class contains about 300 images. The train-validation split is 3:1.
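
To make the 3:1 split concrete, here is a minimal sketch of how such a split could be produced. It assumes a per-class (stratified) shuffle-and-cut, which the README does not confirm; the function names and the `images_by_class` mapping are hypothetical.

```python
import random

# Class names as listed above.
CLASSES = ["Others", "Pedestrian-red", "Pedestrian-green", "Vehicle-red", "Vehicle-green"]


def split_train_val(items, train_parts=3, val_parts=1, seed=0):
    """Shuffle items deterministically and cut them at the train:val ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_parts // (train_parts + val_parts)
    return shuffled[:n_train], shuffled[n_train:]


def split_dataset(images_by_class, seed=0):
    """Split each class separately so both subsets keep the class balance.

    Assumption: the original split may have been made differently.
    """
    train, val = [], []
    for label in CLASSES:
        class_train, class_val = split_train_val(images_by_class.get(label, []), seed=seed)
        train += [(path, label) for path in class_train]
        val += [(path, label) for path in class_val]
    return train, val
```

With 300 images in a class, this yields 225 training and 75 validation images for that class.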

We train 2 models on this dataset:

ResNet-18: We fine-tune ResNet-18 from torchvision.models. After 25 epochs of training, the accuracy reaches around 90%.
Simple CNN model: We build a custom simple CNN model (5 CONV + 3 FC). After 25 epochs of training, the accuracy reaches around 83%.

Trained weights

Create the weights folder and its subfolders segmentation and traffic_light_classification:
```
mkdir -p weights/segmentation weights/traffic_light_classification
```
Download the trained weights from Google Drive and put them into the corresponding folders.
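
On systems without `mkdir -p`, the same folder layout can be created from Python. The sketch below also reports which folders are still empty, as a quick check that the weights were placed. Both helper names are hypothetical and not part of the repository.

```python
from pathlib import Path

# Subfolder names taken from the mkdir command above.
SUBFOLDERS = ("segmentation", "traffic_light_classification")


def create_weight_dirs(root="weights"):
    """Create root/<subfolder> for each subfolder, like `mkdir -p`."""
    created = []
    for name in SUBFOLDERS:
        folder = Path(root) / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


def empty_weight_dirs(root="weights"):
    """Return the subfolder names that are missing or contain no files yet."""
    empty = []
    for name in SUBFOLDERS:
        folder = Path(root) / name
        if not folder.is_dir() or not any(folder.iterdir()):
            empty.append(name)
    return empty
```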

Usage

Choose a config file in config/experiment/, e.g. config/experiment/segformer-b0_720x480_fp32.yaml. You can also create your custom config file by adjusting the default config.

Run

python main.py --cfg <config_file>

The model specified in the config file will be loaded.

For example:

python main.py --cfg config/experiment/segformer-b0_720x480_fp32.yaml

Turn on the DJI Tello and connect to the drone's Wi-Fi.
Run
```
python main.py --cfg <config_file> --ready
```
For example:
```
python main.py --cfg config/experiment/segformer-b0_720x480_fp32.yaml --ready
```
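
For readers unfamiliar with this command-line pattern, the two flags could be parsed roughly as follows. This is an illustrative sketch only: the actual `main.py` may be written differently, and it is an assumption that `--ready` is a plain on/off switch and that omitting it means no drone connection is made.

```python
import argparse


def build_parser():
    """Build a parser for the two flags shown above (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(description="Flying Guide Dog launcher (illustrative sketch)")
    parser.add_argument("--cfg", required=True, help="path to an experiment YAML config")
    # Assumption: --ready is a boolean switch that enables the drone connection.
    parser.add_argument("--ready", action="store_true", help="connect to the drone and enable flight")
    return parser


def describe_run(argv):
    """Summarise what a given command line would request."""
    args = build_parser().parse_args(argv)
    mode = "drone" if args.ready else "model-only"
    return f"{mode}: {args.cfg}"
```
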
After a few seconds of initialization, an OpenCV window will pop up. Press T to take off. While flying, the drone continuously discovers walkable areas, tries to keep itself centered on them, and follows the walkable path. When a pedestrian traffic light appears in the drone's FOV, the drone reacts based on the predicted state of the pedestrian traffic light signal.
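
As a rough illustration of what keeping the drone in the middle of the walkable area could mean, the sketch below derives a steering hint from a binary walkable mask. This is not the control logic used in the paper or the repository. The centroid rule, the `dead_zone` threshold, and the command names are all assumptions made for this example.

```python
def lateral_offset(mask):
    """Return the walkable area's horizontal offset from the image centre.

    mask is a list of rows of 0/1 values (1 = walkable). The result lies in
    [-1, 1], negative meaning the walkable area is to the left. Returns None
    if no pixel is walkable.
    """
    width = len(mask[0])
    total = count = 0
    for row in mask:
        for col, walkable in enumerate(row):
            if walkable:
                total += col
                count += 1
    if count == 0:
        return None
    centroid = total / count
    centre = (width - 1) / 2
    return (centroid - centre) / centre if centre else 0.0


def steering_command(mask, dead_zone=0.1):
    """Turn the offset into a coarse command (hypothetical names)."""
    offset = lateral_offset(mask)
    if offset is None:
        return "hover"  # nothing walkable in view
    if offset > dead_zone:
        return "right"
    if offset < -dead_zone:
        return "left"
    return "forward"
```

In practice the mask would come from the segmentation output for the walkable classes, and the command would be converted into velocity set-points for the drone.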

Other keyboard controls:
- L: Land temporarily. You can press T to take off again.
- Q: Land and exit.
- Esc: Emergency stop. All motors will stop immediately.
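
The key bindings above can be pictured as a small dispatch table. The sketch below is an assumption about how such a mapping might look, not the repository's code. It takes a key code in the form returned by OpenCV's `cv2.waitKey` and returns a made-up action name.

```python
ESC_KEY = 27  # ASCII code for Esc

# Action names are hypothetical labels for the controls listed above.
KEY_ACTIONS = {
    ord("t"): "takeoff",
    ord("l"): "land",
    ord("q"): "land_and_exit",
    ESC_KEY: "emergency_stop",
}


def action_for_key(key_code):
    """Map a key code (as returned by cv2.waitKey) to an action name, or None."""
    if key_code < 0:
        return None  # cv2.waitKey returns -1 when no key was pressed
    # Keep the low byte and fold upper-case letters to lower case.
    key = ord(chr(key_code & 0xFF).lower())
    return KEY_ACTIONS.get(key)
```

With DJITelloPy, these actions would presumably map to the `Tello` methods `takeoff()`, `land()`, and `emergency()`.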

Demo Video

Citation

@article{tan2021flying,
  title={Flying Guide Dog: Walkable Path Discovery for the Visually Impaired Utilizing Drones and Transformer-based Semantic Segmentation},
  author={Tan, Haobin and Chen, Chang and Luo, Xinyu and Zhang, Jiaming and Seibold, Constantin and Yang, Kailun and Stiefelhagen, Rainer},
  journal={arXiv preprint arXiv:2108.07007},
  year={2021}
}

Acknowledgements

Many thanks to these open-source repositories:

DJI Tello drone python interface: DJITelloPy
Semantic segmentation: fastseg, DS-PASS, SegFormer

EckoTan0804/flying-guide-dog