
A proof-of-concept (POC) project for the OpenCV Spatial AI Competition

Primary language: Python. License: MIT.

deepEye - The third eye for Visually Impaired People

OpenCV, the well-known open-source computer vision library, announced its first Spatial AI Competition, sponsored by Intel. Participants were asked to solve real-world problems using the OAK-D (OpenCV AI Kit with Depth) module. The OAK-D module has built-in stereo cameras along with an RGB camera. It also has a powerful vision processing unit (Intel's Myriad X) that runs deep neural network inference on board.

We submitted a project proposal for this competition in July, and it was one of the 32 proposals selected out of 235.

We propose to build an advanced assistive system that helps visually impaired people perceive their environment better and provides seamless, reliable navigation at low cost, so that anyone can benefit from computer vision.

Demo Videos

👉 deepEye Demo


🎬 Software High Level Design

HLD

🗃 Project structure

.
├── android
│   ├── apk                                 # Android APK file
│   │   └── app-debug.apk
│   └── startup_linux
│       ├── deepeye.sh                      # deepEye startup script to enable RFCOMM
│       └── rfcomm.service                  # systemd service for RFCOMM
│
├── custom_model
│   └── OI_Dataset                          # MobileNet SSD V2 custom training on Open Images Dataset V4
│       ├── README.md
│       ├── requirements.txt
│       ├── scripts
│       │   ├── csv2tfrecord.py             # TensorFlow: CSV to TFRecord converter
│       │   ├── txt2xml.py                  # TensorFlow: TXT to XML converter
│       │   └── xml2csv.py                  # TensorFlow: XML to CSV converter
│       └── tf_test.py                      # Test script for trained model inference
│
├── deepeye_app                             # deepEye core application
│   ├── app.py                              # Object detection and post-processing
│   ├── calibration                         # Camera calibration
│   │   └── config
│   │       └── BW1098FFC.json
│   ├── collision_avoidance.py              # Collision calculation
│   ├── config.py
│   ├── models                              # MobileNet-SSD v2 trained model
│   │   ├── mobilenet-ssd.blob
│   │   └── mobilenet-ssd_depth.json
│   ├── tracker.py                          # Object tracker
│   └── txt2speech                          # Text-to-speech component
│       ├── README.md
│       ├── txt2speech.py
│       └── txt-simulator.py
├── images
├── openvino_analysis                       # Accuracy and FPS analysis of Intel and open-source CNN models
│   ├── intel
│   │   ├── object-detection
│   │   └── semantic-segmentation
│   ├── public
│   │   ├── ssd_mobilenet_v2_coco
│   │   └── yolo-v3
│   └── README.md
├── README.md                               # deepEye README
├── requirements.txt
└── scripts                                 # OpenVINO Toolkit scripts
    ├── inference_engine_native_myriad.sh
    ├── model_intel.sh
    └── rpi_openvino_install-2020_1.sh

💻 Hardware pre-requisite

📦 Software pre-requisite

For Jetson: Flash the Jetson board with JetPack 4.4 ⚡️

microSD card preparation:

  1. Download the Jetson Nano Developer Kit SD card image (JetPack 4.4).
  2. Use Etcher to flash the image to the microSD card.

CUDA Env PATH :

if ! grep 'cuda/bin' ${HOME}/.bashrc > /dev/null ; then
  echo "** Add CUDA stuffs into ~/.bashrc"
  echo >> ${HOME}/.bashrc
  echo "export PATH=/usr/local/cuda/bin:\${PATH}" >> ${HOME}/.bashrc
  echo "export LD_LIBRARY_PATH=/usr/local/cuda/lib64:\${LD_LIBRARY_PATH}" >> ${HOME}/.bashrc
fi
source ${HOME}/.bashrc

System dependencies :

sudo apt-get update
sudo apt-get install -y build-essential make cmake cmake-curses-gui
sudo apt-get install -y git g++ pkg-config curl libfreetype6-dev
sudo apt-get install -y libcanberra-gtk-module libcanberra-gtk3-module
sudo apt-get install -y python3-dev python3-testresources python3-pip
sudo pip3 install -U pip

Performance Improvements:

To set Jetson Nano to 10W performance mode (reference), execute the following from a terminal:

sudo nvpmodel -m 0
sudo jetson_clocks

Enable swap:

sudo fallocate -l 8G /mnt/8GB.swap
sudo mkswap /mnt/8GB.swap
sudo swapon /mnt/8GB.swap
if ! grep swap /etc/fstab > /dev/null; then \
    echo "/mnt/8GB.swap  none  swap  sw  0  0" | sudo tee -a /etc/fstab; \
fi

Jetson performance analysis:

pip3 install jetson-stats

Recompile the Jetson Linux kernel with RFCOMM TTY support:

We use the RFCOMM serial protocol for Jetson-Android communication, and the default kernel does not support RFCOMM TTY. So we have to recompile the kernel with a new config and install it.

# Basic Update
sudo apt-get update
sudo apt-get install -y libncurses5-dev

# Download the Linux L4T (BSP) source code from the NVIDIA Download Center
wget https://developer.nvidia.com/embedded/L4T/r32_Release_v4.3/Sources/T210/public_sources.tbz2

tar -xvf public_sources.tbz2

cp Linux_for_Tegra/source/public/kernel_src.tbz2 ~/

pushd ~/

tar -xvf kernel_src.tbz2

pushd ~/kernel/kernel-4.9

zcat /proc/config.gz > .config

# Enable RFCOMM TTY
make menuconfig # Networking Support --> Bluetooth subsystem support ---> Select RFCOMM TTY Support ---> Save ---> Exit

make prepare

make modules_prepare

# Compile kernel as an image file
make -j5 Image

# Compile all kernel modules
make -j5 modules

# Install modules and kernel image
cd ~/kernel/kernel-4.9
sudo make modules_install
sudo cp arch/arm64/boot/Image /boot/Image

# Reboot 
sudo reboot
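
After the reboot, it can be useful to confirm that the running kernel really has the option enabled. The following is a minimal Python sketch, not part of the deepEye repository. It assumes the kernel exposes its configuration at /proc/config.gz (the zcat step above relies on the same file) and that the menuconfig entry corresponds to the CONFIG_BT_RFCOMM_TTY symbol. The function names are hypothetical.

```python
import gzip


def rfcomm_tty_enabled(config_text: str) -> bool:
    """Return True if the kernel config text enables CONFIG_BT_RFCOMM_TTY."""
    for line in config_text.splitlines():
        # Enabled options look like "CONFIG_X=y"; disabled ones are
        # written as "# CONFIG_X is not set" or are simply absent.
        if line.strip() == "CONFIG_BT_RFCOMM_TTY=y":
            return True
    return False


def read_running_kernel_config(path: str = "/proc/config.gz") -> str:
    """Read the gzip-compressed config of the running kernel (assumed path)."""
    with gzip.open(path, "rt") as handle:
        return handle.read()
```

On the Jetson, `rfcomm_tty_enabled(read_running_kernel_config())` should return True once the rebuilt kernel image is booted.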

DepthAI Python Interface Install

# Install dep
curl -fL http://docs.luxonis.com/install_dependencies.sh | bash
sudo apt install libusb-1.0-0-dev

# USB Udev 
echo 'SUBSYSTEM=="usb", ATTRS{idVendor}=="03e7", MODE="0666"' | sudo tee /etc/udev/rules.d/80-movidius.rules
sudo udevadm control --reload-rules && sudo udevadm trigger

git clone https://github.com/luxonis/depthai-python.git
cd depthai-python
git submodule update --init --recursive
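mkdir -p ~/depthai_v1
python3 -m venv ~/depthai_v1
# Activate the virtual environment so the following installs go into it
source ~/depthai_v1/bin/activate
python3 -m pip install -U pip
python3 setup.py develop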

# Check the Installation
python3 -c "import depthai"

# Install opencv
cd scripts
bash opencv.sh
cd ..

Camera Calibration

# Run from the deepeye_app directory, which contains calibration/config
mkdir -p ~/depthai/ && pushd ~/depthai/
git clone https://github.com/luxonis/depthai.git
popd
cp calibration/config/BW1098FFC.json ~/depthai/depthai/resources/boards/
pushd ~/depthai/depthai/
python3 calibrate.py -s 2 -brd BW1098FFC -ih

Robot Operating System (ROS)

We use the ROS framework for inter-process communication.

sudo sh -c 'echo "deb http://packages.ros.org/ros/ubuntu $(lsb_release -sc) main" > /etc/apt/sources.list.d/ros-latest.list'
sudo apt-key adv --keyserver 'hkp://keyserver.ubuntu.com:80' --recv-key C1CF6E31E6BADE8868B172B4F42ED6FBAB17C654

sudo apt update

sudo apt install -y ros-melodic-ros-base

# Env setup
echo "source /opt/ros/melodic/setup.bash" >> ~/.bashrc
source ~/.bashrc

# Dep to build ROS Package
sudo apt install python-rosdep python-rosinstall python-rosinstall-generator python-wstool build-essential

# Install inside virtual env
sudo apt install python-rosdep
sudo rosdep init

rosdep update
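
As an illustration of how two processes can exchange data over ROS, here is a minimal publisher sketch. It is not the project's actual code: the node name, the topic /deepeye/alerts, and the message wording are hypothetical. It also assumes rospy and std_msgs are importable from the Python interpreter in use (ROS Melodic targets Python 2 by default, so Python 3 may need extra setup).

```python
def format_alert(label, distance_m):
    """Build the text for one detection (hypothetical message format)."""
    return "{} at {:.1f} meters".format(label, distance_m)


def publish_alerts(detections, topic="/deepeye/alerts"):
    """Publish one std_msgs/String per (label, distance_m) pair.

    The ROS imports are kept inside the function so that the module
    can still be imported on a machine without ROS installed.
    """
    import rospy
    from std_msgs.msg import String

    rospy.init_node("deepeye_detector", anonymous=True)
    publisher = rospy.Publisher(topic, String, queue_size=10)
    rate = rospy.Rate(1)  # one message per second
    for label, distance_m in detections:
        if rospy.is_shutdown():
            break
        publisher.publish(String(data=format_alert(label, distance_m)))
        rate.sleep()
```

A subscriber in another process, for example a text-to-speech component, would call `rospy.Subscriber` with the same topic name and message type. Both processes need a running `roscore`.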

Android RFCOMM Setup

We need to configure the rfcomm service in order to use the Android application for the text-to-speech feature.

sudo cp android/startup_linux/deepeye.sh /usr/bin/
sudo chmod a+x /usr/bin/deepeye.sh

sudo cp android/startup_linux/rfcomm.service /etc/systemd/system/
sudo systemctl enable rfcomm
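
With RFCOMM TTY support, a bound channel typically appears as a serial device node. The sketch below shows one way a Python process could push a line of text to the phone through such a node. It rests on assumptions this README does not confirm: that the device node is /dev/rfcomm0 and that the Android app accepts newline-terminated UTF-8 text. The function name is hypothetical.

```python
def send_text(message, device="/dev/rfcomm0"):
    """Write one newline-terminated UTF-8 line to the RFCOMM device.

    Returns the number of bytes written. The device path and the
    line-based framing are assumptions, not the project's protocol.
    """
    payload = (message.strip() + "\n").encode("utf-8")
    # Unbuffered binary mode sends the bytes to the device immediately.
    with open(device, "wb", buffering=0) as port:
        return port.write(payload)
```

For example, `send_text("car at 2.5 meters")` would hand a single line to whatever is listening on the other end of the channel.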

Other Dependencies

python3 -m pip install -r requirements.txt

# SoX for text-to-speech
sudo apt-get install sox libsox-fmt-mp3

🖖 Quick Start

# Terminal one
# ROS Master
roscore &

# Terminal Two
# Deepeye core app
pushd deepeye_app
python3 app.py
popd

# Terminal three
# Txt2speech middleware component
pushd deepeye_app
python3 txt2speech/txt2speech.py
popd

🎛 Advanced uses

Custom Object Detector

We retrained a MobileNet SSD V2 model on the Open Images dataset. We selected and trained the object classes that help visually impaired people navigate outdoor environments.

We have added a README covering the end-to-end training and the OpenVINO conversion required before loading the model onto DepthAI.
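
For orientation, the annotation conversion step in such a training pipeline can be sketched as follows. This is an illustrative example, not the repository's xml2csv.py. It assumes Pascal VOC style XML annotations and a commonly used CSV column layout (filename, width, height, class, xmin, ymin, xmax, ymax). The function names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

COLUMNS = ["filename", "width", "height", "class",
           "xmin", "ymin", "xmax", "ymax"]


def voc_xml_to_rows(xml_text):
    """Return one row per <object> in a Pascal VOC style annotation."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append([
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ])
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()
```

A CSV in this shape is the usual input to a CSV-to-TFRecord step, which is the role csv2tfrecord.py plays in the project tree.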

🛠 Hardware Details

We plan to use the DepthAI USB3 Modular Cameras [BW1098FFC] for the POC. We are using a Raspberry Pi and a Jetson. The AI/vision processing runs on the DepthAI module, which is based on the Myriad X architecture.

depthAI

Key Features of the device:

  • 2 BG0250TG mono camera module interfaces
  • 1 BG0249 RGB camera module interface
  • 5V power input via barrel jack
  • USB 3.1 Gen 1 Type-C
  • Pads for DepthAI SoM 1.8V SPI
  • Pads for DepthAI SoM 3.3V SDIO
  • Pads for DepthAI SoM 1.8V Aux Signals (I2C, UART, GPIO)
  • 5V Fan/Aux header
  • Pads for DepthAI SoM aux signals
  • Design files produced with Altium Designer 20

💌 Acknowledgments

DepthAI Home Page
DepthAI core development
OpenVINO toolkit development
BW1098FFC_DepthAI_USB3 HW
OIDv4 ToolKit