gesture_detector

Hand gesture detector for multiple objects with SVM and HOG


About

The project provides a complete solution for capturing, retraining, and classifying images with SVM and HOG features. Currently the trained file can detect and classify multiple hand gestures, shown in the pictures below. The last picture shows a scenario where detections overlap; in this case classification is not possible if, as here, no similar scenario was trained beforehand.

(Example detections of the trained hand gestures; the last picture shows overlapping detections.)

The detection and classification flow starts with low-level filtering (a skin color detector), followed by a contour selector. The role of the contour selector is to identify the regions of interest (ROI) where the object to be classified can be found. Image patches like the ones below are collected and stored in directories to build a database. Images are normalized before saving to obtain better detection performance.

(Example normalized image patches collected for the training database.)
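
As an illustration, the front end of such a pipeline could look like the sketch below. The function name, HSV bounds, and area threshold are assumptions for illustration, not the project's actual values:

import cv2
import numpy as np

def extract_roi_patches(frame, sample_size=96):
    """Skin-color filter followed by a contour selector (illustrative sketch)."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Hypothetical HSV bounds for skin; the real project calibrates these values
    mask = cv2.inRange(hsv, np.array([0, 30, 60]), np.array([20, 150, 255]))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    patches = []
    for cnt in contours:
        if cv2.contourArea(cnt) < 1000:      # skip small blobs
            continue
        x, y, w, h = cv2.boundingRect(cnt)   # region of interest (ROI)
        patch = frame[y:y + h, x:x + w]
        # normalize the patch size before saving, as described above
        patches.append(cv2.resize(patch, (sample_size, sample_size)))
    return patches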

After creating a sufficiently large database (~100 pictures per class), the SVM training can start. It takes only a few seconds, and an XML file will be saved which holds everything needed for further detections.
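
In OpenCV terms, this training step amounts to roughly the following sketch. The HOG block/cell parameters and the linear kernel are assumptions; only the 96x96 window follows the default sample size:

import cv2
import numpy as np

# HOG over the normalized patches; the window size matches the 96x96 sample size
hog = cv2.HOGDescriptor((96, 96), (16, 16), (8, 8), (8, 8), 9)

def train_svm(patches, labels, out_file="./data.xml"):
    """Compute a HOG feature per patch and train an OpenCV SVM, saved as XML."""
    features = np.array([hog.compute(p).flatten() for p in patches],
                        dtype=np.float32)
    svm = cv2.ml.SVM_create()
    svm.setType(cv2.ml.SVM_C_SVC)
    svm.setKernel(cv2.ml.SVM_LINEAR)
    svm.train(features, cv2.ml.ROW_SAMPLE, np.array(labels, dtype=np.int32))
    svm.save(out_file)  # the XML holds everything needed for further detections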

Usage

From the command line:

usage: gesture_detector.py [-h] -l labelfile [-t trainedfile] [-cmd commands]
                           [-d detectionswindow] [-c cameraid] [-s samplesize]
                           [-dir traindirectory] [-dd detectiondir]
                           [-ani annotatedimages] [-sf saveframe]
                           [-sp savepatches] [-cal calibration]

optional arguments:
  -h, --help            show this help message and exit
  -l labelfile          label file, like ./label.txt
  -t trainedfile        file which holds the trained values, like ./data.xml
  -cmd commands         commands available for processing: [retrain,
                        savedetections, creatlabelfile, run], default=run
  -d detectionswindow   show/hide detection window [0,1], default=0
  -c cameraid           use different camera ID, default=0
  -s samplesize         select the training sample size, default=96
  -dir traindirectory   directory with ordered pictures per detection classes
  -dd detectiondir      directory to save the detections or image patches for
                        training set creation
  -ani annotatedimages  save annotated images, [0,1] default=0
  -sf saveframe         save original camera frames, [0,1] default=0
  -sp savepatches       save image detections (patches), [0,1] default=0
  -cal calibration      recalibrate skin color detection, [0,1] default=0

Retrain

1. Create a directory structure to store the files
2. Create labels file
3. Capture and copy files under the desired folders
4. Instruct software for retraining

1. Create a directory structure

It is recommended to start from the root of the Python file and create a directory like "data", then create under this folder directories corresponding to the classes needed for the classification part, similar to the structure below.

├── data
│   ├── 0
│   ├── 1
│   ├── 2
│   ├── _2
│   ├── 3
│   ├── 4
│   ├── 5
│   ├── face
│   ├── halt
│   ├── ok
│   ├── tmp
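
If useful, the structure can be created with a few lines of Python (class names taken from the tree above):

import os

classes = ["0", "1", "2", "_2", "3", "4", "5", "face", "halt", "ok", "tmp"]
for c in classes:
    # one sub-directory per classification class, plus "tmp" for temporary images
    os.makedirs(os.path.join("./data", c), exist_ok=True)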

2. Create labels file

The tool itself provides the means for saving image regions, which fits these needs. First we need the label file; then the rest is easy.

python3 ./gesture_detector.py -l ./labels.txt -cmd creatlabelfile -dir ./data

Now the label file has been created; its content is something like this:

#1,_2,/home/test/data/_2
2,5,/home/test/data/5
3,face,/home/test/data/face
4,2,/home/test/data/2
5,ok,/home/test/data/ok
#6,tmp,/home/test/data/tmp
7,4,/home/test/data/4
8,0,/home/test/data/0
9,1,/home/test/data/1
10,halt,/home/test/data/halt
11,3,/home/test/data/3

Unneeded directories like "tmp" (used to store temporary images) can be commented out; the label loader will skip them. The format is simple: the first element is the UID (unique identifier), the second is the label name (edit it to whatever should be shown on detections), and the last is the directory for the specific label.
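
A loader for this format could look like the following sketch (the project's actual parser may differ in details):

def load_labels(label_file):
    """Parse 'UID,label,directory' lines; lines starting with '#' are skipped."""
    labels = {}
    with open(label_file) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # commented-out entries, like tmp, are ignored
            uid, name, directory = line.split(",")
            labels[int(uid)] = (name, directory)
    return labels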

3. Capture and copy files under the desired folders

Use the following command to save images into a temp directory:

python3 ./gesture_detector.py -l ./labels.txt -dd ./data/tmp -sf 1 -sp 1

This will capture regions for detection together with the frames on which the regions were detected. If only patches are needed, use just the -sp 1 option. Now comes the hard work of selecting and copying the specific image patches into the desired directories.

4. Instruct software for retraining

Use the following command to retrain:

python3 ./gesture_detector.py -l ./labels.txt -t ./data.xml -cmd retrain

When everything has gone well with the retraining, the software starts automatically. The data.xml file is saved automatically, so for the next run only the following is needed:

python3 ./gesture_detector.py -l ./labels.txt -t ./data.xml
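
For reference, the saved XML can be restored with OpenCV's SVM loader; a minimal sketch of re-using it on a normalized patch (the HOG parameters are the same assumptions as in the training sketch above):

import cv2
import numpy as np

# Same (assumed) HOG parameters as used for training on the 96x96 patches
hog = cv2.HOGDescriptor((96, 96), (16, 16), (8, 8), (8, 8), 9)
svm = cv2.ml.SVM_load("./data.xml")  # restore everything saved by the retrain step

def classify_patch(patch):
    """Predict the class UID of one normalized 96x96 patch."""
    feature = hog.compute(patch).flatten().astype(np.float32)
    _, result = svm.predict(feature.reshape(1, -1))
    return int(result[0, 0])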

Tweaking the object detector

Detection performance is mainly driven by the color filter. Different skin tones, or even a change of the tracked object, can easily be set in the following code section:

# pre-processing
objPP = clPreProcessing(img0, False, 155, 30, 100)

There is a simple tool for obtaining the desired color; try it out.
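
If that tool is not at hand, a minimal click-to-read HSV picker can be put together in a few lines (window name and quit key are arbitrary choices, not part of the project):

import cv2

def pick_hsv(event, x, y, flags, frame_container):
    # Print the HSV value of the clicked pixel
    if event == cv2.EVENT_LBUTTONDOWN:
        hsv = cv2.cvtColor(frame_container[0], cv2.COLOR_BGR2HSV)
        print("HSV at ({}, {}): {}".format(x, y, hsv[y, x]))

cap = cv2.VideoCapture(0)
cv2.namedWindow("picker")
frame_container = [None]
cv2.setMouseCallback("picker", pick_hsv, frame_container)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame_container[0] = frame
    cv2.imshow("picker", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()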

(Re) Calibrate

In some cases, the skin color detector does not provide consistent detection. Recalibration could be the solution; the values are stored in the labels file. To start the calibration process, invoke the Python file with the following arguments:

python3 ./gesture_detector.py -l ./labels.txt -t data.xml -cal 1

A small calibration window will appear; the object with the desired color to be detected should be placed so that it covers the full window. The HSV calibration values are displayed in real time; once the values have stabilized, press the "c" key on the keyboard.

(Calibration window screenshot.)
If all went well, the new values are stored in the labels file, and the application will close automatically.
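
Conceptually the calibration does something like the following sketch; the window size, key handling, and averaging are assumptions about the implementation, not the project's exact code:

import cv2

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    h, w = frame.shape[:2]
    # Hypothetical central calibration window; the real tool's size may differ
    roi = frame[h // 2 - 50:h // 2 + 50, w // 2 - 50:w // 2 + 50]
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    mean_hsv = hsv.reshape(-1, 3).mean(axis=0).astype(int)
    cv2.rectangle(frame, (w // 2 - 50, h // 2 - 50),
                  (w // 2 + 50, h // 2 + 50), (0, 255, 0), 2)
    cv2.putText(frame, "HSV: {}".format(mean_hsv), (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
    cv2.imshow("calibration", frame)
    if cv2.waitKey(1) & 0xFF == ord("c"):
        print("calibrated HSV:", mean_hsv)  # the tool writes this to labels.txt
        break
cap.release()
cv2.destroyAllWindows()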

Future work

Implement a validation function and detection accuracy metrics for fine-tuning.

Resources

Python OpenCV tutorial.

/Enjoy.