This project is based on the PyTorch YOLOv3 software developed by Ultralytics LLC, which is licensed under GPL-3.0.
I added instructions for training hand tracking and some patches for the EgoHands dataset.
For more information, please visit https://www.ultralytics.com and https://github.com/ultralytics/yolov3.
$ git clone https://github.com/speardutch/yolov3_handtrack.git && cd yolov3_handtrack
$ conda create --name yolov3_handtrack python=3.7
$ conda activate yolov3_handtrack
(yolov3_handtrack)$ conda install numpy opencv matplotlib tqdm
(yolov3_handtrack)$ conda install pytorch torchvision -c pytorch
(yolov3_handtrack)$ pip install opencv-python
From now on, we assume we are working inside the (yolov3_handtrack) virtual environment.
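Optionally, a quick sanity check from the new environment that PyTorch, OpenCV, and the GPU are visible (torch.cuda.is_available() should print True if CUDA is set up):
$ python
>>> import torch, cv2
>>> torch.__version__, cv2.__version__
>>> torch.cuda.is_available()
>>> exit()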
Download pre-trained weights and place them in weights/
(I downloaded the YOLOv3-SPP weights and darknet53.conv.74.)
- PyTorch *.pt format: https://drive.google.com/drive/folders/1uxgUBemJVw9wZsdpboYbzUN4bcRhsuAI
Test detection with your webcam:
$ python detect.py --cfg cfg/yolov3-spp.cfg --weights weights/yolov3-spp.weights --webcam
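If no window appears, a minimal check (assuming your camera is device 0) that OpenCV can reach the webcam at all:
$ python
>>> import cv2
>>> cap = cv2.VideoCapture(0)   # device 0; try 1, 2, ... for other cameras
>>> cap.isOpened()              # should be True
>>> ok, frame = cap.read()
>>> ok                          # should be True; frame is a numpy image array
>>> cap.release()
>>> exit()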
$ git clone https://github.com/victordibia/handtracking.git && cd handtracking
$ conda install scipy
# Download the EgoHands dataset, split it into train/test, convert the .m label files to CSV, and copy everything to the images/ folder
$ python egohands_dataset_clean.py
After the conversion (which displays a visualization) finishes, check that test_labels.csv and train_labels.csv exist in the images/test and images/train folders alongside the jpg files.
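You can peek at the CSV with a few lines of Python; the column names noted in the comment (filename, width, height, class, xmin, ymin, xmax, ymax) are my assumption based on what egohands_dataset_clean.py typically writes:

import csv

# Print the first three rows of the converted label file.
# Assumed columns: filename, width, height, class, xmin, ymin, xmax, ymax
with open("images/test/test_labels.csv") as f:
    for i, row in enumerate(csv.DictReader(f)):
        print(row)
        if i == 2:
            break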
Copy the images/ folder to your EgoHands dataset folder:
$ mkdir [EgoHands dataset] && cp -rf [handtracking project folder]/images/* [EgoHands dataset]/
$ cd yolov3_handtrack
$ python js_create_labels_from_csv.py --csv [EgoHands dataset]/test/test_labels.csv
399 files label created
$ python js_create_labels_from_csv.py --csv [EgoHands dataset]/train/train_labels.csv
4383 files label created
Note: there are 400 test and 4400 train jpg files, but some images have no labels, so fewer label files are created.
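Each created .txt label holds one "class x_center y_center width height" line per box, normalized to [0, 1], which is the format the ultralytics trainer expects. A minimal sketch of the pixel-to-YOLO conversion that js_create_labels_from_csv.py presumably performs:

def csv_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h, class_id=0):
    # YOLO labels are "class x_center y_center width height", normalized to [0, 1]
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    box_w = (xmax - xmin) / img_w
    box_h = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}"

# Example: a hand box at pixels (600, 300)-(700, 450) in a 1280x720 frame
print(csv_box_to_yolo(600, 300, 700, 450, 1280, 720))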
Visualize the dataset to check that the labels were created correctly (press ESC to quit):
$ python js_visualize_dataset.py --data [EgoHands dataset]/test
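js_visualize_dataset.py does this for you; for reference, a minimal sketch of the same check, assuming each .txt label sits next to its jpg as created above (file names below are placeholders):

import cv2

# Draw the YOLO labels for one test image (paths are placeholders; adjust them).
img = cv2.imread("[EgoHands dataset]/test/some_frame.jpg")
h, w = img.shape[:2]
with open("[EgoHands dataset]/test/some_frame.txt") as f:
    for line in f:
        cls, xc, yc, bw, bh = map(float, line.split())
        # Denormalize from YOLO format back to pixel corners
        x1, y1 = int((xc - bw / 2) * w), int((yc - bh / 2) * h)
        x2, y2 = int((xc + bw / 2) * w), int((yc + bh / 2) * h)
        cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.imshow("labels", img)
cv2.waitKey(0)  # press any key to close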
Go to [EgoHands dataset]
$ cd [EgoHands dataset]
Create train.txt and test.txt, lists of the train and test jpg files:
$ find test/*.jpg -type f | xargs realpath > test.txt
$ find train/*.jpg -type f | xargs realpath > train.txt
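If find or realpath is unavailable (e.g. on Windows), an equivalent Python sketch run from the [EgoHands dataset] folder:

import glob, os

# Write the absolute path of every jpg in train/ and test/ to train.txt / test.txt
for split in ("train", "test"):
    files = sorted(glob.glob(os.path.join(split, "*.jpg")))
    with open(split + ".txt", "w") as f:
        f.writelines(os.path.abspath(p) + "\n" for p in files)
    print(split, len(files), "images listed")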
Create the class file; we have only one class, "hand":
$ echo "hand" > classes.txt
Go to project folder
$ cd yolov3_handtrack
Edit the data cfg file; write in the paths to your train.txt, test.txt, and classes.txt files:
$ vi cfg/egohands-dataset.cfg
classes=1                             <== do not change
train=[EgoHands dataset]/train.txt    <== change to your EgoHands dataset folder
valid=[EgoHands dataset]/test.txt     <== change to your EgoHands dataset folder
names=[EgoHands dataset]/classes.txt  <== change to your EgoHands dataset folder
...
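Before starting a long training run, it can be worth checking that every path in the data cfg exists; a small sketch (the key=value parsing here is a simplified stand-in for the repo's own parser):

import os

# Check every file referenced by the data cfg before training starts.
with open("cfg/egohands-dataset.cfg") as f:
    for line in f:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (s.strip() for s in line.split("=", 1))
        if key in ("train", "valid", "names"):
            print(key, value, "OK" if os.path.isfile(value) else "MISSING")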
Now, train!
$ python train.py --cfg cfg/yolov3-spp-egohands.cfg --data-cfg cfg/egohands-dataset.cfg --batch-size 8
Note: if you haven't downloaded darknet53.conv.74 into the weights/ folder, it will be downloaded automatically when training starts.
You can change hyper-parameters via command-line options. I trained with the options above on an RTX 2070 GPU; 273 epochs took 14 hours and the mAP was 0.962.
You can resume an interrupted training run with the --resume option:
$ python train.py --cfg cfg/yolov3-spp-egohands.cfg --data-cfg cfg/egohands-dataset.cfg --batch-size 8 --resume
You can plot training status with the commands below.
$ python
>>> from utils import utils
>>> utils.plot_results()
>>> exit()
$ eog results.png
Results are saved in the weights/ folder; best.pt contains the best weights.
$ python detect.py --cfg cfg/yolov3-spp-egohands.cfg --data-cfg cfg/egohands-dataset.cfg --weights weights/best.pt --webcam
I got about 15~18 ms inference time (roughly 50+ fps) with an RTX 2070 GPU.
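If you want to reproduce the timing, here is a hedged sketch of one way to measure average forward-pass latency; how you build model and the input tensor x depends on how you load the trained network, and the helper name is my own:

import time
import torch

def average_inference_ms(model, x, n_warmup=10, n_runs=100):
    # Hypothetical helper: average forward-pass time in milliseconds.
    with torch.no_grad():
        for _ in range(n_warmup):      # warm-up so CUDA startup cost is excluded
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()   # flush queued kernels before timing
        t0 = time.perf_counter()
        for _ in range(n_runs):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()
    return (time.perf_counter() - t0) / n_runs * 1000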
- PyTorch YOLOv3: https://www.ultralytics.com, https://github.com/ultralytics/yolov3
- EgoHands dataset download and label creation: https://github.com/victordibia/handtracking
- Hyper-parameter tuning for better results
- Use other datasets for more robust inference
- Please post issues if you find a better way!