NVIDIA recently wrote up instructions for training models directly on the Jetson Nano rather than in DIGITS on x86. Also, my daughter is President of the ASL club at her school, so an ASL alphabet recognizer seemed like a natural project.
Demo Video: https://youtu.be/z7-9uf8hzfg
- NVIDIA Jetson Nano or Xavier
- NVIDIA JetPack OS
- ASL Images for Training
- NVIDIA PyTorch Training Instructions
Follow the instructions on the JetPack page above to set up your Jetson.
$ git clone https://github.com/loicmarie/sign-language-alphabet-recognizer.git
$ git clone https://github.com/dusty-nv/jetson-inference.git
(You should really read the entire README and build all of the machine learning tools from source, but for this project we just need to clone the repository.)
Use my split.py script for this. Just change line 7 to point at the parent directory of all the ASL data on your computer.
Or just use mine
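If you would rather roll your own, here is a minimal sketch of what such a split script could look like. To be clear, this is a hypothetical reconstruction, not my actual split.py: the SOURCE_DIR constant stands in for line 7 of my script, the 80/10/10 ratios are an assumption, and the train/val/test folders plus labels.txt are the layout that jetson-inference's train.py expects.

#!/usr/bin/env python3
# Hypothetical stand-in for split.py: copies a folder-per-letter dataset into
# the train/val/test layout (plus labels.txt) that train.py expects.
import os
import random
import shutil

SOURCE_DIR = "/path/to/sign-language-alphabet-recognizer/dataset"  # change me (line 7 in my script)
OUTPUT_DIR = "asl_data"

random.seed(42)  # make the shuffle repeatable

labels = sorted(d for d in os.listdir(SOURCE_DIR)
                if os.path.isdir(os.path.join(SOURCE_DIR, d)))

for label in labels:
    images = [f for f in os.listdir(os.path.join(SOURCE_DIR, label))
              if f.lower().endswith((".jpg", ".jpeg", ".png"))]
    random.shuffle(images)
    n = len(images)
    splits = {"train": images[:int(0.8 * n)],             # assumed 80/10/10 split
              "val":   images[int(0.8 * n):int(0.9 * n)],
              "test":  images[int(0.9 * n):]}
    for split, names in splits.items():
        dest = os.path.join(OUTPUT_DIR, split, label)
        os.makedirs(dest, exist_ok=True)
        for name in names:
            shutil.copy2(os.path.join(SOURCE_DIR, label, name),
                         os.path.join(dest, name))

# train.py also expects a labels.txt listing the class names, one per line
with open(os.path.join(OUTPUT_DIR, "labels.txt"), "w") as f:
    f.write("\n".join(labels) + "\n")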
$ mkdir asl_model
$ cd jetson-inference/python/training/classification
$ python3 train.py --model-dir=<PATH-TO-YOUR-MODEL-DIR> <PATH-TO-YOUR-DATASET>
(The training will run for 35 epochs by default. After each epoch you should see the top-1 accuracy (Acc@1) and top-5 accuracy (Acc@5) on the validation set increase.)
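For example, with the asl_model directory created above and a hypothetical dataset location (substitute wherever your split train/val/test folders actually live):

$ python3 train.py --model-dir=asl_model ~/datasets/asl

train.py also takes an --epochs flag, which is handy for a short smoke test before you commit to the full run.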
Epoch: [0] completed, elapsed time 1016.668 seconds
(That is about 17 minutes per epoch; figuring 16 minutes × 30 epochs gives 480 minutes, or 8 hours of training. Mine ran from 11 AM to 7 PM.)
Test: [1670/1679] Time 0.038 ( 0.035) Loss 9.4062e-03 (8.0129e-01) Acc@1 100.00 ( 73.59) Acc@5 100.00 ( 96.40)
* Acc@1 73.716 Acc@5 96.418
saved best model to: ./asl_model/model_best.pth.tar
$ python3 onnx_export.py --model-dir=<PATH-TO-YOUR-MODEL-DIR>
model exported to: ./asl_model/resnet18.onnx
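If you want to sanity-check the exported file, a few lines of Python with the onnx package (a separate pip install; nothing else in this project needs it) will validate the graph and confirm the input_0/output_0 blob names used in the imagenet command later:

# Optional sanity check of the export (requires: pip3 install onnx)
import onnx

model = onnx.load("./asl_model/resnet18.onnx")
onnx.checker.check_model(model)    # raises an exception if the graph is malformed
print(model.graph.input[0].name)   # expect: input_0
print(model.graph.output[0].name)  # expect: output_0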
Since I am lousy at ASL, I created a video of all the validation images to use as input to the compiled imagenet program from jetson-inference (a sketch of one way to build such a video follows the build steps below). You will first need to build everything in jetson-inference using these instructions:
$ sudo apt-get update
$ sudo apt-get install git cmake libpython3-dev python3-numpy
$ git clone --recursive https://github.com/dusty-nv/jetson-inference
$ cd jetson-inference
$ mkdir build
$ cd build
$ cmake ../
$ make -j$(nproc)
$ sudo make install
$ sudo ldconfig
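Here is the promised sketch of one way to stitch the validation images into an input video. Again, this is a hypothetical example rather than my actual tooling; it assumes OpenCV's Python bindings are available (JetPack ships them) and reuses the asl_data/val path from the split sketch above.

#!/usr/bin/env python3
# Hypothetical sketch: stitch the validation images into asl_val.avi so the
# compiled imagenet program can classify them one after another.
import glob
import os

import cv2

VAL_DIR = "asl_data/val"  # wherever your validation split lives
OUT_FILE = "asl_val.avi"
FPS = 2.0                 # slow enough to read the overlay on each image
SIZE = (640, 480)         # assumed output resolution

paths = sorted(glob.glob(os.path.join(VAL_DIR, "*", "*.jpg")))
writer = cv2.VideoWriter(OUT_FILE, cv2.VideoWriter_fourcc(*"MJPG"), FPS, SIZE)
for path in paths:
    img = cv2.imread(path)
    if img is None:
        continue                         # skip anything OpenCV cannot decode
    writer.write(cv2.resize(img, SIZE))  # enforce a uniform frame size
writer.release()
print("wrote", len(paths), "frames to", OUT_FILE)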
Once everything in jetson-inference is built, you can use the compiled imagenet program to test your model against my input video. Change the paths below to match your setup.
$ export PATH=$PATH:/home/dennis/external/jetson-inference/build/aarch64/bin
$ export DATASET=/home/dennis/external/sign-language-alphabet-recognizer/dataset
$ imagenet --model=./asl_model/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt asl_val.avi
If everything went well, you should get a video window with the recognized ASL letter and the confidence level overlaid.
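The imagenet program also accepts live camera streams as input, so if your ASL is better than mine you can test yourself in real time. For a USB webcam (use csi://0 instead for a CSI camera module):

$ imagenet --model=./asl_model/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt /dev/video0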
Thank you for taking the time to read these instructions, and please let me know if they can be improved.
