wolterlw/hand_tracking

Train Hand gesture recognition

Closed this issue · 6 comments

I want to train hand gesture recognition using your hand tracking, but I don't know where to start.

  1. data collection:
    you need a dataset with labeled video sequences. I assume you have one; if not, I've seen a paper by a group at MIT who did gesture recognition, and hopefully they used public data
  2. run the hand pose estimation algorithm on your data and see how well it performs. Most likely you'll need some post-processing to at least smooth the predictions over time, since the current iteration does pose estimation frame by frame
  3. clean the resulting pose-labeled dataset: pick the subsequences that the pose predictor handles reasonably well

after all that you should have a decent starting point for training a gesture recognition algorithm
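The temporal smoothing mentioned in step 2 could be as simple as an exponential moving average over the per-frame keypoints. A minimal sketch, assuming the tracker yields one `(21, 2)` array of `(x, y)` landmark coordinates per frame (the exact shape and dict key may differ in the repo):

```python
import numpy as np

def smooth_landmarks(frames, alpha=0.5):
    """Exponential moving average over a sequence of (21, 2) keypoint arrays.

    `frames` has shape (T, 21, 2); smaller `alpha` means heavier smoothing
    (more weight on past frames), larger `alpha` tracks the raw
    frame-by-frame predictions more closely.
    """
    frames = np.asarray(frames, dtype=float)
    smoothed = np.empty_like(frames)
    smoothed[0] = frames[0]
    for t in range(1, len(frames)):
        # blend the current raw prediction with the running estimate
        smoothed[t] = alpha * frames[t] + (1 - alpha) * smoothed[t - 1]
    return smoothed
```

This damps single-frame jitter but lags fast motion; a small fixed-lag moving window or a per-keypoint Kalman filter would be the next step up.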

What do you mean by hand pose estimation? Is it what I get from running the imported hand tracking, or do I have to create a hand pose estimation algorithm myself?

I meant the hand tracker

is there any way I can do something like this with the Hand Tracker?

float pseudoFixKeyPoint = landmarkList.landmark(2).x();
if (landmarkList.landmark(3).x() < pseudoFixKeyPoint && landmarkList.landmark(4).x() < pseudoFixKeyPoint)
{
    thumbIsOpen = true;
}

pseudoFixKeyPoint = landmarkList.landmark(6).y();
if (landmarkList.landmark(7).y() < pseudoFixKeyPoint && landmarkList.landmark(8).y() < pseudoFixKeyPoint)
{
    firstFingerIsOpen = true;
}

pseudoFixKeyPoint = landmarkList.landmark(10).y();
if (landmarkList.landmark(11).y() < pseudoFixKeyPoint && landmarkList.landmark(12).y() < pseudoFixKeyPoint)
{
    secondFingerIsOpen = true;
}

pseudoFixKeyPoint = landmarkList.landmark(14).y();
if (landmarkList.landmark(15).y() < pseudoFixKeyPoint && landmarkList.landmark(16).y() < pseudoFixKeyPoint)
{
    thirdFingerIsOpen = true;
}

pseudoFixKeyPoint = landmarkList.landmark(18).y();
if (landmarkList.landmark(19).y() < pseudoFixKeyPoint && landmarkList.landmark(20).y() < pseudoFixKeyPoint)
{
    fourthFingerIsOpen = true;
}

well, the code you've pasted is C++, so there's a lot of extra complication in reading the predicted landmarks, but generally speaking yes, you can do the same thing with the code in this repo.
First follow the provided example to get a prediction dict with all the keypoints in a numpy array, then write custom code to do the above.
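To make that concrete, here is one way the C++ logic above could look in Python, assuming the keypoints come out as a `(21, 2)` numpy array of `(x, y)` coordinates in MediaPipe's hand landmark order (the exact key in the prediction dict may differ):

```python
import numpy as np

def finger_states(keypoints):
    """Port of the C++ snippet: a finger counts as open when its two outer
    joints lie beyond a reference joint (x for the thumb, y for the rest).

    `keypoints` is a (21, 2) array of (x, y) landmark coordinates.
    """
    # thumb: compare x of landmarks 3 and 4 against landmark 2
    thumb_open = (keypoints[3, 0] < keypoints[2, 0]
                  and keypoints[4, 0] < keypoints[2, 0])

    # index, middle, ring, pinky: compare y of the two outer joints
    # against the finger's reference joint
    fingers_open = []
    for ref, mid, tip in [(6, 7, 8), (10, 11, 12), (14, 15, 16), (18, 19, 20)]:
        fingers_open.append(keypoints[mid, 1] < keypoints[ref, 1]
                            and keypoints[tip, 1] < keypoints[ref, 1])
    return thumb_open, fingers_open
```

Note that, like the C++ original, this assumes a fixed hand orientation (e.g. fingers pointing up, so smaller y means "further out"); for a robust classifier you would normalize the keypoints for rotation and handedness first.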

thanks