When you try to create a real-time detection model using camera frames, sometimes other factors effects the way that model works such as lighting, background color etc. Therefore, this project removes the feature extraction from CNN model to do sign language classification. And we focus on the landmark's pixel positions. This way we only care about the relations between landmarks and nothing else.
$ git clone https://github.com/ekrrems/sign_language_classification_MediaPipe