This project demonstrates a real-time hand gesture recognition system built with MediaPipe and OpenCV. The system detects hands in a live video feed, draws hand landmarks in a different color for each hand, and recognizes basic gestures such as "Thumbs Up" and "Thumbs Down."
- Real-time hand detection and gesture recognition.
- Different color codes for left and right hands.
- Easy-to-extend gesture recognition logic.
```python
import cv2
import mediapipe as mp
```
`cv2` is OpenCV, a library for computer vision tasks. `mediapipe` is a library by Google for building machine learning solutions, such as hand tracking.
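Both are available on PyPI; a typical install is `pip install opencv-python mediapipe`.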
```python
mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils
```
`mp_hands` is a reference to the MediaPipe Hands solution module, and `mp_drawing` provides the drawing utilities used to render landmarks.
```python
def detect_hands(frame):
    try:
        # Convert the frame from BGR (OpenCV) to RGB (MediaPipe)
        image_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        # `hands` is the MediaPipe Hands instance created later in the script
        results = hands.process(image_rgb)
        if results.multi_hand_landmarks:
            for idx, hand_landmarks in enumerate(results.multi_hand_landmarks):
                # Handedness label for this detection: 'Left' or 'Right'
                hand_label = results.multi_handedness[idx].classification[0].label
                if hand_label == 'Left':
                    drawing_spec = mp_drawing.DrawingSpec(color=(0, 0, 255), thickness=2, circle_radius=4)
                else:
                    drawing_spec = mp_drawing.DrawingSpec(color=(0, 255, 0), thickness=2, circle_radius=4)
                mp_drawing.draw_landmarks(
                    frame,
                    hand_landmarks,
                    mp_hands.HAND_CONNECTIONS,
                    drawing_spec,
                    mp_drawing.DrawingSpec(color=(0, 255, 0), thickness=2, circle_radius=2)
                )
                gesture = recognize_gesture(hand_landmarks)
                cv2.putText(frame, f"{hand_label} Hand: {gesture}", (10, 30 * (idx + 1)),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2, cv2.LINE_AA)
    except Exception as e:
        print(f"Error in processing frame: {e}")
    return frame
```
- Converts the frame from BGR (used by OpenCV) to RGB (used by MediaPipe).
- Processes the RGB image to detect hands.
- If hand landmarks are detected, iterates through each detected hand.
- Determines if the detected hand is left or right.
- Sets different drawing specifications (color and thickness) for left and right hands.
- Draws landmarks on the detected hand using the specified drawing specifications.
- Calls `recognize_gesture` to identify the gesture.
- Displays the recognized gesture on the frame.
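MediaPipe returns landmark coordinates normalized to the range [0, 1]. If pixel positions are needed (for example, for custom overlays), a minimal sketch like the following converts them, assuming `frame` and `hand_landmarks` as in `detect_hands` above:

```python
# Minimal sketch: convert a normalized landmark to pixel coordinates.
# Assumes `frame` (a BGR image) and `hand_landmarks` from detect_hands above.
h, w, _ = frame.shape
wrist = hand_landmarks.landmark[mp_hands.HandLandmark.WRIST]
wrist_px = (int(wrist.x * w), int(wrist.y * h))  # scale normalized coords to pixels
cv2.circle(frame, wrist_px, 6, (255, 0, 0), -1)  # mark the wrist with a filled blue dot
```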
```python
def recognize_gesture(hand_landmarks):
    # Fingertip landmarks; coordinates are normalized, with y increasing downward
    thumb_tip = hand_landmarks.landmark[mp_hands.HandLandmark.THUMB_TIP]
    index_tip = hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]
    middle_tip = hand_landmarks.landmark[mp_hands.HandLandmark.MIDDLE_FINGER_TIP]
    ring_tip = hand_landmarks.landmark[mp_hands.HandLandmark.RING_FINGER_TIP]
    pinky_tip = hand_landmarks.landmark[mp_hands.HandLandmark.PINKY_TIP]
    # A smaller y value means higher in the image, so `<` reads as "above"
    if thumb_tip.y < index_tip.y and middle_tip.y < ring_tip.y and ring_tip.y < pinky_tip.y:
        return "Thumbs Up"
    elif thumb_tip.y > index_tip.y and middle_tip.y > ring_tip.y and ring_tip.y > pinky_tip.y:
        return "Thumbs Down"
    else:
        return "No gesture"
```
- Extracts the coordinates of the fingertips for the thumb, index, middle, ring, and pinky fingers.
- Uses simple conditional logic to recognize gestures.
- Returns "Thumbs Up" if the thumb tip is above the index tip, the middle tip is above the ring tip, and the ring tip is above the pinky tip (a smaller `y` means higher in the image).
- Returns "Thumbs Down" if all of those relationships are reversed.
- Returns "No gesture" for any other configuration.
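Because the logic is plain coordinate comparisons, it is easy to extend, as the feature list above promises. As a sketch (the helper `is_open_palm` is hypothetical, not part of the original project), an open palm could be detected by checking that every fingertip sits above its PIP joint:

```python
# Hypothetical extension: detect an open palm by checking that each fingertip
# is above (has a smaller y than) its PIP joint. Image y grows downward.
def is_open_palm(hand_landmarks):
    lm = hand_landmarks.landmark
    finger_pairs = [
        (mp_hands.HandLandmark.INDEX_FINGER_TIP, mp_hands.HandLandmark.INDEX_FINGER_PIP),
        (mp_hands.HandLandmark.MIDDLE_FINGER_TIP, mp_hands.HandLandmark.MIDDLE_FINGER_PIP),
        (mp_hands.HandLandmark.RING_FINGER_TIP, mp_hands.HandLandmark.RING_FINGER_PIP),
        (mp_hands.HandLandmark.PINKY_TIP, mp_hands.HandLandmark.PINKY_PIP),
    ]
    return all(lm[tip].y < lm[pip].y for tip, pip in finger_pairs)
```

A new branch in `recognize_gesture` could then return "Open Palm" when this check passes.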
```python
hands = mp_hands.Hands(static_image_mode=False, max_num_hands=2,
                       min_detection_confidence=0.8, min_tracking_confidence=0.8)
```
- Initializes the MediaPipe hands model with specific parameters:
  - `static_image_mode=False`: uses the model for video streams (detection plus frame-to-frame tracking).
  - `max_num_hands=2`: detects up to two hands.
  - `min_detection_confidence=0.8`: sets the minimum confidence for detection.
  - `min_tracking_confidence=0.8`: sets the minimum confidence for tracking.
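For comparison, setting `static_image_mode=True` runs full detection on every input instead of tracking between frames, which suits still photos. A minimal sketch, where `hand.jpg` is a hypothetical file name:

```python
# Sketch: the same solution configured for still images instead of video.
# 'hand.jpg' is a hypothetical input file used only for illustration.
with mp_hands.Hands(static_image_mode=True, max_num_hands=2,
                    min_detection_confidence=0.8) as still_hands:
    image = cv2.imread('hand.jpg')
    results = still_hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
```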
```python
cap = cv2.VideoCapture(0)
```
- Opens the default webcam (device index `0`) for capturing video.
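Another camera index or a video file path can be passed instead of `0`. A sketch of a fallback, where `demo.mp4` is a hypothetical file:

```python
# Sketch: fall back to a video file if no webcam is available.
# 'demo.mp4' is a hypothetical path, not part of the original project.
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    cap = cv2.VideoCapture('demo.mp4')
```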
```python
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        print("Failed to capture image from camera.")
        break
    frame = cv2.flip(frame, 1)  # Flip the frame horizontally for a mirror view
    frame = detect_hands(frame)
    cv2.imshow('MediaPipe Hand Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
```
- Continuously captures frames from the webcam.
- Checks if the frame is captured successfully.
- Flips the frame horizontally for a mirror effect.
- Calls `detect_hands` to detect hands, recognize gestures, and draw landmarks.
- Displays the processed frame.
- Exits the loop if the 'q' key is pressed.
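To confirm the loop runs in real time, a per-frame latency overlay can be added inside it. This sketch uses OpenCV's tick counter and is not part of the original script:

```python
# Sketch: measure how long detect_hands takes per frame, in milliseconds.
start = cv2.getTickCount()
frame = detect_hands(frame)
elapsed_ms = (cv2.getTickCount() - start) / cv2.getTickFrequency() * 1000
cv2.putText(frame, f"{elapsed_ms:.1f} ms", (10, 120),
            cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2, cv2.LINE_AA)
```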
```python
cap.release()
cv2.destroyAllWindows()
hands.close()
```
- Releases the webcam.
- Closes all OpenCV windows.
- Releases the MediaPipe hands model.
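If the capture loop can raise (for example, when the camera disconnects), wrapping it in `try`/`finally` guarantees this cleanup still runs; a minimal sketch:

```python
# Sketch: guarantee cleanup even if the capture loop raises an exception.
try:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        cv2.imshow('MediaPipe Hand Detection', detect_hands(cv2.flip(frame, 1)))
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
    hands.close()
```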
```mermaid
graph TD;
    A[Start] --> B[Initialize MediaPipe Hands];
    B --> C[Initialize webcam];
    C --> D[Read frame from webcam];
    D --> E[Flip frame horizontally];
    E --> F[Detect Hands];
    F --> G[Recognize Gestures];
    G --> H[Draw Landmarks];
    H --> I[Display Frame];
    I --> J[Check for 'q' key press];
    J -->|not pressed| D;
    J -->|pressed| K[Release webcam];
    K --> L[Close OpenCV windows];
    L --> M[Close MediaPipe hands model];
```