This program processes video frames to detect subtitles using several image processing techniques. It is applied to this video: https://drive.google.com/file/d/1mUOdZbcvRS_UTIoi4wGdfp9i8pNl3fYM/view
The program includes the following image processing steps:
- Pre-processing: Adjusting the image for noise reduction and contrast enhancement.
- Image Segmentation: Dividing an image into meaningful parts.
- Feature Extraction: Identifying and extracting important characteristics.
- Image Recognition: Assigning labels or classifications to objects.
Convert the 3-channel image into a single-channel grayscale image for faster and simpler processing. Grayscale values range from 0 to 255.
Applies a Gaussian blur to the grayscale image (frame_gray) using a kernel of size kernel_size x kernel_size and a standard deviation of 5. This reduces noise and smooths the image.
cv2.threshold is a function that applies a fixed-level threshold to each pixel in the image. With cv2.THRESH_BINARY, pixels above the threshold are set to the maximum value and all others to 0, which converts the image to a binary (black and white) image.
The remaining processing steps are applied to a specific region of the frame only, so they do not have to run on the whole frame.
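A region can be cut out of a frame with plain NumPy slicing. The sketch below assumes the subtitles sit in the bottom quarter of the frame; that fraction and the frame size are hypothetical and should be adjusted to the actual video.

```python
import numpy as np

# Assumption: a blank 640x360 BGR frame stands in for a real video frame.
frame = np.zeros((360, 640, 3), dtype=np.uint8)
height, width = frame.shape[:2]

# Hypothetical choice: keep only the bottom quarter of the frame.
y_start = int(height * 0.75)
roi = frame[y_start:height, 0:width]

print(roi.shape)
```

Slicing returns a view rather than a copy, so selecting the region costs almost nothing, and drawing on `roi` also changes `frame`.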
The code converts a region of interest (ROI) in a video frame from BGR to HSV color space and then creates a mask that isolates white colors within the specified HSV range. The mask highlights pixels in the ROI that fall within the range defined by lower_white and upper_white.
Morphological transformations are applied to remove noise and improve the mask.