This project implements an object detection system using the DETR (DEtection TRansformer) model from Hugging Face's Transformers library. It processes video frames to detect specific restricted classes of objects and draws bounding boxes around them.
- Detects specific restricted objects: person, cell phone, laptop, TV, keyboard, and mouse.
- Draws bounding boxes around detected objects with different colors.
- Saves processed frames with detections to a temporary directory.
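As a minimal sketch of the class filter described above (the helper name and the detection dict shape are illustrative, not taken from the project's code), keeping only detections whose label is in the restricted set might look like:

```python
# Restricted COCO labels kept by the detector (from the feature list above).
RESTRICTED_CLASSES = {"person", "cell phone", "laptop", "tv", "keyboard", "mouse"}

def filter_restricted(detections):
    """Keep only detections whose label is in the restricted set.

    Each detection is a dict like:
    {"label": str, "score": float, "box": [x0, y0, x1, y1]}.
    """
    return [d for d in detections if d["label"].lower() in RESTRICTED_CLASSES]

detections = [
    {"label": "person", "score": 0.98, "box": [10, 20, 110, 220]},
    {"label": "dog", "score": 0.91, "box": [50, 60, 150, 160]},
    {"label": "laptop", "score": 0.88, "box": [200, 100, 400, 300]},
]
kept = filter_restricted(detections)
print([d["label"] for d in kept])  # ['person', 'laptop']
```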
To run this project, you'll need the following Python packages:

```
torch
transformers
Pillow
opencv-python
numpy
```
If you are using macOS, you can install FFmpeg using Homebrew. Open your terminal and run:

```
brew install ffmpeg
```
1. **Create a new Conda environment.** Open your terminal (or Anaconda Prompt) and run:

   ```
   conda create --name video_analyzer python=3.11
   ```

   Replace `video_analyzer` with your preferred environment name if needed.

2. **Activate the Conda environment:**

   ```
   conda activate video_analyzer
   ```

3. **Install required packages.** Once the environment is activated, install the required packages using the `requirements.txt` file provided in the project:

   ```
   pip install -r requirements.txt
   ```
To run the object detection on a video file, execute the script from the command line:

```
python main.py <path_to_video_file> --frame_rate=<frame_rate> --display_video=<display_video> --store_image_path=<image_path> --store_video_path=<video_path>
```
- `<path_to_video_file>`: The path to the input video file.
- `--frame_rate`: (Optional) The rate at which frames are extracted from the video, in frames per second. The default value is 1.
- `--display_video`: (Optional) Whether to display the video while it is processed.
- `--store_image_path`: (Optional) Directory where processed frames are saved.
- `--store_video_path`: (Optional) Directory where the output video is saved.
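The actual argument handling in `main.py` isn't shown here; a sketch using `argparse`, assuming the flag names from the usage line above (defaults are assumptions, not the project's real values), might look like:

```python
import argparse

def build_parser():
    # Flags mirror the usage line above; defaults are assumptions.
    parser = argparse.ArgumentParser(
        description="Run DETR object detection on a video file."
    )
    parser.add_argument("video_path", help="Path to the input video file")
    parser.add_argument("--frame_rate", type=int, default=1,
                        help="Frames extracted per second of video (default: 1)")
    # Note: values like --display_video=True arrive as the string "True",
    # which is truthy; a real CLI would convert it explicitly.
    parser.add_argument("--display_video", default=False,
                        help="Display frames while processing")
    parser.add_argument("--store_image_path", default=None,
                        help="Directory where processed frames are saved")
    parser.add_argument("--store_video_path", default=None,
                        help="Directory where the output video is saved")
    return parser

args = build_parser().parse_args(
    ["input_video.mp4", "--frame_rate=1", "--store_image_path=/tmp/ai_files"]
)
print(args.video_path, args.frame_rate, args.store_image_path)
```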
Examples:

```
python main.py input_video.mp4
python main.py input_video.mp4 --frame_rate=1
python main.py input_video.mp4 --frame_rate=1 --display_video=True
python main.py input_video.mp4 --frame_rate=1 --store_image_path=/tmp/ai_files
python main.py input_video.mp4 --frame_rate=1 --store_video_path=/tmp/ai_files
python main.py input_video.mp4 --frame_rate=1 --store_image_path=/tmp/ai_files --display_video=True
python main.py ~/Movies/video_for_ai2.mp4 --frame_rate=1 --store_video_path=/tmp/ai_files --display_video=True --store_image_path=/tmp/ai_files
```
To run the unit tests:

```
python -m unittest discover -s tests
```
This command will process the given video, extracting 1 frame per second as specified by `--frame_rate=1`. The output frames with detected objects will be saved in the temporary directory `/tmp/ai_files`.
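The relationship between the source video's fps and `--frame_rate` amounts to keeping one frame every `video_fps / frame_rate` frames. A sketch of that sampling (the helper name is illustrative, not from the project's code):

```python
def frames_to_process(total_frames, video_fps, frame_rate=1):
    """Return indices of frames to extract from the video.

    Keeps one frame every video_fps / frame_rate frames, e.g. a 30 fps
    video with frame_rate=1 yields indices 0, 30, 60, ...
    """
    step = max(1, round(video_fps / frame_rate))
    return list(range(0, total_frames, step))

print(frames_to_process(total_frames=90, video_fps=30, frame_rate=1))  # [0, 30, 60]
```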
Processed frames will be saved in the specified temporary directory with filenames in the format `detected_frame_<index>_<timestamp>.png`, where `<index>` is the frame index and `<timestamp>` indicates when the frame was processed.
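The naming scheme can be reproduced with a small helper. Only the `detected_frame_<index>_<timestamp>.png` pattern comes from the project; the function and the timestamp format are assumptions:

```python
from datetime import datetime

def detection_filename(index, timestamp=None):
    """Build an output filename matching detected_frame_<index>_<timestamp>.png."""
    # The timestamp format here is an assumption; the project only states
    # that a timestamp appears in the filename.
    timestamp = timestamp or datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"detected_frame_{index}_{timestamp}.png"

print(detection_filename(42, "20240101_120000"))
# detected_frame_42_20240101_120000.png
```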
- `detection_model.py`: Contains the implementation of the DETR model and other related classes for object detection.
- `frame_processor.py`: Handles the extraction and processing of video frames.
- `detection_drawer.py`: Manages the drawing of bounding boxes and labels on the video frames.
- `main.py`: The entry point of the application.
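The feature list mentions drawing boxes in different colors per class. One way `detection_drawer.py` could assign a stable color to each label is to derive it from a hash of the label; this mapping is an assumption for illustration, not the project's actual palette:

```python
import hashlib

def color_for_label(label):
    """Derive a stable (B, G, R) color from a class label.

    Hashing the label keeps each class's color consistent across frames
    without maintaining an explicit palette.
    """
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return (digest[0], digest[1], digest[2])  # byte values in 0..255

# The same label always maps to the same color:
print(color_for_label("person") == color_for_label("person"))  # True
```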
Kinn Coelho Juliao <kinncj@gmail.com>
This project is licensed under the MIT License. See the LICENSE file for details.