📹 Intelligent CCTV for Port Safety


1.   Background of Development

  • CCTV technology is now widely used in daily life for purposes such as security and accident prevention. However, conventional CCTV cameras record far more footage than is needed when a problem occurs, and it is difficult to fully recognize and assess a site with them alone. The Korea Safety and Health Agency announced that six deaths occurred every year at domestic ports during 2011-2021. This shows that existing CCTV cameras cannot prevent safety accidents and casualties at domestic ports. To address these limitations, I devised the "Intelligent CCTV for Port Safety & Real-Time Information Provision System", which can respond quickly and accurately to dangerous situations while monitoring the site in real time.


2.   Project Introduction

  • The project uses a variety of object detection and action recognition models based on traditional computer vision techniques and deep neural networks.

  • Building on the models mentioned above, the following algorithms are implemented :

    • Region of Interest (ROI)

      • A region of interest (ROI) is a meaningful and important region in an image.
      • The ROI algorithm is implemented with a binary mask, a common image processing technique.
      • In the mask image, pixels that belong to the ROI are set to 1 and pixels outside the ROI are set to 0.
    • DeepSort

      • DeepSort is a tracking algorithm that tracks objects based not only on their velocity and motion but also on their appearance.
    • Measuring Distance Between Objects

      • The algorithm detects low-risk and high-risk states by measuring the distance between the centers of the objects' bounding boxes.
      • Distances are measured between every pair of objects detected in the image, like the edges of a complete graph whose vertices are the objects.
    • Time Series Data Analysis

      • This algorithm analyzes time series data using a queue, as follows :

        1. If the action recognition model judges the recognized behavior to be abnormal, a penalty score is appended to the queue, which is a linear data structure. (Conversely, if the behavior is judged to be normal, an advantage score is appended.)
        2. At the same time, the oldest scores stored in the queue are removed in FIFO (First In First Out) order.
        3. The sum of the scores in the queue is analyzed in real time; if the sum is an outlier, the current situation is judged to be very dangerous.
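The three steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the score values, the window size, and the use of a fixed threshold as the "outlier" test are all assumptions, and `ScoreWindow` is a hypothetical name.

```python
from collections import deque

# Hypothetical values: the source does not state the scores or the window size.
PENALTY_SCORE = 1      # appended when a frame is judged abnormal
ADVANTAGE_SCORE = -1   # appended when a frame is judged normal


class ScoreWindow:
    """Fixed-length FIFO queue of per-frame scores."""

    def __init__(self, size=30, threshold=20):
        # A deque with maxlen drops its oldest item on append (FIFO).
        self.scores = deque(maxlen=size)
        # Assumption: a fixed threshold stands in for the outlier test.
        self.threshold = threshold

    def push(self, is_abnormal):
        """Store the score for one frame and return the current sum."""
        self.scores.append(PENALTY_SCORE if is_abnormal else ADVANTAGE_SCORE)
        return sum(self.scores)

    def is_dangerous(self):
        """True when the sum of the stored scores reaches the threshold."""
        return sum(self.scores) >= self.threshold
```

Each call to `push` corresponds to one result from the action recognition model. Because old scores fall out of the window, a run of normal frames gradually lowers the sum again.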

  • Through the algorithms mentioned above, the following events are detected and analyzed :

    • Intrusion and Loitering
    • Risk of Access to Restricted Areas
    • Risk of Collision Between Workers
    • Risk of Not Wearing Worker Safety Equipment
    • Fire Accident
    • Act of Smoking
    • Act of Falling
    • Act of Violence

  • The analyzed information is stored in the database, and the administrator can quickly identify the current situation through text and graph-type information provided in real time.

    • The information can be checked not only on a PC but also on a smartphone.

  • If these functions are applied to the integrated control center in the port, a smart port monitoring system capable of efficient management and operation can be established.


3.   Main Function

  • Region of Interest (ROI)


    • The administrator can set the ROI to a rectangle or polygon shape by dragging or clicking the mouse.

    • Object detection is performed only within the red ROI border.

      • Green ROI Border    :   Specify
      • Yellow ROI Border   :   Modify
      • Red ROI Border       :   Setup Complete
    • Through this, the administrator can improve the processing speed and accuracy of object detection by removing unnecessary areas from the image.
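As a rough illustration of the binary-mask idea, the sketch below builds a mask from a polygon ROI and zeroes out every pixel outside it. It is written in plain Python so that it is self-contained; a real pipeline would more likely use an image library. The function names and the nested-list image format are assumptions made for this example.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def build_roi_mask(width, height, polygon):
    """Binary mask: 1 for pixels whose center is inside the ROI, else 0."""
    return [
        [1 if point_in_polygon(x + 0.5, y + 0.5, polygon) else 0
         for x in range(width)]
        for y in range(height)
    ]


def apply_mask(image, mask):
    """Zero out pixels outside the ROI (grayscale image as nested lists)."""
    return [
        [pixel * m for pixel, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]
```

The masked image is what would be passed to the object detector, so regions outside the ROI cannot produce detections.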


  • Intrusion and Loitering


    • The object detection model detects human intrusion, and if the intruder stays for a long time, the intruder is judged to be loitering.

      • Intruder   :   The 'Intrusion' text is displayed in green at the top of the bounding box.
      • Loiterer   :   The 'Pedestrian loitering' text is displayed in purple at the top of the bounding box.
    • Even if an intruder or loiterer reappears after being occluded by another object or leaving the frame, they are recognized as the same person because the DeepSort algorithm is applied.

      • First, when an intruder appears, the intruder is given a unique ID number, and the intruder's information is stored in the database under that ID number.
      • When the intruder reappears, the DeepSort algorithm recognizes them as the same person and assigns the same unique ID number as before.
      • The system then retrieves the intruder's previous information by querying the database with the unique ID number.
    • Through this, the administrator can track individually whether each of the many people in the port is intruding or loitering.
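The ID-based lookup described above might look like the sketch below. It assumes the tracker (DeepSort in this project) already supplies a stable `track_id` for each person. The in-memory dictionary stands in for the database, and `IntruderRegistry` and the 10-second threshold are hypothetical, since the source does not say how long counts as loitering.

```python
LOITERING_SECONDS = 10.0  # hypothetical threshold; not given in the source


class IntruderRegistry:
    """Stands in for the database table keyed by track ID."""

    def __init__(self, loitering_seconds=LOITERING_SECONDS):
        self.loitering_seconds = loitering_seconds
        self.first_seen = {}  # track_id -> timestamp of first appearance

    def observe(self, track_id, timestamp):
        """Return the label to draw above the person's bounding box."""
        # A new ID is stored; a known ID (for example after occlusion)
        # reuses the record that was stored earlier.
        start = self.first_seen.setdefault(track_id, timestamp)
        dwell = timestamp - start
        if dwell >= self.loitering_seconds:
            return "Pedestrian loitering"
        return "Intrusion"
```

In this sketch the dwell time keeps running while the person is out of view. Whether the real system pauses the timer in that case is not stated in the source.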


  • Risk of Access to Restricted Areas


    • The administrator can set the restricted area to a rectangle shape by dragging the mouse.

      • When the restricted area setting is completed, the restricted area is displayed as a white bounding box.
    • Based on the algorithm of Measuring Distance Between Objects, when a person approaches the restricted area, a warning or danger message is sent to the administrator.

      • Low-Risk State    :   The border of the bounding box is displayed in yellow.
      • High-Risk State   :   The border of the bounding box is displayed in red.
    • Through this, the administrator can proactively block people from entering restricted areas within the port.


  • Risk of Collision Between Workers


    • Based on the algorithm of Measuring Distance Between Objects, if workers may not be keeping a safe distance from one another, a warning or danger message is sent to the administrator.

      • Low-Risk State    :   The safe distance between workers is displayed in yellow.
      • High-Risk State   :   The safe distance between workers is displayed in red.
    • Through this, the administrator can prevent collision accidents caused by the failure of workers to secure the safe distance in a dense space.
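A minimal sketch of the pairwise distance check, assuming bounding boxes in `(x_min, y_min, x_max, y_max)` pixel format. The two thresholds are hypothetical values chosen for illustration, and `classify_pairs` is not a name from the project.

```python
from itertools import combinations
from math import hypot

# Hypothetical pixel thresholds; the source does not give values.
HIGH_RISK_DISTANCE = 50.0
LOW_RISK_DISTANCE = 100.0


def box_center(box):
    """Center of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def classify_pairs(boxes):
    """Return {(i, j): state} for every pair of boxes.

    Visiting every pair corresponds to the edges of a complete graph
    whose vertices are the detected objects.
    """
    centers = [box_center(b) for b in boxes]
    states = {}
    for i, j in combinations(range(len(centers)), 2):
        (xi, yi), (xj, yj) = centers[i], centers[j]
        distance = hypot(xi - xj, yi - yj)
        if distance < HIGH_RISK_DISTANCE:
            states[(i, j)] = "high-risk"
        elif distance < LOW_RISK_DISTANCE:
            states[(i, j)] = "low-risk"
        else:
            states[(i, j)] = "safe"
    return states
```

For n detected workers this checks n(n-1)/2 pairs, which is manageable for the number of people typically visible in one camera frame.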


  • Risk of Not Wearing Worker Safety Equipment


    • First, the object detection model detects a worker, a safety helmet, and a safety vest.

    • Next, based on the algorithm of Measuring Distance Between Objects, it analyzes whether the worker wears safety equipment.

    • If the worker is not wearing safety equipment, a warning or danger message is sent to the administrator.

      • Low-Risk State    :   The Δ symbol is displayed in yellow in the bounding box.
      • High-Risk State   :   The X symbol is displayed in red in the bounding box.
    • Through this, the administrator can prevent safety accidents caused by workers not wearing safety equipment at the work site.
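One way to associate detected equipment with a worker is sketched below. Everything specific here is an assumption: the source does not define the matching rule or the difference between the low-risk and high-risk states. In this sketch an item counts as worn when its center lies within half the worker's box height of the worker's center, one missing item is treated as low-risk, and two missing items as high-risk.

```python
from math import hypot


def center(box):
    """Center of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def is_worn(worker_box, item_boxes):
    """True if any item center is close enough to the worker center."""
    wx, wy = center(worker_box)
    # Assumed reach: half of the worker box height.
    reach = (worker_box[3] - worker_box[1]) / 2.0
    return any(
        hypot(wx - ix, wy - iy) <= reach
        for ix, iy in map(center, item_boxes)
    )


def equipment_state(worker_box, helmet_boxes, vest_boxes):
    """Classify one worker from the detected helmets and vests."""
    missing = [
        name
        for name, boxes in (("helmet", helmet_boxes), ("vest", vest_boxes))
        if not is_worn(worker_box, boxes)
    ]
    # Assumed mapping: both items missing is high-risk, one is low-risk.
    if len(missing) == 2:
        return "high-risk"
    if len(missing) == 1:
        return "low-risk"
    return "safe"
```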


  • Act of Smoking


    • If the action recognition model recognizes the behavior of smoking or lighting a cigarette, a danger message is sent to the administrator.

      • The 'Smoking Action' text is displayed in red at the top of the bounding box.
      • The bounding box filled in purple is displayed.
    • Through this, the administrator can prevent fire accidents by quickly stopping people who smoke or light cigarettes in hazardous areas within the port.


  • Act of Falling


    • First, the action recognition model recognizes, in real time, a worker who has fallen.

    • Next, based on the algorithm of Time Series Data Analysis, the current situation is judged as a safety stage, a warning stage, or a danger stage.

      • The warning stage is a situation where it is judged that a minor accident has occurred.

        1. The 'Warning Action' text is displayed in orange at the top of the bounding box.
        2. The bounding box filled in orange is displayed.
      • If it is judged that the injury is so serious that the worker cannot move, the warning stage is converted to the danger stage.

        1. The 'Dangerous Action' text is displayed in red at the top of the bounding box.
        2. The bounding box filled in red is displayed.
      • After the warning or danger stage has been reached, if the worker regains consciousness and stands up, the situation is converted back to the safety stage.

    • Through this, the administrator can prioritize and respond to the more dangerous situations even if multiple accidents occur simultaneously.


  • Act of Violence


    • First, the action recognition model recognizes the behavior of making violent contact with another person's body in real time.

    • Next, based on the algorithm of Time Series Data Analysis, the current situation is judged as a safety stage, a warning stage, or a danger stage.

      • The warning stage is a situation that is judged to be minor violence or contact.

        1. The 'Warning Action' text is displayed in orange at the top of the bounding box.
        2. The bounding box filled in orange is displayed.
      • If it is judged that the violence is serious and needs to be restrained, the warning stage is converted to the danger stage.

        1. The 'Dangerous Action' text is displayed in red at the top of the bounding box.
        2. The bounding box filled in red is displayed.
      • After converting to the warning or danger stage, if violent behavior is restrained and not recognized, it is converted back to the safety stage.

    • Through this, the administrator can identify and restrain violent situations early.


4.   Real-Time Information Provision System


  • Since safety accidents occur at unexpected moments, it is important to check the site in real time and take prompt action.

  • Therefore, this project provides an information provision system that allows administrators to check the site in real time.

  • First, after the image data collected through Intelligent CCTV for Port Safety is analyzed, the following information is stored in a database.

    • Image of the Site
    • Number of People on Site
    • Safety Numerical Values of the Site
    • Identification Numbers of People on Site
    • Type of Event
    • Occurrence Time of Event
    • Warning and Danger Stage of Event
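The items listed above could be stored in a single table along the lines of the sketch below. It uses SQLite from the Python standard library purely for illustration; the source does not say which database the project uses, and the table name, column names, and helper functions are all hypothetical.

```python
import sqlite3


def create_store(path=":memory:"):
    """Open the database and create the (hypothetical) events table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS events (
            id            INTEGER PRIMARY KEY AUTOINCREMENT,
            occurred_at   TEXT NOT NULL,   -- occurrence time of the event
            event_type    TEXT NOT NULL,   -- type of event
            stage         TEXT NOT NULL,   -- warning or danger stage
            person_id     INTEGER,         -- identification number
            people_count  INTEGER,         -- number of people on site
            safety_value  REAL,            -- safety numerical value
            image_path    TEXT             -- location of the saved image
        )
        """
    )
    return conn


def record_event(conn, occurred_at, event_type, stage, person_id=None,
                 people_count=None, safety_value=None, image_path=None):
    """Insert one analyzed event; the transaction commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO events (occurred_at, event_type, stage, person_id,"
            " people_count, safety_value, image_path)"
            " VALUES (?, ?, ?, ?, ?, ?, ?)",
            (occurred_at, event_type, stage, person_id,
             people_count, safety_value, image_path),
        )


def latest_events(conn, limit=10):
    """Most recently inserted events first, for the monitoring view."""
    cursor = conn.execute(
        "SELECT occurred_at, event_type, stage FROM events"
        " ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()
```

A PC or smartphone client would then poll something like `latest_events` to render the text and graph views.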

  • Then, the information stored in the database is provided to the administrator's PC monitor or application in the form of text, image, graph, etc.

  • Through this, the administrator can check the situation in real time from anywhere and respond quickly to problems at the site.


5.   YOLO Model Training Strategies Using Transfer-Learning & Fine-Tuning

  • Transfer-Learning & Fine-Tuning Definition

    • Transfer learning consists of taking features learned on one problem, and leveraging them on a new, similar problem.
    • Transfer learning is usually done for tasks where your dataset has too little data to train a full-scale model from scratch.
    • The most common incarnation of transfer learning in the context of deep learning is the following workflow.
      1. Take layers from a previously trained model.
      2. Freeze them, so as to avoid destroying any of the information they contain during future training rounds.
      3. Add some new, trainable layers on top of the frozen layers. They will learn to turn the old features into predictions on a new dataset.
      4. Train the new layers on your dataset.
    • A last, optional step, is fine-tuning, which consists of unfreezing the entire model you obtained above (or part of it), and re-training it on the new data with a very low learning rate.


  • My Fine-Tuning Strategy


    • Strategy 1
      • Method    :   Train the entire model.
      • Feature   :   In this situation, it is possible to use the architecture of the pre-trained model and train it according to the dataset. It is recommended for large datasets.
    • Strategy 2
      • Method    :   Train some layers and leave the others frozen.
      • Feature   :   In a CNN architecture, lower layers capture general features (problem independent), while higher layers capture specific features (problem dependent). In this case, we have to adjust the weights of the network. When the dataset is small and the number of parameters is large, more layers should be left frozen to avoid overfitting. On the other hand, if the dataset is large and the number of parameters is small, the model can be improved by training more layers on the new task.
    • Strategy 3
      • Method    :   Freeze the convolutional base.
      • Feature   :   This is the extreme case of the train/freeze trade-off. The rationale is to keep the convolutional base in its original form and use its output as input for the classifier. In this way, the pre-trained model plays the role of a feature extractor. It can be useful for small datasets or when the problem solved by the pre-trained model is similar to the one we are working on.



  • Results Based on 3 Strategies



    • Strategy 1   :   Yellow,      Strategy 2   :   Pink,      Strategy 3   :   Purple
    • Strategy 1 shows the best results.
    • I believe this is because the dataset I used is large and bears little resemblance to the dataset used to train the pre-trained model.



💻 S/W Development Environment

🚀 Deep Learning Model

💾 Datasets used in the project

  • COCO Dataset
  • Color Helmet and Vest Dataset
  • HMDB51 Dataset
  • Something-Something-V2 Dataset