
YouTube-GDD: A challenging gun detection dataset with rich contextual information


[arXiv] [Project Page]

Overview

To promote the development of security applications, this work presents a new, challenging dataset called the YouTube Gun Detection Dataset (YouTube-GDD). The dataset is collected from 343 high-definition YouTube videos and contains 5,000 well-chosen images, in which 16,064 gun instances and 9,046 person instances are annotated. Compared to other datasets, YouTube-GDD is "dynamic": it contains rich contextual information and records the shape changes of guns during shooting. To build a baseline for gun detection, we evaluate YOLOv5 on YouTube-GDD and analyze the influence of additional related annotations on gun detection.


Dates

  • Release Training and validation sets. [2022-04]
  • Release test images. [2022-04]
  • Open the evaluation server to the public. [to be confirmed]
  • Expand the dataset volume to the level of ten thousand images. [to be confirmed]

Description

  1. All images are captured from YouTube videos.

  2. All annotations are labeled in YOLO format with labelImg.

  3. YouTube-GDD contains two categories, namely "person" and "gun", corresponding to category ids 0 and 1, respectively.

  4. The name of each image file and its corresponding label file follows the format "YouTube id_original frame rate_split frame rate_ID".
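Since annotations are stored in YOLO format (one `class_id x_center y_center width height` line per instance, with coordinates normalized to [0, 1]), a label line can be decoded with a few lines of Python. The annotation line below is a made-up example, not taken from the dataset:

```python
# Map YouTube-GDD category ids to names (see the Description section).
CLASS_NAMES = {0: "person", 1: "gun"}

def parse_yolo_line(line):
    """Parse one line of a YOLO-format label file into (name, box)."""
    class_id, x_c, y_c, w, h = line.split()
    return CLASS_NAMES[int(class_id)], (float(x_c), float(y_c), float(w), float(h))

# Example annotation line: a gun roughly centered in the image.
name, box = parse_yolo_line("1 0.52 0.48 0.20 0.10")
print(name, box)  # gun (0.52, 0.48, 0.2, 0.1)
```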

Statistics

First, we split the entire dataset by filename into 10 non-overlapping folds of 500 images each. Second, we take the ratio of the different instance scales in the entire dataset as the reference probability distribution and compute the scale distribution of each fold. The two folds whose distributions have the lowest Jensen-Shannon (JS) divergence from the reference are chosen as the test and validation sets: fold7 is chosen as the test set, fold6 as the validation set, and the remaining eight folds form the training set.
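The fold-selection criterion can be sketched in a few lines of Python. This is an illustrative computation using the small/medium/large counts from the statistics table; the exact logarithm base and implementation used by the authors may differ:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (base-2 log)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence, skipping zero-probability terms.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def normalize(counts):
    total = sum(counts)
    return [c / total for c in counts]

# Scale counts (small, medium, large) taken from the statistics table.
overall = normalize([1106, 3070, 20934])  # entire dataset
fold6 = normalize([13, 122, 998])         # chosen as validation set
fold7 = normalize([67, 193, 1013])        # chosen as test set

print(js_divergence(fold6, overall))
print(js_divergence(fold7, overall))
```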

| Split  | Images | Videos | person | gun   | small | medium | large |
|--------|--------|--------|--------|-------|-------|--------|-------|
| fold1  | 500    | 35     | 467    | 1265  | 373   | 235    | 1124  |
| fold2  | 500    | 34     | 430    | 620   | 4     | 84     | 962   |
| fold3  | 500    | 31     | 466    | 905   | 39    | 259    | 1073  |
| fold4  | 500    | 31     | 427    | 751   | 5     | 124    | 1049  |
| fold5  | 500    | 36     | 471    | 716   | 11    | 120    | 1056  |
| fold6  | 500    | 43     | 415    | 718   | 13    | 122    | 998   |
| fold7  | 500    | 42     | 394    | 879   | 67    | 193    | 1013  |
| fold8  | 500    | 34     | 475    | 636   | 1     | 60     | 1050  |
| fold9  | 500    | 33     | 460    | 589   | 3     | 57     | 989   |
| fold10 | 500    | 32     | 518    | 953   | 37    | 281    | 1151  |
| all    | 5000   | 343    | 9046   | 16064 | 1106  | 3070   | 20934 |

Table Note: Frames captured from the same video may fall into two adjacent folds, so a video may be counted more than once.

Construct YouTube-GDD from Source Videos

[Update 18th April] We thank a2515919, who is also working on this dataset, for kindly sharing the pre-processed images: Google Drive Link.

Here, three scripts are provided for constructing YouTube-GDD from source videos step by step.

  • Download videos.

```shell
cd /path/to/YouTube-GDD/
python ./tools/download.py --videolist ./configs/videolist.txt --videopath /path/to/videos
```

  • Extract frames.

```shell
cd /path/to/YouTube-GDD/
python ./tools/extract.py --videopath /path/to/videos --framepath /path/to/frames
```

  • Select images.

```shell
cd /path/to/YouTube-GDD/
python ./tools/select.py --imagelist ./configs/imagelist.npy --framepath /path/to/frames --imagepath /path/to/images
```

After collecting the images, unzip labels.zip into the parent directory of imagepath. The expected dataset structure, which also meets the dataset structure requirement of YOLOv5, is organized as follows:

YouTube-GDD/
  images/
      train/
      val/
      test/
  labels/
      train/
      val/

Baseline

| Method  | w/ TL | w/ AoP | FLOPs  | Params | Gun AP50 | Gun AP | Person AP50 | Person AP |
|---------|-------|--------|--------|--------|----------|--------|-------------|-----------|
| YOLOv5s |       |        | 15.80G | 7.01M  | 67.7     | 41.0   | -           | -         |
| YOLOv5s |       | yes    | 15.81G | 7.02M  | 67.9     | 41.3   | 90.3        | 75.0      |
| YOLOv5s | yes   |        | 15.80G | 7.01M  | 75.0     | 52.0   | -           | -         |
| YOLOv5s | yes   | yes    | 15.81G | 7.02M  | 77.3     | 52.1   | 92.4        | 81.2      |

Table Note: TL means Transfer Learning and AoP means Annotations of Person.
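The AP50 numbers above count a detection as a true positive when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. A minimal IoU computation, using an `(x1, y1, x2, y2)` corner convention chosen here for illustration rather than taken from the repository's code, looks like:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A detection whose IoU with a ground-truth box is >= 0.5 counts toward AP50.
print(iou((0, 0, 2, 2), (1, 0, 3, 2)))  # intersection 2, union 6 -> 0.333...
```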

Contact

If you have any general questions, feel free to email us at guyongxiang19@mails.ucas.ac.cn. For dataset-related or implementation-related questions, please email us or open an issue in this codebase (we recommend opening an issue, since your questions may help others).

Citation

If you find our work inspiring or use our dataset in your research, please cite our work.

@article{gu2022youtube-gdd,
  title={YouTube-GDD: A challenging gun detection dataset with rich contextual information},
  author={Gu, Yongxiang and Liao, Xingbin and Qin, Xiaolin},
  journal={arXiv preprint arXiv:2203.04129},
  year={2022}
}

Thanks

We thank our lab students, namely Mingfei Li, Jingyang Shan, Qianlei Wang, Siqi Zhang, Xu Liao, Yuncong Peng, Gang Luo, Xin Lan, Boyi Fu and Yangge Qian, for their suggestions on improving the YouTube-GDD dataset.