TJU-DHD

A newly built high-resolution dataset for object detection and pedestrian detection (IEEE TIP 2020)

MIT License

TJU-DHD dataset (object detection and pedestrian detection)

This is the official website for "TJU-DHD: A Diverse High-Resolution Dataset for Object Detection (TIP2020)", which is a newly built high-resolution dataset for object detection and pedestrian detection.

  • 115k+ images and 700k+ instances
  • Scenes: traffic and campus, Tasks: object detection and pedestrian detection
  • High resolution: images are at least 1624x1200 pixels, and object heights range from 11 to 4152 pixels.
  • Diversity: A large variance in appearance, scale, illumination, season, and weather
  • Cross-scene evaluation and same-scene evaluation on pedestrian detection
  • If you are interested in pedestrian detection, please refer to our IEEE T-PAMI paper or our GitHub project.
  • Leaderboards on Papers with Code: TJU-Ped-campus, TJU-Ped-traffic

Examples of DHD

Table of Contents

  1. Introduction
  2. Object detection dataset
    2.1 TJU-DHD-traffic
    2.2 TJU-DHD-campus
  3. Pedestrian detection dataset
    3.1 TJU-Ped-traffic
    3.2 TJU-Ped-campus
  4. Benchmark
    4.1 TJU-DHD-traffic
    4.2 TJU-DHD-campus
    4.3 TJU-DHD-pedestrian
  5. Citation
  6. Evaluation on the test set
  7. Contact

1. Introduction

Vehicles, pedestrians, and riders are the most important objects of interest in the perception modules of self-driving vehicles and video surveillance systems. However, state-of-the-art performance in detecting these objects, especially small ones, is still far from meeting the demands of practical systems. Large-scale, diverse, high-resolution vehicle and pedestrian datasets play an important role in developing better detection methods. Existing public large-scale datasets such as MS COCO, which are collected from websites, do not focus on these scenarios. Moreover, popular datasets collected from these scenarios (e.g., KITTI and CityPersons) are limited in the number of images and instances, in resolution, and in diversity of season, weather, and illumination. To address this problem, we build a diverse high-resolution dataset called TJU-DHD. The dataset contains 115,354 high-resolution images (52% of the images have a resolution of 1624x1200 pixels and 48% have a resolution of at least 2560x1440 pixels) and 709,330 labeled objects in total, with a large variance in scale and appearance. The dataset also covers a rich variety of seasons, illumination conditions, and weather. Based on this object dataset, a new diverse pedestrian dataset is further built. Experiments on object detection and pedestrian detection are conducted with four detectors: the one-stage RetinaNet, the anchor-free FCOS, the two-stage FPN, and Cascade R-CNN. We hope that the dataset helps promote research on object detection and pedestrian detection in these two scenes.

2. Object detection dataset

split       DHD-traffic (#images)  DHD-traffic (#instances)  DHD-campus (#images)  DHD-campus (#instances)
training    45,266                 239,980                   39,727                267,445
validation  5,000                  30,679                    5,204                 41,620
test        10,000                 60,963                    10,157                68,643
total       60,266                 331,622                   55,088                377,708

2.1 TJU-DHD-traffic

2.2 TJU-DHD-campus

The training image set is large, so it is zipped as a four-part archive. After downloading all four parts, open the .zip.001 file with your preferred zip extractor. On Linux, the multi-part archive can also be joined and unzipped with:

cat dhd_campus_train_images.zip.* > dhd_campus_train_images.zip
unzip dhd_campus_train_images.zip -d /path/to/your/folder
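
On systems without cat and unzip, the same join-and-extract step can be sketched in Python. This is a minimal sketch, assuming the parts are a plain byte-wise split named <archive>.001, <archive>.002, and so on (as the .zip.001 name above suggests). The function name join_and_extract and the paths in the usage note are hypothetical.

```python
import shutil
import zipfile
from pathlib import Path


def join_and_extract(parts_dir, archive_name, dest_dir):
    """Concatenate archive_name.001, .002, ... and extract the joined zip.

    Assumes the parts are a raw byte-wise split with three-digit suffixes.
    Returns the list of part file names in the order they were joined.
    """
    parts_dir = Path(parts_dir)
    # Sorting by name puts .001 before .002, and so on.
    parts = sorted(parts_dir.glob(archive_name + ".[0-9][0-9][0-9]"))
    if not parts:
        raise FileNotFoundError(
            f"no parts found for {archive_name} in {parts_dir}"
        )

    joined = parts_dir / archive_name
    with open(joined, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream each part so large files are not loaded into memory.
                shutil.copyfileobj(src, out)

    with zipfile.ZipFile(joined) as zf:
        zf.extractall(dest_dir)
    return [p.name for p in parts]
```

For example, join_and_extract("/path/to/downloads", "dhd_campus_train_images.zip", "/path/to/your/folder") would mirror the two shell commands above. Checking that the returned list contains all four parts is a quick way to catch an incomplete download.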

3. Pedestrian detection dataset

split       Ped-traffic (#images)  Ped-traffic (#instances)  Ped-campus (#images)  Ped-campus (#instances)
training    13,858                 27,650                    39,727                234,455
validation  2,136                  5,244                     5,204                 36,161
test        4,344                  10,724                    10,157                59,007
total       20,338                 43,618                    55,088                329,623

3.1 TJU-Ped-traffic

(Note that the images are the same as those in TJU-DHD-traffic.)

3.2 TJU-Ped-campus

(Note that the images are the same as those in TJU-DHD-campus.)

4. Benchmark

4.1 TJU-DHD-traffic

  • Results on validation

    method        backbone  input size  AP    AP@0.5  AP@0.75  AP_s  AP_m  AP_l
    RetinaNet     ResNet50  1333x800    53.5  80.9    60.0     24.0  50.5  68.0
    FCOS          ResNet50  1333x800    53.8  80.0    60.1     24.6  50.6  68.8
    FPN           ResNet50  1333x800    55.4  83.4    63.0     30.4  52.2  68.2
    Cascade RCNN  ResNet50  1333x800    57.9  82.7    66.6     32.6  54.4  71.4
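
The AP@0.5 and AP@0.75 columns differ only in how strictly a detection must overlap a ground-truth box to count as correct. The sketch below illustrates that threshold, assuming the common convention that a match requires an intersection-over-union (IoU) of at least the stated value and that boxes are given as (x1, y1, x2, y2). The function names iou and is_match are hypothetical and are not part of any official evaluation code for this dataset.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(detection, ground_truth, threshold):
    """True when the detection overlaps the ground truth by at least threshold."""
    return iou(detection, ground_truth) >= threshold


ground_truth = (0.0, 0.0, 100.0, 100.0)
detection = (0.0, 0.0, 100.0, 60.0)  # covers 60% of the ground-truth box

print(is_match(detection, ground_truth, 0.5))   # counted under AP@0.5
print(is_match(detection, ground_truth, 0.75))  # rejected under AP@0.75
```

A detector whose boxes are loosely localized can therefore score well on AP@0.5 while losing ground on AP@0.75.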

4.2 TJU-DHD-campus

  • Results on validation

    method        backbone  input size  AP    AP@0.5  AP@0.75  AP_t  AP_s  AP_m  AP_l
    RetinaNet     ResNet50  1333x800    48.4  79.3    52.4     4.7   27.3  56.2  73.8
    FCOS          ResNet50  1333x800    49.3  73.8    53.8     5.6   29.6  55.9  74.3
    FPN           ResNet50  1333x800    52.4  77.5    58.4     8.5   37.4  58.6  74.9
    Cascade RCNN  ResNet50  1333x800    55.1  77.6    60.9     10.8  40.1  61.2  78.8

4.3 TJU-DHD-pedestrian

  • TJU-Ped-campus
Method publication R RS HO R+HO A link
RetinaNet ICCV2017 34.73 82.99 71.31 42.26 44.34 Paper
FCOS ICCV2019 31.89 69.04 81.28 39.38 41.62 Paper
FPN CVPR2017 27.92 67.52 73.14 35.67 38.08 Paper
CrowdDet CVPR2020 25.73 - 66.38 33.63 35.90 Paper
EGCL IEEE TIP2023 24.84 - 65.27 32.39 34.87 Paper
DeFCN CVPR2021 32.1 62.7 72.7 39.9 42.1 Paper
OPL CVPR2023 31.5 61.7 72.4 39.3 41.5 Paper
MTOM WACV2023 21.8 37.04 57.08 - - Paper
  • TJU-Ped-traffic
Method publication R RS HO R+HO A link
RetinaNet ICCV2017 23.89 37.92 61.60 28.45 41.40 Paper
FCOS ICCV2019 24.35 37.40 63.73 28.86 40.02 Paper
FPN CVPR2017 22.30 35.19 60.30 26.71 37.78 Paper
CrowdDet CVPR2020 20.82 - 61.22 25.28 36.94 Paper
EGCL IEEE TIP2023 19.73 - 60.05 24.19 35.76 Paper
DeFCN CVPR2021 24.2 29.1 62.8 29.0 39.7 Paper
Pedestron CVPR2021 18.9 24.0 56.3 - - Paper
OPL CVPR2023 23.4 28.8 62.7 28.0 38.7 Paper
LSFM CVPR2023 18.7 24.9 56.2 - - Paper
MTOM WACV2023 17.4 24.7 52.68 - - Paper
  • Cross-scene evaluation

    method R/R+HO (TJU-Ped-campus -> traffic) R/R+HO (TJU-Ped-traffic -> campus)
    FPN 30.62 / 33.89 42.08 / 50.55

5. Citation

If this project helps your research, please consider citing our work.

@article{Pang_DHD_TIP_2020,
         author = {Yanwei Pang and Jiale Cao and Yazhao Li and Jin Xie and Hanqing Sun and Jinfeng Gong},
         title = {TJU-DHD: A Diverse High-Resolution Dataset for Object Detection},
         journal = {IEEE Transactions on Image Processing},
         year = 2021
        }

@article{Cao_PDR_TPAMI_2020,
         author = {Jiale Cao and Yanwei Pang and Jin Xie and Fahad Shahbaz Khan and Ling Shao},
         title = {From Handcrafted to Deep Features for Pedestrian Detection: A Survey},
         journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
         year = 2022
        }

6. Evaluation on the test set

Ablation studies can be conducted on the validation set. To evaluate your model on the test set, send your detection results in JSON format to connor#tju.edu.cn (replace # with @).
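
The text above does not specify the JSON schema, so please confirm the expected layout with the maintainers before submitting. As one possibility, the sketch below writes detections in the widely used COCO results layout (a list of records with image_id, category_id, bbox as [x, y, width, height], and score). That layout, the function names, and the sample values are all assumptions made for illustration.

```python
import json


def to_result_records(detections):
    """Convert (image_id, category_id, x1, y1, x2, y2, score) tuples into
    COCO-style result dicts. The schema is an assumption, not confirmed
    by the dataset maintainers."""
    records = []
    for image_id, category_id, x1, y1, x2, y2, score in detections:
        records.append({
            "image_id": image_id,
            "category_id": category_id,
            # COCO-style boxes are [x, y, width, height], not corner pairs.
            "bbox": [round(x1, 2), round(y1, 2),
                     round(x2 - x1, 2), round(y2 - y1, 2)],
            "score": round(score, 4),
        })
    return records


def save_results(detections, path):
    """Write the converted records to a JSON file at path."""
    with open(path, "w") as f:
        json.dump(to_result_records(detections), f)
```

A call such as save_results([(1, 1, 10.0, 20.0, 110.0, 220.0, 0.9)], "results.json") would produce a single-record file. The image and category identifiers must match whatever the annotation files use.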

7. Contact

If you have any questions or want to add your results, please feel free to contact us.