A survey on deep learning-based single image crowd counting: Network design, loss function and supervisory signal
- Single image crowd counting is a challenging computer vision problem with wide applications in public safety, city planning, traffic management, etc.
- This survey provides a comprehensive summary of recent Convolutional Neural Network (CNN) based crowd counting techniques that rely on density map estimation.
- Our goals are to provide an up-to-date review of recent approaches and to introduce new researchers in this field to the design principles and trade-offs.
Our long survey paper (23 pages) has been accepted to Neurocomputing 2022 [paper]
- Privacy preserving crowd monitoring: Counting people without people models or tracking [paper]
- Learning to count objects in images [paper]
- Towards perspective-free object counting with deep learning [paper]
- Estimating the number of people in crowded scenes by MID based foreground segmentation and head-shoulder detection [paper]
- Shape-based human detection and segmentation via hierarchical part-template matching [paper]
- Counting crowded moving objects [paper]
- Bayesian Poisson regression for crowd counting [paper]
- Counting people with low-level features and Bayesian regression [paper]
- Deep people counting in extremely dense crowds [paper]
More recently, crowd counting via density map estimation has emerged as a promising approach with encouraging results. Such approaches achieve high accuracy for crowded scenes and preserve spatial information of people distribution.
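To make the idea concrete, the following is a minimal sketch of how a ground-truth density map is commonly built from point (head) annotations: each annotated head is replaced by a small Gaussian blob of unit mass, so the map keeps the spatial distribution and its sum equals the person count. The function names `gaussian_kernel` and `density_map`, the fixed `sigma`, and the border renormalization are assumptions made for illustration; they are not taken from the survey or from any specific method listed here.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a (size x size) Gaussian kernel normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def density_map(points, height: int, width: int, sigma: float = 4.0) -> np.ndarray:
    """Build a density map from (row, col) head annotations.

    Assumption: a fixed sigma for every head. Geometry-adaptive kernels
    are a common alternative but are not shown here.
    """
    radius = int(3 * sigma)
    kernel = gaussian_kernel(2 * radius + 1, sigma)
    dmap = np.zeros((height, width), dtype=np.float64)
    for r, c in points:
        r, c = int(round(r)), int(round(c))
        if not (0 <= r < height and 0 <= c < width):
            continue  # skip annotations that fall outside the image
        # Clip the kernel window to the image bounds.
        r0, r1 = max(r - radius, 0), min(r + radius + 1, height)
        c0, c1 = max(c - radius, 0), min(c + radius + 1, width)
        patch = kernel[r0 - (r - radius):r1 - (r - radius),
                       c0 - (c - radius):c1 - (c - radius)]
        # Renormalize the clipped patch so each head still adds exactly 1.
        dmap[r0:r1, c0:c1] += patch / patch.sum()
    return dmap


if __name__ == "__main__":
    heads = [(10, 10), (0, 0), (50, 63)]  # hypothetical annotations
    dmap = density_map(heads, height=64, width=64)
    print(dmap.shape, float(dmap.sum()))  # the sum approximates the head count
```

Integrating (summing) the map over any region gives the estimated number of people in that region, which is why a density map carries more information than a single global count.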
The following table compares the three major crowd counting approaches discussed above.
We review the recent advances with detailed comparisons on three major design modules for crowd counting: deep neural network designs, loss functions, and supervisory signals.
- Image resolution
- Number of images in the dataset
- Object count
We extract and present some typical images from the public datasets in the following figure.
- NWPU-Crowd: NWPU-Crowd: A large-scale benchmark for crowd counting [paper]
- UCF_QNRF: Composition loss for counting, density map estimation and localization in dense crowds [paper]
- GCC: Pixel-Wise Crowd Understanding via Synthetic Data [paper]
- Fudan-ShanghaiTech: Locality-constrained spatial transformer network for video crowd counting [paper]
- ShanghaiTech A & B: Single-image crowd counting via multi-column convolutional neural network [paper]
- WorldExpo'10: Cross-scene crowd counting via deep convolutional neural networks [paper]
- UCF_CC_50: Multi-source multi-scale counting in extremely dense crowd images [paper]
- Mall: Feature mining for localised crowd counting [paper]
- UCSD: Privacy preserving crowd monitoring: Counting people without people models or tracking [paper]
- Accuracy: counting accuracy and location accuracy
- Quality of density map: resolution and visual quality
- Complexity: computation complexity and annotation complexity
- Flexibility and robustness
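
As a small illustration of the counting-accuracy criterion, the sketch below computes the mean absolute error (MAE) and root mean squared error (RMSE) between predicted and ground-truth counts over a set of test images. These two metrics are commonly reported in the crowd counting literature; the list above does not name specific metrics, so their use here, along with the function name `counting_errors`, is an assumption for illustration.

```python
import math


def counting_errors(predicted, ground_truth):
    """Return (MAE, RMSE) between per-image predicted and true counts."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [p - g for p, g in zip(predicted, ground_truth)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


if __name__ == "__main__":
    # Hypothetical counts for two test images.
    mae, rmse = counting_errors([10, 20], [12, 16])
    print(mae, rmse)
```

With a density-map model, the predicted count for an image is usually obtained by summing the predicted map. RMSE penalizes large per-image errors more heavily than MAE, so the pair gives a rough view of both average accuracy and robustness.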
- Fully convolutional crowd counting on highly congested scenes [paper]
- Scale aggregation network for accurate and efficient crowd counting [paper]
- Crowd counting and density estimation by trellis encoder-decoder networks [paper]
- Single-image crowd counting via multi-column convolutional neural network [paper]
- Improving the learning of multi-column convolutional neural network for crowd counting [paper]
- Crowd counting by adaptively fusing predictions from an image pyramid [paper]
- Generating high-quality crowd density maps using contextual pyramid CNNs [paper]
- An aggregated multicolumn dilated convolution network for perspective-free counting [paper]
- DENet: A universal network for counting crowd with varying densities and scales [paper]
- DADNet: Dilated-attention-deformable convnet for crowd counting [paper]
- ADCrowdNet: An attention-injective deformable convolutional network for crowd understanding [paper]
- SCAR: Spatial-/channel-wise attention regression networks for crowd counting [paper]
- Relational attention network for crowd counting [paper]
- Attend to count: Crowd counting with adaptive capacity multi-scale CNNs [paper]
- Hybrid Graph Neural Networks for Crowd Counting [paper]
- Crowd counting using deep recurrent spatial-aware network [paper]
- End-to-end crowd counting via joint learning local and global count [paper]
- Where are the blobs: Counting by localization with point supervision [paper]
- DecideNet: Counting varying density crowds through attention guided detection and density estimation [paper]
- CSRNet: Dilated convolutional neural networks for understanding the highly congested scenes [paper]
- Crowd counting with deep structured scale integration network [paper]
- Cross-Level Parallel Network for Crowd Counting [paper]
- Adversarial learning for multiscale crowd counting under complex scenes [paper]
- Atrous convolutions spatial pyramid network for crowd counting and density estimation [paper]
- Crowd counting via scale-adaptive convolutional neural network [paper]
- Counting with focus for free [paper]
- Learning to count with CNN boosting [paper]
- Nonlinear regression via deep negative correlation learning [paper]
- From open set to closed set: Counting objects by spatial divide-and-conquer [paper]
- Adaptive density map generation for crowd counting [paper]
- Bayesian loss for crowd count estimation with point supervision [paper]
- HA-CCN: Hierarchical attention-based crowd counting network [paper]
- Generalizing semi-supervised generative adversarial networks to regression using feature contrasting [paper]
- Almost unsupervised learning for dense crowd counting [paper]
- Leveraging unlabeled data for crowd counting by learning to rank [paper]
- Learning from synthetic data for crowd counting in the wild [paper]
- Focus on semantic consistency for cross-domain crowd understanding [paper]
- Automatic and lightweight network designing
- Weakly supervised and unsupervised crowd counting
- Crowd counting in videos
- Multi-view fusion for crowd counting
If you find this work or code useful, please cite:
@article{bai2022survey,
title={A survey on deep learning-based single image crowd counting: Network design, loss function and supervisory signal},
author={Bai, Haoyue and Mao, Jiageng and Chan, S-H Gary},
journal={Neurocomputing},
year={2022},
publisher={Elsevier}
}