This work traces the transition in object detection, specifically pedestrian detection, from non-neural methods such as Histogram of Oriented Gradients (HoG) + SVM to neural methods such as Faster R-CNN. We evaluate each model on the Penn-Fudan Pedestrian Detection Dataset.
Model | Mean Average Precision (mAP) | Average Recall @ 1 detection/image | Average Recall @ 10 detections/image |
---|---|---|---|
Pretrained HoG Detector (on INRIA Person dataset) | 0.04 | 0.06 | 0.14 |
Custom HoG Detector | 0.15 | 0.13 | 0.29 |
Faster-RCNN | 0.76 | 0.30 | 0.82 |
git clone https://github.com/sm354/Pedestrian-Detection.git
cd Pedestrian-Detection
pip install -r requirements.txt
`PennFudanPed_train.json` and `PennFudanPed_val.json` contain COCO-format annotations for a randomly generated train-val split of the Penn-Fudan dataset.
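For orientation, here is a minimal sketch of reading such an annotation file. It assumes the standard COCO layout (an `images` list and an `annotations` list linked by `image_id`, with boxes stored as `[x, y, width, height]`). The helper names `group_boxes_by_image` and `load_coco_boxes` are hypothetical and are not part of this repository.

```python
import json
from collections import defaultdict


def group_boxes_by_image(coco):
    """Map each image file name to its list of [x, y, width, height] boxes.

    Assumes the standard COCO layout; images without annotations are omitted.
    """
    # COCO keeps images and annotations in separate lists linked by image id.
    id_to_name = {img["id"]: img["file_name"] for img in coco["images"]}

    boxes = defaultdict(list)
    for ann in coco["annotations"]:
        boxes[id_to_name[ann["image_id"]]].append(ann["bbox"])
    return dict(boxes)


def load_coco_boxes(annotation_path):
    """Read a COCO-format JSON file and group its boxes by image."""
    with open(annotation_path) as f:
        return group_boxes_by_image(json.load(f))
```

Calling `load_coco_boxes("PennFudanPed_train.json")` would then give one list of pedestrian boxes per training image, provided the file follows the layout assumed above.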
wget https://www.cis.upenn.edu/~jshi/ped_html/PennFudanPed.zip
unzip PennFudanPed.zip
gdown 1zfU44JxyHCUSWJ7ngiKyqJocgdF82pVt
python eval_hog_pretrained.py --root <path to dataset root directory> --test <path to test json> --out <path to output json>
Training (HoG descriptors + SVM Model)
python train_hog_custom.py --root <path to dataset root directory> --train <path to train json> --model <path to save trained SVM model>
Testing
python eval_hog_custom.py --root <path to dataset root directory> --test <path to test json> --out <path to output json> --model <path to trained SVM model>
python eval_faster_rcnn.py --root <path to dataset root directory> --test <path to test json> --out <path to output json>
python eval_detections.py --gt <path to ground truth annotations json> --pred <path to detections json>
The script `eval_detections.py` takes the ground-truth annotations and predicted detections for the evaluation dataset and computes the following metrics:
- Average Precision, averaged over 10 IoU thresholds in the range 0.5:0.05:0.95.
- Average Recall, computed at 1 detection per image.
- Average Recall, computed at 10 detections per image.
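To make the threshold range concrete, the sketch below computes the IoU of two boxes and lists the thresholds at which a prediction would count as a match. It is illustrative only and is not the code in `eval_detections.py`. It assumes boxes in COCO `[x, y, width, height]` format, and the names `iou_xywh`, `IOU_THRESHOLDS`, and `matched_thresholds` are hypothetical.

```python
def iou_xywh(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# The 10 thresholds in 0.5:0.05:0.95, i.e. 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which pred_box would count as a match for gt_box."""
    overlap = iou_xywh(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]
```

A prediction that overlaps its ground-truth box with an IoU of 0.62 matches at thresholds 0.5, 0.55, and 0.6 only, which is why a loosely localized detector loses precision when results are averaged over the full range.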
This project was an assignment for the Computer Vision course (course webpage) taught by Prof. Chetan Arora.