This repository contains a Keras implementation of the paper 'MotionRec: A Unified Deep Framework for Moving Object Recognition' accepted in WACV 2020. To the best of our knowledge, this is a first attempt for simultaneous localization and classification of moving objects in a video, i.e. moving object recognition (MOR) in a single-stage deep learning framework.
We used the code base of fizyr/keras-retinanet for our work. We forked off from fizyr/keras-retinanet tag 0.5.1.
Due to lack of available benchmark datasets with labelled bounding boxes for MOR, we created a new set of ground truths by annotating 42,614 objects (14,814 cars and 27,800 person) in 24,923 video frames from CDnet 2014 dataset. We selected 16 video sequences having 21,717 frames and 38,827 objects (13,442 cars and 25,385 person) for training. For testing, 3 video sequences with 3,206 frame and 3,787 objects (1,372 cars and 2,415 person) were chosen. We created axis-aligned bounding box annotations for moving object instances in all the frames.
MotionRec: A Unified Deep Framework for Moving Object Recognition
@InProceedings{Mandal_2020_WACV, author = {Mandal, Murari and Kumar, Lav Kush and Saran, Mahipal Singh and vipparthi, Santosh Kumar}, title = {MotionRec: A Unified Deep Framework for Moving Object Recognition}, booktitle = {The IEEE Winter Conference on Applications of Computer Vision (WACV)}, month = {March}, year = {2020} }
https://drive.google.com/open?id=16hC3S0Vuz9ZNC2sxI7UuONiBtkA8S_lA
- Clone this repository.
- Ensure numpy is installed using
pip install numpy --user
- In the repository, execute
pip install . --user
. Note that due to inconsistencies with howtensorflow
should be installed, this package does not define a dependency ontensorflow
as it will try to install that (which at least on Arch Linux results in an incorrect installation). Please make suretensorflow
is installed as per your systems requirements. - Run
python setup.py build_ext --inplace
to compile Cython code first.
MotionRec can be trained using this train.py script. Note that the train script uses relative imports since it is inside the keras_retinanet package.
Model can be trained on all csv files in --csv_path CSV_PATH
folder.
Randomly select a video, read continuous frames (1-10)10 (if depth of history frame is 10) frames, calculate temporal median and TDR block features and perform estimation using resnet 50 and retinanet pyramid. Next choose 11th frame with 10 previous history frames ([1:11] ) and perform estimation.
At end of each video, randomly select new video, start with depth frames.
The CSV file with annotations should contain one annotation per line. Images with multiple bounding boxes should use one row per bounding box. Note that indexing for pixel values starts at 0. The expected format of each line is:
path/to/image.jpg,x1,y1,x2,y2,class_name
By default the CSV generator will look for images in csv_folder/train/ directory of the annotations (csv) file.
The class name to ID mapping file should contain one mapping per line. Each line should use the following format:
class_name,id
Indexing for classes starts at 0. Do not include a background class as it is implicit.
- Keras version: 2.3.1
- tf version: 1.15.0
- CUDA Version: 10.0.130
Model summary can be found in model_summary.md. Snapshot and model of trained Motionrec model is attached in snapshots folder.
keras_retinanet/bin/train.py csv classes.csv
Positional arguments:
{csv} Arguments for csv dataset types.
Optional arguments:
--backbone BACKBONE Backbone model used by retinanet. (default : resnet50)
one of the backbones in resnet models (resnet50, resnet101, resnet152)
--epochs EPOCHS Number of epochs to train.
--steps STEPS Number of steps per epoch. (default: 10000)
--lr LR Learning rate. (default: 1e-5)
--snapshot-path SNAPSHOT_PATH
Path to store snapshots of models during training
(defaults to './snapshots')
--depth DEPTH Image frame depth. (default: 10)
--initial_epoch INITIAL_EPOCH (default: 1)
initial epochs to train.
--csv_path CSV_PATH (default: './csv_folder/train/')
Path to store csv files for training
The training procedure of MotionRec works with training models. These are stripped down versions compared to the inference model and only contains the layers necessary for training (regression and classification values). If you wish to do inference on a model (perform object detection on an image), you need to convert the trained model to an inference model.
keras_retinanet/bin/convert_model.py snapshots/resnet50_csv_60.h5 snapshots/model.h5
positional arguments:
model_in The model to convert.
model_out Path to save the converted model to.
optional arguments:
--backbone BACKBONE The backbone of the model to convert.
Use the trained model to test the unseen videos.
Model can be tested on all csv files in --csv_path CSV_PATH. And output is stored in --save-path SAVE_PATH
folder.
keras_retinanet/bin/evaluate.py csv classes.csv snapshots/model.h5
positional arguments:
{csv} Arguments for specific dataset types.
model Path to RetinaNet model.
optional arguments:
--depth DEPTH Image frame depth.
--csv_path CSV_PATH Path to store csv files for training
(default: './csv_folder/test/')
--iou-threshold IOU_THRESHOLD
IoU Threshold to count for a positive detection
(defaults to 0.5).
--save-path SAVE_PATH
Path for saving images with detections
By default the CSV generator will look for images in visualize/test/
directory of the annotations (csv) file.
Evaluation output is stored in outputs/FOLDERNAME/
.
MotionRec layer can be visualized using keras_retinanet/bin/visualize.py
keras_retinanet/bin/visualize.py csv classes.csv snapshots/model.h5
positional arguments:
{csv} Arguments for specific dataset types.
model Path to RetinaNet model.
Optional arguments:
--depth DEPTH Image frame depth.
--csv_path CSV_PATH Path to store csv files for layer visualization
--save-path SAVE_PATH
Path for saving images with visualization
--layer LAYER Name of the CNN layer to visualize.
--layer_size LAYER_SIZE CNN layer size
--score-threshold SCORE_THRESHOLD
Threshold on score to filter detections with (defaults
to 0.05).
--iou-threshold IOU_THRESHOLD
IoU Threshold to count for a positive detection
(defaults to 0.5).
By default the CSV generator will look for images in visualize/backdoor/csv/
directory of the annotations (csv) file.
Visualization is stored in visualize/backdoor/LAYER_NAME/output/
.
MotionRec model demo can be executed using keras_retinanet/bin/demo.py
keras_retinanet/bin/demo.py csv classes.csv snapshots/model.h5
positional arguments:
{csv} Arguments for specific dataset types.
model Path to RetinaNet model.
optional arguments
--depth DEPTH image frame depth.
--video-path VIDEO_PATH
video name
NOTE: Snapshot and model of trained Motionrec model can be found in snapshots folder.
Visualization of pyramid levels P3, P4, P5 and P6. The relevant motion saliencies of moving objects are highlighted using red boxes.
Qualitative results of our method for unseen video sequences sofa, winterdriveway and parking from CDnet 2014 dataset.
We forked off from fizyr/keras-retinanet tag 0.5.1.
Thanks to fizyr/keras-retinanet team.
Lav Kush Kumar (lavkushkumarmnit@gmail.com)
Murari Mandal (murarimandal.cv@gmail.com)
Mahipal Singh Saran (mahipalsaran007@gmail.com)
Santosh Kumar Vipparthi (skvipparthi@mnit.ac.in)
Realtime demo of MotionRec model.