Multi-Camera Networks

Research notes on multi-camera networks. Target venues: networking conferences (NSDI/SIGCOMM), mobile conferences (MobiCom/MobiSys/SenSys/UbiComp) and computer vision conferences (ICCV/CVPR/ECCV).
Inspired by the books listed below, I collect papers on five research topics:

  1. Camera Calibration.
  2. AI Applications (surveillance systems, multi-view collaboration, multi-camera collaboration, efficient object detection, automatic labeling, MTMC tracking).
  3. Video Compression (for efficient communication).
  4. Database (for fast indexing).
  5. Privacy (for privacy-preserving inference/training/transmission).

At the end, I list datasets and useful toolboxes (I will keep maintaining this list).

Book

  1. Multi-Camera Networks: Principles and Applications. 2005.
  2. Camera Networks: The Acquisition and Analysis of Videos over Wide Areas (Synthesis Lectures on Computer Vision). 2012.

Survey

[1] Valera et al. Intelligent distributed surveillance systems: a review. 2005.
[2] Wang et al. Intelligent multi-camera video surveillance: a review. 2012.
[3] Ye et al. Wireless Video Surveillance: A Survey. 2013.
[4] Zhang et al. Deep Learning in Mobile and Wireless Networking: A Survey. In IEEE Communications Surveys & Tutorials, 2019.

Researchers, labs and workshops

Researchers (organization and research interests)

  1. Ganesh Ananthanarayanan (MSR, USA) - Live video analytics, distributed computing
  2. Yuanchao Shu (MSR, USA) - Live video analytics, location-based systems
  3. Andrea Cavallaro (QMUL, UK) - Low-level vision tasks across camera networks, multi-modal fusion, privacy-aware video analytics (based on adversarial training/learning)
  4. Amit K. Roy-Chowdhury (UC Riverside, USA) - Deep learning based video analytics (tracking, reID, super-resolution and domain adaptation)
  5. Jenq-Neng Hwang (UW, USA) - Deep learning based video analytics (tracking, reID, localization, visual odometry)
  6. Hamid K. Aghajan (UGent, BE) - Video analytics across multi-cameras
  7. Umakishore Ramachandran (Gatech, USA) - Edge AI (OS, kernel)
  8. Youngki Lee (SNU, KR) - Edge AI and AR/VR
  9. Junchen Jiang (UChicago, USA) - Video streaming
  10. Ravi Netravali (Princeton, USA) - Edge AI
  11. Silvio Savarese (Stanford, USA) - 3D vision and robotics
  12. Fengyuan Xu (NJU, CN) - Internet of Video Things (IoVT) and privacy-preserving edge AI
  13. Hamed Haddadi (ICL, UK) - Privacy-preserving edge AI

Labs

  1. Live Video Analytics (MSR)
  2. Information Processing Lab (Washington)
  3. Video Computing Group (UC Riverside)
  4. Vision Research Lab (UCSB)
  5. Audio-visual Signal processing and Communication Systems (Berkeley)
  6. Human-Centered Computer Systems Lab (SNU)

Workshops

  1. The 3rd Workshop on Hot Topics in Video Analytics and Intelligent Edges (ACM MobiCom'21) - focus on deep learning based video analytics.

Courses

  1. CS231A: Computer Vision, From 3D Reconstruction to Recognition (Winter 2021, Stanford) - covers the basic concepts behind many computer vision tasks across multi-camera networks (camera models, calibration, single- and multiple-view geometry, stereo systems, structure from motion, stereo matching, depth estimation, optical flow and optimal estimation).
  2. CS239: ML-driven Video Analytics Systems (Fall 2020, UCLA) - covers recent research on video analytics (strongly recommended).
  3. CS34702 Topics in Networks: Machine Learning for Networking and Systems (Fall 2020, UChicago) - covers recent research on networking systems (the video streaming and cloud scheduling parts are recommended).
  4. CS6465: Emerging Cloud Technologies and Systems Challenges (Fall 2019, Cornell) - covers the basic concepts behind existing edge AI (most papers were published before 2019 and concern infrastructure, but reading them helps in understanding the key issues of current edge AI).

Research directions

Camera calibration

[1] Calibration Wizard: A Guidance System for Camera Calibration Based on Modeling Geometric and Corner Uncertainty. In ICCV'19.

AI applications (todo)

Surveillance systems (reducing deployment cost)

[1] Zhang et al. The Design and Implementation of a Wireless Video Surveillance System. In MobiCom'15.
[2] Jain et al. Scaling Video Analytics Systems to Large Camera Deployments. In HotMobile'19.
[3] Xu et al. Approximate Query Service on Autonomous IoT Cameras. In MobiSys'20.
[4] Bhardwaj et al. Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers. In NSDI'22. - addresses when to retrain models and how to reduce resource usage when many inference and retraining tasks run together.
[5] Suprem et al. ODIN: Automated Drift Detection and Recovery in Video Analytics. In VLDB'21. - detects domain drift and updates the corresponding models automatically.

Multi-View Collaboration (epipolar geometry)

[1] Kocabas et al. Self-Supervised Learning of 3D Human Pose using Multi-view Geometry. In CVPR'19.
[2] Yao et al. MONET: Multiview Semi-supervised Keypoint Detection via Epipolar Divergence. In ICCV'19.
[3] Qiu et al. Cross View Fusion for 3D Human Pose Estimation. In ICCV'19.
[4] Brickwedde et al. Mono-SF: Multi-View Geometry Meets Single-View Depth for Monocular Scene Flow Estimation of Dynamic Traffic Scenes. In ICCV'19.
[5] Trinidad et al. Multi-view Image Fusion. In ICCV'19.
[6] Nassar et al. Simultaneous multi-view instance detection with learned geometric soft-constraints. In ICCV'19.
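
For readers new to the epipolar geometry that this group of papers builds on, here is a minimal sketch of the epipolar constraint between two views. It is not taken from any of the papers above. It assumes identity intrinsics (normalized image coordinates) and a second camera whose coordinates are X2 = R X1 + t; all function names and the example values are illustrative.

```python
import math


def skew(t):
    """Return the cross-product matrix [t]x, so that [t]x v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def matvec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def matmul(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def essential_matrix(rotation, translation):
    """E = [t]x R for a second camera with coordinates X2 = R X1 + t."""
    return matmul(skew(translation), rotation)


def project(point, rotation, translation):
    """Normalized homogeneous image coordinates (assumes identity intrinsics)."""
    x, y, z = [c + t for c, t in zip(matvec(rotation, point), translation)]
    return [x / z, y / z, 1.0]


def epipolar_residual(x1, x2, essential):
    """x2^T E x1, which is zero (up to rounding) for a true correspondence."""
    return sum(a * b for a, b in zip(x2, matvec(essential, x1)))


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    # Hypothetical setup: camera 2 is rotated 0.1 rad about the y-axis and shifted.
    c, s = math.cos(0.1), math.sin(0.1)
    rotation = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
    translation = [-1.0, 0.0, 0.1]
    essential = essential_matrix(rotation, translation)

    point = [0.5, -0.2, 4.0]  # a 3D point in front of both cameras
    x1 = project(point, IDENTITY, [0.0, 0.0, 0.0])
    x2 = project(point, rotation, translation)
    mismatch = [x2[0], x2[1] + 0.2, 1.0]  # a deliberately wrong match

    print("true match residual: ", epipolar_residual(x1, x2, essential))
    print("wrong match residual:", epipolar_residual(x1, mismatch, essential))
```

A residual near zero means the two image points are consistent with one 3D point. This is the kind of geometric consistency that multi-view methods can use as a supervision signal or as a soft constraint.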

Multi-Camera Collaboration (exploring collaboration in large camera networks, such as drone networks)

[1] Liu et al. Who2com: Collaborative Perception via Learnable Handshake Communication. In ICRA'20.
[2] Liu et al. When2com: Multi-Agent Perception via Communication Graph Grouping. In CVPR'20.
[3] Tong et al. Large-Scale Vehicle Trajectory Reconstruction with Camera Sensing Network. In MobiCom'21.

Efficient Object Detection (popular in autonomous cars and surveillance cameras)

[1] Choi et al. Gaussian YOLOv3: An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving. In ICCV'19. Public Code Note
[2] Wang et al. Learning Rich Features at High-Speed for Single-Shot Object Detection. In ICCV'19. Public Code Note

Automatic Labeling (object detection and reID)

[1] Aghdam et al. Active Learning for Deep Detection Neural Networks. In ICCV'19. Public Code Note

MTMC tracking (todo)

Deployment

[1] Qiu et al. Kestrel: Video Analytics for Augmented Multi-Camera Vehicle Tracking. In IOTDI'18.
[2] Xu et al. STTR: A System for Tracking All Vehicles All the Time At the Edge of the Network. In DEBS'18.
[3] Gupta et al. FogStore: A Geo-Distributed Key-Value Store Guaranteeing Low Latency for Strongly Consistent Access. In DEBS'18.
[4] Hung et al. Wide-area Analytics with Multiple Resources. In EuroSys'18.
[5] Jiang et al. Networked Cameras Are the New Big Data Clusters. In MobiCom'19 workshop.
[6] Emmons et al. Cracking open the DNN black-box: Video Analytics with DNNs across the Camera-Cloud Boundary. In MobiCom'19 workshop.
[7] Xu et al. Space-Time Vehicle Tracking at the Edge of the Network. In MobiCom'19 workshop.
[8] Xu et al. Coral-Pie: A Geo-Distributed Edge-compute Solution for Space-Time Vehicle Tracking. In Middleware'20.
[9] Li et al. Reducto: On-Camera Filtering for Resource-Efficient Real-Time Video Analytics. In SIGCOMM'20.

Algorithms (MTMC Tracking)

[1] Yu et al. The Solution Path Algorithm for Identity-Aware Multi-Object Tracking. In CVPR'16.
[2] Ristani et al. Features for Multi-Target Multi-Camera Tracking and Re-Identification. In CVPR'18.
[3] Feng et al. Challenges on Large Scale Surveillance Video Analysis. In CVPR'18 workshop.

Algorithms (collaborative learning between reID and detection)

[1] Gidaris et al. LocNet: Improving Localization Accuracy for Object Detection. In CVPR'16.
[2] Li et al. Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation. In ECCV'18.
[3] Huang et al. Adversarially Occluded Samples for Person Re-identification. In CVPR'18.
[4] Wang et al. Resource Aware Person Re-identification across Multiple Resolutions. In CVPR'18.
[5] Gong et al. Improving Multi-stage Object Detection via Iterative Proposal Refinement. In BMVC'19.
[6] Luo et al. Detect or Track: Towards Cost-Effective Video Object Detection/Tracking. In AAAI'19.
[7] He et al. Bounding Box Regression with Uncertainty for Accurate Object Detection. In CVPR'19.
[8] Qi et al. A Novel Unsupervised Camera-aware Domain Adaptation Framework for Person Re-identification. In ICCV'19.
[9] Zhu et al. Intra-Camera Supervised Person Re-Identification: A New Benchmark. In ICCV'19 workshop.
[10] Wang et al. Exploit the Connectivity: Multi-Object Tracking with TrackletNet. In MM'19.

Video compression (including video streaming)

[1] Naderiparizi et al. Towards Battery-Free HD Video Streaming. In NSDI'18.
[2] Baig et al. Jigsaw: Robust Live 4K Video Streaming. In MobiCom'19.
[3] Rippel et al. Learned Video Compression. In ICCV'19.
[4] Djelouah et al. Neural Inter-Frame Compression for Video Coding. In ICCV'19.
[5] Habibian et al. Video Compression With Rate-Distortion Autoencoders. In ICCV'19.
[6] Xu et al. Non-Local ConvLSTM for Video Compression Artifact Reduction. In ICCV'19.
[7] Yan et al. Learning in situ: a randomized experiment in video streaming. In NSDI'20.
[8] Kim et al. Neural-Enhanced Live Streaming: Improving Live Video Ingest via Online Learning. In SIGCOMM'20.
[9] Du et al. Server-Driven Video Streaming for Deep Learning Inference. In SIGCOMM'20.
[10] Han et al. ViVo: Visibility-aware Mobile Volumetric Video Streaming. In MobiCom'20.
[11] Zhang et al. SENSEI: Aligning Video Streaming Quality with Dynamic User Sensitivity. In NSDI'21.

Database

[1] Saurez et al. A drop-in middleware for serializable DB clustering across geo-distributed sites. In VLDB'20.

Privacy

Useful external links (with keywords)

  1. Tutorial on privacy-preserving data analysis (The Alan Turing Institute) - todo
  2. The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-21) - todo
  3. A Dive into Privacy Preserving Machine Learning (OpML'20) - todo
  4. CrypTen (Facebook AI Research) - privacy-preserving machine learning framework, PyTorch, multi-party computation (MPC)

[1] (TAMU and Adobe Research) Wu et al. Towards Privacy-Preserving Visual Recognition via Adversarial Training: A Pilot Study. In ECCV'18.
[2] (CMU) Wang et al. Enabling Live Video Analytics with a Scalable and Privacy-Aware Framework. In ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2018.
[3] (KAIST, USTC, Rice, NJU, SNU, PKU and MSRA) Lee et al. Occlumency: Privacy-preserving Remote Deep-learning Inference Using SGX. In MobiCom'19.
[4] (NUS) Shen et al. Human-imperceptible Privacy Protection Against Machines. In MM'19.
[5] (PSU and Facebook) Khazbak et al. TargetFinder: Privacy Preserving Target Search through IoT Cameras. In IoTDI'19 (Best Paper Award).
[6] (Tsinghua and USTC) Li et al. Invisible: Federated Learning over Non-Informative Intermediate Updates against Multimedia Privacy Leakages. In MM'20.
[7] (UCB and MSR) Poddar et al. Visor: Privacy-Preserving Video Analytics as a Cloud Service. In 29th USENIX Security Symposium (Security'20).
[8] (ICL, QMUL, Telefónica Research and Samsung AI) Mo et al. DarkneTZ: Towards Model Privacy at the Edge using Trusted Execution Environments. In MobiSys'20.
[9] (NJU, Cornell and MSRA) Wu et al. PECAM: privacy-enhanced video streaming and analytics via securely-reversible transformation. In MobiCom'21.
[10] (ASU) Hu et al. LensCap: Split-Process Framework for Fine-Grained Visual Privacy Control for Augmented Reality Apps. In MobiSys'21.
[11] (CUHK) Ouyang et al. ClusterFL: A Similarity-Aware Federated Learning System for Human Activity Recognition. In MobiSys'21.
[12] (ICL and Telefónica Research) Mo et al. PPFL: Privacy-preserving Federated Learning with Trusted Execution Environments. In MobiSys'21 (Best Paper Award).
[13] (CMU, UCSD and MSR) Dsouza et al. Amadeus: Scalable, Privacy-Preserving Live Video Analytics. arXiv preprint arXiv:2011.05163.
[14] (MIT, Princeton, UChicago and Rutgers) Cangialosi et al. Privid: Practical, Privacy-Preserving Video Analytics Queries. arXiv preprint arXiv:2106.12083.
[15] (todo) Read recent papers on privacy-preserving data processing published in VLDB and SIGMOD, to find open issues in privacy-preserving video processing.

Dataset

  1. Duke MTMC (8 cameras, non-overlapping)
  2. Nvidia CityFlow (>40 cameras, overlapping and non-overlapping)
  3. EPFL WildTrack (7 cameras, overlapping)
  4. EPFL-RLC (3 cameras, overlapping)
  5. CMU Panoptic Dataset (>50 cameras, overlapping)
  6. University of Illinois STREETS (100 cameras, non-overlapping)
  7. Awesome reID dataset

Toolbox

  1. CUHK-mmcv: a foundational Python library for computer vision research that supports many research projects (2D/3D detection, semantic segmentation, image and video editing, pose estimation, action understanding and image classification).
  2. JDCV-fastreid: a Python library implementing SOTA re-identification methods (including pedestrian and vehicle re-identification). It is also well documented.