/vancom-ubc-dssg

Unbiased Mobility Data Project

Hosted by: The Data Science for Social Good (DSSG) program at the University of British Columbia

Sponsor: Cedar Academy Society - VanCom Project

Time: May 2021 - August 2021

Description: Public sector and academic communities have been using mobility and traffic data as a proxy measurement for a variety of social topics, from GDP prediction and economic development to greenhouse gas emissions and environmental impact. One method to measure mobility and acquire traffic data is through the analysis of pictures and footage from traffic cameras installed at fixed locations (in urban and rural areas). More often than not, the cameras are installed near locations with heavy traffic, which introduces sampling bias into the observed data. The resulting dataset exaggerates mobility levels in the surrounding area because of “preferential sampling.” This project seeks to correct for this preferential sampling and develop an algorithm that better models mobility levels while accounting for the bias in the dataset.
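To make the effect of preferential sampling concrete, here is a minimal simulation sketch. It is not the project's method: the traffic distribution is synthetic, and the correction shown (inverse-probability weighting) assumes the camera placement probabilities are known, which is generally not the case with real camera networks. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical city: 10,000 road locations with skewed (log-normal) traffic levels.
true_traffic = rng.lognormal(mean=3.0, sigma=1.0, size=10_000)

# Preferential sampling (assumption): the chance that a location gets a camera
# is proportional to its traffic, scaled so about 364 cameras are expected.
inclusion_prob = np.clip(true_traffic / true_traffic.sum() * 364, 0.0, 1.0)
has_camera = rng.random(true_traffic.size) < inclusion_prob

observed = true_traffic[has_camera]
weights = 1.0 / inclusion_prob[has_camera]

true_mean = true_traffic.mean()
# Naive estimate: average over camera locations only (biased upward).
naive_mean = observed.mean()
# Weighted estimate: each camera stands in for 1/p locations like it.
ipw_mean = np.sum(weights * observed) / np.sum(weights)

print(f"true mean traffic:      {true_mean:.1f}")
print(f"naive camera average:   {naive_mean:.1f}")
print(f"inverse-prob. weighted: {ipw_mean:.1f}")
```

In this toy setting the naive camera average overstates city-wide traffic, while the weighted estimate lands much closer to the true mean. With real data the placement probabilities would have to be modelled, for example along the lines of [2].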

Data:

Request a download link at data@pwfh.org for the data items marked "in red" below.

Click to download a high-level VanCom Mobility Data Users' Guide.

Data from Cedar Academy Society:

  1. in red Main Dataset

    Time: December 1-31, 2020

    Assets: 364 location-based Assets in the City of Surrey, British Columbia, Canada

    map

    Description: The file contains full-YOLOv3-extracted information from raw static image files. For that reason, please ignore the "MOV index" part (this is for clients who need aggregated and scaled data) in the Users' Guide.

  • Data description: (screenshot of the data description file)
  • Data file: (screenshot of the data file)
  2. in red 60 min-gap raw images

    JPEG files from two camera stations have been pulled and stored below for your reference

    Time: December 1-31, 2020

    Assets: 2 location-based Assets in the City of Surrey

    • 104 Ave And 140 St (station name: enc_104_140_cam1)
    • 104 Ave And City Hall Driveway (station name: enc_104egress_cityhall_cam1)

  3. in red 15 min-gap extract data

    Time: October 10-16, 2020

    Assets: 364 location-based Assets in the City of Surrey, British Columbia, Canada

  4. in red 2 min-gap raw images

    Time: May 2-3, 2020

    Asset: 1 location-based Asset: 104 Ave And 140 St
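The raw image file names used in the detect.py example further down in this README (e.g. `2020-12-06-23-22-42-enc_104_140_cam1.jpg`) appear to combine a capture timestamp with the station name. The sketch below splits the two. The `YYYY-MM-DD-HH-MM-SS-<station>` layout is inferred from that single example, the time zone is unknown, and `parse_image_name` is a hypothetical helper, not part of the project code.

```python
from datetime import datetime
from pathlib import Path


def parse_image_name(filename):
    """Split a raw image file name into (timestamp, station name).

    Assumes the layout 'YYYY-MM-DD-HH-MM-SS-<station>.jpg', inferred from
    one example file name; other files may follow a different pattern.
    """
    stem = Path(filename).stem
    # The first 19 characters hold the timestamp, e.g. '2020-12-06-23-22-42'.
    timestamp = datetime.strptime(stem[:19], "%Y-%m-%d-%H-%M-%S")
    # Everything after the separating hyphen is taken as the station name.
    station = stem[20:]
    return timestamp, station


ts, station = parse_image_name("2020-12-06-23-22-42-enc_104_140_cam1.jpg")
print(ts, station)
```

Parsing the names this way makes it possible to group images by station and sort them by capture time before running detection.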

Data and tools from other sources:

### Sample Code:

##### Requirements

* os
* datetime
* pandas
* json
* time
* numpy
* geopandas

##### [coming soon] ....

Potential Directions:

Applications
  1. Traffic prediction. A survey can be found here [9]

  2. Mobility data in Real Estate context and Canadian Economics Society Annual Meeting 2021 presentation

  3. GHG emission indexing

    The idea is to map vehicle type/make recognition to greenhouse gas emissions. A starting point can be a car recognition mechanism for image/video files, e.g., here on GitHub

    and a Stanford car dataset here.

    The challenge in this case is the low resolution of the raw image/video files, which makes it impossible to identify vehicle logos and headlight shapes.

  4. Crime Prevention [7]

    The idea is a Minority Report kind of mechanism

  5. Economic Recovery post Pandemic [4]

    Mobility and Engagement Index by the Federal Reserve Bank of Dallas

    The Impact of COVID-19 on Small Business Dynamics and Employment

    Bloomberg article: High-Frequency Data Prove Their Staying Power With Fed’s Buy-In

  6. Weather influence on Mobility

  7. Bike routes monitoring

    [Images: Cam and Shared Traffic; Cam and Bike Lanes]

Infrastructure
  1. Bias correction

  2. Metadata

  3. Better Object Detection

    Linux, inference with detect.py

    # clone the GitHub repo
    git clone https://github.com/ultralytics/yolov5.git

    # install requirements
    cd yolov5
    pip install -r requirements.txt

    # run detect.py with each pretrained model size (s, m, l, x);
    # the weights are downloaded automatically on first use
    python detect.py --weights yolov5s.pt
    python detect.py --weights yolov5m.pt
    python detect.py --weights yolov5l.pt
    python detect.py --weights yolov5x.pt

    # try YOLOv5 on a static image file with the smallest (s) and largest (x) models
    python detect.py --source 2020-12-06-23-22-42-enc_104_140_cam1.jpg --weights yolov5s.pt --project infer_yolov5s_2

    python detect.py --source 2020-12-06-23-22-42-enc_104_140_cam1.jpg --weights yolov5x.pt --project infer_yolov5s_3
    
    

    AutoNue Challenge 1st Prize Winner at CVPR 2021

  4. Edge Computing

  5. Hardware

  6. Scalable Object Detection Pipeline

  7. PaddleDetection
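As a follow-up to the "Better Object Detection" item, detections can be turned into per-image vehicle counts. When YOLOv5's detect.py is run with `--save-txt --save-conf`, it writes one label file per image, with one line per detection in the form `class x_center y_center width height confidence` (coordinates normalized to the image size). The sketch below counts vehicle classes from such a file. The class IDs follow the COCO classes used by the pretrained weights; the sample lines, the `count_vehicles` helper, and the 0.25 threshold are made up for illustration.

```python
from collections import Counter

# COCO class IDs (as used by the pretrained YOLOv5 weights) treated as vehicles.
VEHICLE_CLASSES = {1: "bicycle", 2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}


def count_vehicles(label_text, min_conf=0.25):
    """Count vehicle detections in the text of one YOLOv5 label file."""
    counts = Counter()
    for line in label_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        # The confidence column is only present when --save-conf was used.
        conf = float(parts[5]) if len(parts) > 5 else 1.0
        if class_id in VEHICLE_CLASSES and conf >= min_conf:
            counts[VEHICLE_CLASSES[class_id]] += 1
    return dict(counts)


# Made-up label file contents: two cars, one truck, one person, one weak bus hit.
sample = """2 0.51 0.62 0.10 0.08 0.91
2 0.30 0.55 0.09 0.07 0.40
7 0.70 0.48 0.15 0.12 0.66
0 0.12 0.80 0.03 0.09 0.88
5 0.85 0.40 0.20 0.15 0.10"""

vehicle_counts = count_vehicles(sample)
print(vehicle_counts)
```

Counts produced this way can be compared across model sizes (s, m, l, x) on the same image to judge whether a larger model recovers more vehicles from the low-resolution frames.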

Recent Update:

Related Projects, Datasets, and Repositories:

  1. NeurIPS 2021’s competition Traffic4cast
  2. Kaggle's Android smartphones high accuracy GNSS datasets
  3. Stanford car (type/make) dataset here
  4. UA-DETRAC dataset (detection/tracking), with research paper here [8]
  5. CVOnline - Compendium of Computer Vision
  6. NVidia AI CITY CHALLENGE

References:

[1] Understanding Traffic Density from Large-Scale Web Camera Data, Shanghang Zhang, Guanhang Wu, João P. Costeira, José M. F. Moura, arXiv:1703.05868 [cs.CV]

[2] A general theory for preferential sampling in environmental networks, J. Watson, J. V. Zidek, G. Shaddick, Annals of Applied Statistics, 2019, 2662-2700

[3] Object Counting on Low Quality Images: A Case Study of Near Real-Time Traffic Monitoring, Jean-Francois Rajotte, Martin Sotir, Cedric Noiseux, Louis-Philippe Noel, Thomas Bertiere, IEEE Xplore: 17, January 2019

[4] Mobility and Engagement Following the SARS-Cov-2 Outbreak, the Federal Reserve Bank of Dallas

[5] Leveraging Administrative Data for Bias Audits: Assessing Disparate Coverage with Mobility Data for COVID-19 Policy, Amanda Coston, Neel Guha, Derek Ouyang, Lisa Lu, Alexandra Chouldechova, Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. pp. 173-184, arXiv:2011.07194 [stat.AP]

[6] STREETS: A Novel Camera Network Dataset for Traffic Flow, Advances in Neural Information Processing Systems 32 (NeurIPS 2019)

[7] Leveraging Mobility Flows from Location Technology Platforms to Test Crime Pattern Theory in Large Cities, Cristina Kadar, Stefan Feuerriegel, Anastasios Noulas, Cecilia Mascolo, arXiv:2004.08263 [cs.CY]

[8] UA-DETRAC: A New Benchmark and Protocol for Multi-Object Detection and Tracking, Longyin Wen, Dawei Du, Zhaowei Cai, Zhen Lei, Ming-Ching Chang, Honggang Qi, Jongwoo Lim, Ming-Hsuan Yang, Siwei Lyu, 2020, arXiv:1511.04136 [cs.CV]

[9] Urban flows prediction from spatial-temporal data using machine learning: A survey, arXiv:1908.10218 [cs.LG]