GeoDS Lab, Department of Geography, University of Wisconsin-Madison.
Please refer to https://github.com/GeoDS/COVID19USFlows to access all tools for downloading the datasets. This repository only provides part of the dataset.
If you use this dataset in your research or applications, please cite this source:
Kang, Y., Gao, S., Liang, Y., Li, M., Rao, J., and Kruse, J. Multiscale dynamic human mobility flow dataset in the U.S. during the COVID-19 epidemic. Scientific Data 7, 390 (2020). https://www.nature.com/articles/s41597-020-00734-5
@article{kang2020multiscale,
title = {Multiscale Dynamic Human Mobility Flow Dataset in the U.S. during the COVID-19 Epidemic},
author = {Kang, Yuhao and Gao, Song and Liang, Yunlei and Li, Mingxiao and Rao, Jinmeng and Kruse, Jake},
journal = {Scientific Data},
volume = {7},
issue = {390},
pages = {1--13},
year = {2020}
}
Understanding dynamic human mobility changes and spatial interaction patterns at different geographic scales is crucial for monitoring and measuring the impacts of non-pharmaceutical interventions (such as stay-at-home orders) during the pandemic. In this data descriptor, we introduce an up-to-date multiscale dynamic human mobility flow dataset across the United States, with data starting from January 1st, 2019. By analyzing millions of anonymous mobile phone users’ visit trajectories to various places provided by SafeGraph, the daily and weekly dynamic origin-to-destination (O-D) population flows are computed, aggregated, and inferred at three geographic scales: census tract, county, and state. There is high correlation between our mobility flow dataset and openly available data sources, which shows the reliability of the produced data. Such a high spatiotemporal resolution human mobility flow dataset at different geographic scales over time may help monitor epidemic spreading dynamics, inform public health policy, and deepen our understanding of human behavior changes under the unprecedented public health crisis. This up-to-date O-D flow open data can support many other social sensing and transportation applications.
A full description of the methodology used for this study can be found here: https://arxiv.org/abs/2008.12238.
Due to GitHub's data size restrictions, we have split our repository into a set of smaller data repositories. Each data repository follows the same folder structure but contains only part of the dataset. Here are the details of each repository:
Data Repository | Data Type | Scale | Time Range |
---|---|---|---|
COVID19USFlows | index page | --- | 2019-2021 |
COVID19USFlows-WeeklyFlows | weekly data | state, county | 2019-2021 |
COVID19USFlows-WeeklyFlows-Ct2019 | weekly data | census tract | 2019 |
COVID19USFlows-WeeklyFlows-Ct2020 | weekly data | census tract | 2020 |
COVID19USFlows-WeeklyFlows-Ct2021 | weekly data | census tract | 2021 |
COVID19USFlows-DailyFlows | daily data | state, county | 2019-2021 |
COVID19USFlows-DailyFlows-Ct2019-1 | daily data | census tract | 01/2019-04/2019 |
COVID19USFlows-DailyFlows-Ct2019-2 | daily data | census tract | 05/2019-08/2019 |
COVID19USFlows-DailyFlows-Ct2019-3 | daily data | census tract | 09/2019-12/2019 |
COVID19USFlows-DailyFlows-Ct2020-1 | daily data | census tract | 01/2020-04/2020 |
COVID19USFlows-DailyFlows-Ct2020-2 | daily data | census tract | 05/2020-08/2020 |
COVID19USFlows-DailyFlows-Ct2020-3 | daily data | census tract | 09/2020-12/2020 |
COVID19USFlows-DailyFlows-Ct2021 | daily data | census tract | 01/2021-04/2021 |
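The table above can be read as a lookup rule: data type, scale, and date together decide which repository holds a given file. A minimal sketch of that rule in Python follows. The function name `repo_for` and its parameters are hypothetical and not part of the project's tooling; the year and month boundaries are copied from the table, and which day of a week to pass for weekly data is left as an assumption.

```python
from datetime import date


def repo_for(kind: str, scale: str, day: date) -> str:
    """Return the data repository name for a flow file, following the table.

    kind  : "weekly" or "daily"
    scale : "state", "county", or "ct" (census tract)
    day   : a date covered by the file (hypothetical convention)
    """
    if kind not in ("weekly", "daily"):
        raise ValueError(f"unknown data type: {kind!r}")
    if scale not in ("state", "county", "ct"):
        raise ValueError(f"unknown scale: {scale!r}")
    if not 2019 <= day.year <= 2021:
        raise ValueError("the table only covers 2019-2021")

    base = "COVID19USFlows-" + ("WeeklyFlows" if kind == "weekly" else "DailyFlows")

    # State and county flows share one repository per data type.
    if scale in ("state", "county"):
        return base

    # Weekly census tract flows are split by year.
    if kind == "weekly":
        return f"{base}-Ct{day.year}"

    # Daily census tract flows: 2021 is a single repository (01/2021-04/2021).
    if day.year == 2021:
        if day.month > 4:
            raise ValueError("the table lists daily tract data only through 04/2021")
        return f"{base}-Ct2021"

    # 2019 and 2020 are split into three four-month parts.
    part = (day.month - 1) // 4 + 1
    return f"{base}-Ct{day.year}-{part}"
```

For example, `repo_for("daily", "ct", date(2020, 5, 1))` resolves to the second 2020 daily census tract repository.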
A description of all attributes in the database is shown below. Weekly flow files contain the following attributes:
geoid_o - Unique identifier of the origin geographic unit (census tract, county, or state). Type: string.
geoid_d - Unique identifier of the destination geographic unit (census tract, county, or state). Type: string.
lat_o - Latitude of the geometric centroid of the origin unit. Type: float.
lng_o - Longitude of the geometric centroid of the origin unit. Type: float.
lat_d - Latitude of the geometric centroid of the destination unit. Type: float.
lng_d - Longitude of the geometric centroid of the destination unit. Type: float.
date_range - Date range of the records. Type: string.
visitor_flows - Estimated number of visitors detected by SafeGraph between the two geographic units (from geoid_o to geoid_d). Type: float.
pop_flows - Estimated population flows between the two geographic units (from geoid_o to geoid_d), inferred from visitor_flows. Type: float.
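When loading these files, it helps to keep the GEOIDs as strings, since FIPS-based identifiers can begin with a zero (for example, state code `01`) that a numeric parse would drop. The sketch below uses Python's standard `csv` module and assumes the files are comma-separated with a header row matching the attribute names above; that layout is an assumption, not something this page states. The helper name `read_flows` is hypothetical, and the sample rows are synthetic values that only illustrate the column layout.

```python
import csv
import io

# Columns documented as "Type: float"; everything else stays a string.
FLOAT_FIELDS = ("lat_o", "lng_o", "lat_d", "lng_d", "visitor_flows", "pop_flows")


def read_flows(handle):
    """Yield one dict per row, converting only the documented float columns."""
    for row in csv.DictReader(handle):
        for name in FLOAT_FIELDS:
            value = row.get(name)
            if value:  # skip columns that are absent or empty
                row[name] = float(value)
        yield row


# Synthetic sample (made-up values, not taken from the dataset).
# The date_range value is a placeholder; its real format is not assumed here.
SAMPLE = """geoid_o,geoid_d,lat_o,lng_o,lat_d,lng_d,date_range,visitor_flows,pop_flows
01,55,32.8,-86.8,44.6,-89.9,EXAMPLE_WEEK,10,120.5
55,55,44.6,-89.9,44.6,-89.9,EXAMPLE_WEEK,2000,24000
"""

rows = list(read_flows(io.StringIO(SAMPLE)))
```

Because missing columns are skipped rather than required, the same reader would also accept a file that lacks the origin coordinates.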
Daily flow files contain the following attributes:
geoid_o - Unique identifier of the origin geographic unit (census tract, county, or state). Type: string.
geoid_d - Unique identifier of the destination geographic unit (census tract, county, or state). Type: string.
lat_o - Latitude of the geometric centroid of the origin unit. Type: float.
lng_o - Longitude of the geometric centroid of the origin unit. Type: float.
lat_d - Latitude of the geometric centroid of the destination unit. Type: float.
lng_d - Longitude of the geometric centroid of the destination unit. Type: float.
date - Date of the records. Type: string.
visitor_flows - Estimated number of visitors between the two geographic units (from geoid_o to geoid_d). Type: float.
pop_flows - Estimated population flows between the two geographic units (from geoid_o to geoid_d), inferred from visitor_flows. Type: float.
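The three scales are nested, which makes it possible to summarize fine-grained records at a coarser level. The sketch below sums `pop_flows` after truncating each GEOID to a coarser prefix. It assumes the identifiers follow the standard Census GEOID layout (2-digit state, 5-digit county, 11-digit tract); the function name `roll_up` is hypothetical and the records are synthetic. The resulting totals are not guaranteed to match the published county or state files, since those are computed by the authors' own pipeline.

```python
from collections import defaultdict

# Standard Census GEOID prefix lengths (assumed to apply to this dataset).
PREFIX_LEN = {"state": 2, "county": 5}


def roll_up(flows, scale):
    """Sum pop_flows per (origin, destination) pair at a coarser scale.

    flows : iterable of dicts with string geoid_o / geoid_d and numeric pop_flows
    scale : "state" or "county"
    """
    n = PREFIX_LEN[scale]
    totals = defaultdict(float)
    for f in flows:
        key = (f["geoid_o"][:n], f["geoid_d"][:n])
        totals[key] += f["pop_flows"]
    return dict(totals)


# Synthetic tract-level records (made-up values, not taken from the dataset).
tract_flows = [
    {"geoid_o": "55025000100", "geoid_d": "55025000200", "pop_flows": 1.5},
    {"geoid_o": "55025000100", "geoid_d": "55079000100", "pop_flows": 2.0},
    {"geoid_o": "55025000300", "geoid_d": "55079000500", "pop_flows": 0.5},
]

county_totals = roll_up(tract_flows, "county")
state_totals = roll_up(tract_flows, "state")
```

Truncation only works while the GEOIDs are strings; parsing them as integers would lose leading zeros and shift the prefixes.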
We also provide a new dataset that contains flows from other countries to the U.S., with the following attributes:
geoid_o - Two-letter country codes of the origin country. Type: string.
geoid_d - Unique identifier of the destination geographic unit in the United States (census tract, county, or state). Type: string.
lat_d - Latitude of the geometric centroid of the destination unit. Type: float.
lng_d - Longitude of the geometric centroid of the destination unit. Type: float.
visitor_flows - Estimated number of visitors detected by SafeGraph between the two geographic units (from geoid_o to geoid_d). Type: float.
date_range - Date range of the records. Type: string.
Distributed under the MIT License. See LICENSE for more information.
Song Gao - @gissong - song.gao at wisc.edu
Yuhao Kang - @YuhaoKang - yuhao.kang at wisc.edu
Project Link: https://github.com/GeoDS/COVID19USFlows
We would like to thank the funding support provided by the National Science Foundation (Award No. BCS-2027375). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Support for this research was partly provided by the University of Wisconsin - Madison Office of the Vice Chancellor for Research and Graduate Education with funding from the Wisconsin Alumni Research Foundation.