/population_gravity

Allocates urban and rural populations for a defined region to a grid

Primary LanguagePythonBSD 2-Clause "Simplified" LicenseBSD-2-Clause

build codecov DOI

population_gravity

A model to downscale urban and rural population for a defined state to a 1 km grid

Overview

The population_gravity model allocates aggregate urban and rural populations for a defined region to grid cells within that region. In the IM3 application, the model is applied to US state-level urban and rural populations, which are allocated to 1 km grid cells within states. Allocation to grid cells is based on the relative suitability of each cell. Suitability is calculated using a gravity-based approach, in which the suitability of a given cell is determined by the population of surrounding cells (100 km radius), their distance away, and two parameters, namely alpha and beta. Alpha and beta are estimated based on historical population data and indicate the importance of returns to scale and distance in determining suitability values of cells, respectively. The model is composed of two components: calibration and projection. The calibration component uses historical urban/rural population grids of each state in 2000 and 2010 and an optimization algorithm to estimate the alpha and beta parameters that minimize the absolute difference between the actual population grid in 2010 and the one derived from the model. The two parameters can be modified to reflect distinctive forms of population development that may be desired in different socio-economic scenarios. Once the parameters are defined, the projection component downscales state-level urban/rural population aggregates of each state from 2020 to 2100 under different scenarios to grid cells within the state.

Associated publications

Zoraghein, H., & O’Neill, B. C. (2020). US State-level Projections of the Spatial Distribution of Population Consistent with Shared Socioeconomic Pathways. Sustainability, 12(8), 3374. https://doi.org/10.3390/su12083374

The input and output data used in this publication can be found here:

Zoraghein, H., & O'Neill, B. (2020). Data Supplement: U.S. state-level projections of the spatial distribution of population consistent with Shared Socioeconomic Pathways. (Version v0.1.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3756179

State-level population projections were described in this publication:

Jiang, L., B.C. O'Neill, H. Zoraghein, and S. Dahlke. 2020. Population scenarios for U.S. states consistent with Shared Socioeconomic Pathways. Environmental Research Letters, https://doi.org/10.1088/1748-9326/aba5b1.

The data produced in Jiang et al. (2020) can be downloaded from here:

Jiang, L., Dahlke, S., Zoraghein, H., & O'Neill, B.C. (2020). Population scenarios for U.S. states consistent with Shared Socioeconomic Pathways (Version v0.1.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3956412

and the state-level model code used in that publication can be found here:

Zoraghein, H., R. Nawrotzki, L. Jiang, and S. Dahlke (2020). IMMM-SFA/statepop: v0.1.0 (Version v0.1.0). Zenodo. http://doi.org/10.5281/zenodo.3956703

Getting Started

The population_gravity package uses only Python 3.6 and up.

Step 1:

You can install population_gravity from GitHub by running the following from your terminal:

python -m pip install -e git://github.com/IMMM-SFA/population_gravity.git@main#egg=population_gravity

Step 2:

Confirm that the module and its dependencies have been installed by running from your prompt:

import population_gravity

If no error is returned then you are ready to go!

Setting up a run

Expected arguments

See examples below for how to pass into the Model class

Argument Type Description
config_file string Full path to configuration YAML file with file name and extension. If not provided by the user, the code will default to the expectation of alternate arguments.
grid_coordinates_file string Full path with file name and extension to the CSV file containing the coordinates for each 1 km grid cell within the target state. File includes a header with the fields XCoord, YCoord, FID.,Where data types and field descriptions are as follows: (XCoord, float, X coordinate in meters),(YCoord, float, Y coordinate in meters),(FID, int, Unique feature id)
historical_suitability_raster string Full path with file name and extension to the suitability raster containing values from 0.0 to 1.0 for each 1 km grid cell representing suitability depending on topographic and land use and land cover characteristics within the target state.
base_rural_pop_raster string Full path with file name and extension to a raster containing rural population counts for each 1 km grid cell for the historical base time step.
base_urban_pop_raster string Full path with file name and extension to a raster containing urban population counts for each 1 km grid cell for the historical base time step.
projected_population_file string Full path with file name and extension to a CSV file containing population projections per year separated into urban and rural categories.,Field descriptions for require fields as follows: (Year, integer, four digit year), (UrbanPop, float, population count for urban), (RuralPop, float, population count for rural), (Scenario, string, scenario as set in the scenario variable)
one_dimension_indices_file string Full path with file name and extension to the text file containing a file structured as a Python list (e.g. [0, 1]) that contains the index of each grid cell when flattened from a 2D array to a 1D array for the target state.
output_directory string Full path with file name and extension to the output directory where outputs and the log file will be written.
alpha_urban float Alpha parameter for urban. Represents the degree to which the population size of surrounding cells translates into the suitability of a focal cell.,A positive value indicates that the larger the population that is located within the 100 km neighborhood, the more suitable the focal cell is.,More negative value implies less suitable. Acceptable range:,-2.0 to 2.0
beta_urban float Beta parameter for urban. Reflects the significance of distance to surrounding cells on the suitability of a focal cell.,Within 100 km, beta determines how distance modifies the effect on suitability. Acceptable range:,-2.0 to 2.0
alpha_rural float Alpha parameter for rural. Represents the degree to which the population size of surrounding cells translates into the suitability of a focal cell.,A positive value indicates that the larger the population that is located within the 100 km neighborhood, the more suitable the focal cell is.,More negative value implies less suitable. Acceptable range:,-2.0 to 2.0
beta_rural float Beta parameter for rural. Reflects the significance of distance to surrounding cells on the suitability of a focal cell.,Within 100 km, beta determines how distance modifies the effect on suitability. Acceptable range:,-2.0 to 2.0
scenario string String representing the scenario with no spaces. Must match what is in the projected_population_file if passing population projections in using a file.
state_name string Target state name with no spaces separated by an underscore.
historic_base_year integer Four digit historic base year.
projection_year integer Four digit first year to process for the projection.
time_step integer Number of steps (e.g. number of years between projections)
rural_pop_proj_n float Rural population projection count for the projected year being calculated. These can be read from the projected_population_file instead.
urban_pop_proj_n float Urban population projection count for the projected year being calculated. These can be read from the projected_population_file instead.
calibration_urban_year_one_raster string Only used for running calibration. Full path with file name and extension to a raster containing urban population counts for each 1 km grid cell for year one of the calibration.
calibration_urban_year_two_raster string Only used for running calibration. Full path with file name and extension to a raster containing urban population counts for each 1 km grid cell for year two of the calibration.
calibration_rural_year_one_raster string Only used for running calibration. Full path with file name and extension to a raster containing rural population counts for each 1 km grid cell for year one of the calibration.
calibration_rural_year_two_raster string Only used for running calibration. Full path with file name and extension to a raster containing rural population counts for each 1 km grid cell for year two of the calibration.
kernel_distance_meters float Distance kernel in meters; default 100,000 meters.
write_csv boolean Optionally export raster as a CSV file without nodata values; option set to compress CSV using gzip. Exports values for non-NODATA grid cells as field name value.
compress_csv boolean Optionally compress CSV file to GZIP if outputting in CSV; Default True
write_raster boolean Optionally export raster output; Default True
write_array2d boolean Optionally export a NumPy 2D array for each output in the shape of the template raster
write_array1d boolean Optionally export a Numpy 1D flattened array of only grid cells within the target state
run_number int Add on for the file name when running sensitivity analysis
output_total boolean Choice to output total (urban + rural) dataset; Defualt True
brute_n_alphas int Number of samples for alphas over the line space when using brute force for pass one
brute_n_betas int Number of samples for betas over the line space when using brute force for pass one
pass_one_alpha_upper float Parameter bounds for the first optimization pass
pass_one_alpha_lower float Parameter bounds for the first optimization pass
pass_one_beta_upper float Parameter bounds for the first optimization pass
pass_one_beta_lower float Parameter bounds for the first optimization pass
pass_two_alpha_upper float Parameter bounds for the seconds optimization pass
pass_two_alpha_lower float Parameter bounds for the seconds optimization pass
pass_two_beta_upper float Parameter bounds for the seconds optimization pass
pass_two_beta_lower float Parameter bounds for the seconds optimization pass

Variable arguments

Users can update variable argument values after model initialization. The following are variable arguments:

  • alpha_urban
  • beta_urban
  • alpha_rural
  • beta_rural
  • urban_pop_proj_n
  • rural_pop_proj_n
  • kernel_distance_meters

YAML configuration file option (e.g., config.yml)

Arguments can be passed into the Model class using a YAML configuration file as well (see Example 1):

Generate calibration parameters

If the calibration has not yet been conducted, follow Example 2 to generate calibration parameters for a target state.

Expected outputs

Each downscaling run will output a raster for urban, rural, and total population count for each 1 km grid cell for the target state. These will be written to where the output_directory has been assigned.

Each calibration run will output a CSV file containing the calibration parameters for the target state and scenario. These will be written to where the output_directory has been assigned.

Examples

Download the original data

Download and unzip the inputs and outputs as archived in Zoraghein and O'Neill (2020) from the following Zenodo archive: zoraghein-oneill_population_gravity_inputs_outputs.zip

Example 1: Run population downscaling for Vermont using year 2010 as the base year to downscale population projections for 2020. Write outputs as GeoTiff files.

from population_gravity import Model

# instantiate model
run = Model(grid_coordinates_file='<Full path with file name and extension to the file>',
            base_rural_pop_raster='<Full path with file name and extension to the file>',
            base_urban_pop_raster='<Full path with file name and extension to the file>',
            historical_suitability_raster='<Full path with file name and extension to the file>',
            projected_population_file='<Full path with file name and extension to the file>',
            one_dimension_indices_file='<Full path with file name and extension to the file>',
            output_directory='<Full path to the desired directory>',
            alpha_urban=alpha_urban,
            alpha_rural=alpha_rural,
            beta_urban=beta_urban,
            beta_rural=beta_rural,
            kernel_distance_meters=kernel_distance_meters,
            scenario=scenario,
            state_name=target_state,
            historic_base_year=historical_year,
            projection_year=projection_year,
            write_raster=write_raster,
            write_logfile=write_logfile,
            output_total=output_total,
            write_array1d=write_array1d,
            run_number=sample_id)

run.downscale()

Example 3: Calibrate downscaling parameters for a target state via script

from population_gravity import Model


run = Model(grid_coordinates_file='<Full path with file name and extension to the file>',
            historical_suitability_raster='<Full path with file name and extension to the file>',
            one_dimension_indices_file='<Full path with file name and extension to the file>',
            output_directory='<Full path with file name and extension to the file>',
            kernel_distance_meters=100000,
            state_name='vermont',
            scenario='SSP2',

            # calibration specific entries
            calibration_urban_year_one_raster='<Full path with file name and extension to the file>',
            calibration_urban_year_two_raster='<Full path with file name and extension to the file>',
            calibration_rural_year_one_raster='<Full path with file name and extension to the file>',
            calibration_rural_year_two_raster='<Full path with file name and extension to the file>',

            # number of samples for alphas and betas over the line space when using brute force for pass one
            brute_n_alphas=10,
            brute_n_betas=5,

            # parameter bounds for the first optimization pass
            pass_one_alpha_upper=1.0,
            pass_one_alpha_lower=-1.0,
            pass_one_beta_upper=1.0,
            pass_one_beta_lower=0.0,

            # parameter bounds for the seconds optimization pass
            pass_two_alpha_upper=2.0,
            pass_two_alpha_lower=-2.0,
            pass_two_beta_upper=2.0,
            pass_two_beta_lower=-0.5)

run.calibrate()