Demeter

A land-use and land-cover disaggregation and change detection model

Current release

Overview

Demeter is an open source Python package that was built to disaggregate projections of future land allocations generated by an integrated human-Earth systems model, also known as an integrated assessment model (IAM). Projected land allocation from IAMs is traditionally transferred to Earth System Models (ESMs) in a variety of gridded formats and spatial resolutions as inputs for simulating biophysical and biogeochemical fluxes. Existing tools for performing this translation generally require a number of manual steps, which introduce error and are inefficient. Demeter makes this process seamless and repeatable by providing gridded land use and land cover change (LULCC) products derived directly from an IAM (in this case, the Global Change Analysis Model, GCAM) in a variety of formats and resolutions commonly used by ESMs. Demeter is publicly available via GitHub and has an extensible output module that allows future ESM needs to be easily accommodated.

Installation

The following steps will get Demeter ready to use:

1. This repository uses the Git Large File Storage (LFS) extension (see https://git-lfs.github.com/ for details). Please run the following command before cloning if you do not already have Git LFS installed: git lfs install
2. Clone Demeter into your desired location: git clone https://github.com/IMMM-SFA/demeter.git
3. From the directory you cloned Demeter into, run python setup.py install. This will install Demeter as a Python package on your machine and install all of the needed dependencies.
4. Set up your configuration file (.ini). Examples are located in the "example" directory. Be sure to change the root directory to the directory that holds your data (use the 'demeter/example' directory as an example).
5. If running Demeter from an IDE: Be sure to include the path to your config file. See the "demeter/example/example.py" script as a reference.
6. If running Demeter from terminal: Run model.py found in demeter/demeter/model.py, passing the full path to the config file as the only argument (e.g., python model.py /users/ladmin/repos/github/demeter/example/config.ini).

If a permissions error is encountered, either run the command with sudo or, on Windows, open cmd as an administrator.

Setup

Demeter requires several input files to be set up before a run can begin. Examples of all input files can be found in the 'example' directory, and the expected file structure is outlined below:

Example directory
- Inputs directory
  - Allocation directory
    - Constraint weighting file
    - GCAM landclass allocation file
    - Kernel density weighting file
    - Spatial landclass allocation file
    - Transition priority file
    - Treatment order file
  - Observed spatial data directory
    - Observed spatial data file
  - Constraint data directory
    - Constraint files
  - Projected GCAM land allocation directory
    - GCAM land allocation file
  - Reference data directory
    - Reference files

The following describes the requirements and format of each input.

Observed spatial data:

This file represents the area in square degrees of each land class existing within a grid cell. The grid cell size is defined by the user. This file must be presented as a comma-separated values (CSV) file having a header in the first row and must contain the field names and fields described in Table 1.

Field	Description
fid	Unique integer ID for each grid cell (one per latitude, longitude pair)
landclass	Each land class field name (e.g., shrub, grass, etc.). Field names must not include commas.
region_id	The integer ID of the GCAM region that the grid cell is contained in. Exact field name spelling required.
metric_id	The integer ID of the GCAM AEZ or basin that the grid cell is contained in. Exact field name spelling required.
latitude	The geographic latitude value of the grid cell centroid as a signed float. Exact field name spelling required.
longitude	The geographic longitude value of the grid cell centroid as a signed float. Exact field name spelling required.

Table 1. Observed spatial data required fields and their descriptions.
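The field requirements in Table 1 can be checked before a run with a few lines of Python. The following is a minimal sketch, not part of Demeter: the function name is hypothetical, the ID field is assumed to be named fid as in Table 1, and every column that is not a required field is assumed to be a land class.

```python
import csv
import io

# Field names that Table 1 says must be spelled exactly.
EXACT_FIELDS = {"region_id", "metric_id", "latitude", "longitude"}


def check_observed_header(csv_text, id_field="fid"):
    """Return the land class field names found in an observed spatial data CSV.

    Raises ValueError if a required field from Table 1 is missing.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader)]

    required = EXACT_FIELDS | {id_field}
    missing = required - set(header)
    if missing:
        raise ValueError("missing required fields: " + ", ".join(sorted(missing)))

    # Assumption: every remaining column is a land class (e.g., shrub, grass).
    return [name for name in header if name not in required]


# Hypothetical two-class file used only to exercise the check.
sample = (
    "fid,shrub,grass,region_id,metric_id,latitude,longitude\n"
    "1,0.01,0.02,1,10,45.125,-120.125\n"
)
land_classes = check_observed_header(sample)
print(land_classes)
```

In practice the text would come from the observed spatial data file named by observed_lu_data in the configuration file.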

Projected land allocation data:

This file is the formatted GCAM run output for land allocation projections. Since the format of this file can vary based on GCAM user preference, the file must be formatted to meet Demeter input requirements as described in Table 2. The file must be a CSV file having the header in the first row.

Field	Description
region	The text name of the GCAM region. Exact field name spelling required.
landclass	Each land class field name (e.g., shrub, grass, etc.). Field names must not include commas.
year	Each year of the GCAM run as an integer (e.g., 2005, 2010, etc.)
metric_id	The integer ID of the GCAM AEZ or basin. Exact field name spelling required.

Table 2. Projected land allocation required fields from GCAM.
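As an illustration of Table 2, the sketch below inspects the header of a projected land allocation file and works out which years fall within a run. It assumes a wide layout in which landclass is a column of class names and each year is its own column; Table 2 does not spell out the layout, so treat this as one possible reading. The function names are hypothetical.

```python
import csv
import io

# Key fields listed in Table 2 (layout assumption: one column each).
KEY_FIELDS = ("region", "landclass", "metric_id")


def year_columns(csv_text):
    """Return the sorted list of year columns found in the header row."""
    header = [name.strip() for name in next(csv.reader(io.StringIO(csv_text)))]
    missing = [name for name in KEY_FIELDS if name not in header]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    # Assumption: any header that consists only of digits is a model year.
    return sorted(int(name) for name in header if name.isdigit())


def years_to_process(available, start_year, end_year, timestep):
    """Years between start_year and end_year, at the given step, that the file provides."""
    wanted = range(start_year, end_year + 1, timestep)
    return [year for year in wanted if year in available]


# Hypothetical one-row file used only to exercise the functions.
sample = "region,landclass,metric_id,2005,2010,2015\nUSA,shrub,10,12.5,13.0,13.4\n"
years = year_columns(sample)
selected = years_to_process(years, start_year=2005, end_year=2015, timestep=10)
print(years, selected)
```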

Allocation files:

Constraint weighting:

A weight for each constraint, with a value ranging from -1.0 to 1.0, can be applied to each land class. If no constraints are desired, the user should simply provide a header-only file. For example, for the soil quality constraint, a weight of -1 indicates that the land class is fully constrained inversely (e.g., grasslands are opportunistic and grow readily in areas with low soil quality); a weight of 0 indicates that soil quality exerts no constraint on the land class (e.g., forest); and a weight of 1 indicates that high soil quality will strongly influence where the land class is spatially allocated (e.g., cropland). The constraints themselves are developed in separate files, as described in the Constraints section below. See the constraint weighting file in the example inputs for reference.
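To make the meaning of the sign concrete, the sketch below combines a per-cell constraint value (0.0 to 1.0, as described in the Constraints section) with a land class weight (-1.0 to 1.0). The blending rule is an illustrative assumption chosen to match the three cases described above; it is not taken from Demeter's source code, and the function name is hypothetical.

```python
def weighted_constraint(constraint_value, weight):
    """Blend a per-cell constraint value with a land class weight.

    Illustrative rule (an assumption, not Demeter's implementation):
      weight > 0  favours cells with a high constraint value,
      weight < 0  favours cells with a low constraint value,
      weight == 0 ignores the constraint (every cell scores 1.0).
    """
    if not -1.0 <= weight <= 1.0:
        raise ValueError("weight must be between -1.0 and 1.0")
    if not 0.0 <= constraint_value <= 1.0:
        raise ValueError("constraint value must be between 0.0 and 1.0")

    # Invert the constraint for negative weights so low values score high.
    directed = constraint_value if weight >= 0 else 1.0 - constraint_value
    strength = abs(weight)
    # Linear blend between "no constraint" (1.0) and the directed value.
    return (1.0 - strength) + strength * directed


# Soil quality of 0.8 seen by three hypothetical land classes.
cropland_score = weighted_constraint(0.8, 1.0)    # high quality attracts cropland
forest_score = weighted_constraint(0.8, 0.0)      # forest ignores soil quality
grassland_score = weighted_constraint(0.8, -1.0)  # grassland prefers low quality
print(cropland_score, forest_score, grassland_score)
```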

Kernel density weighting:

This file weights the degree to which land classes subjected to a kernel density filter will be utilized during expansion into each class. Values range from 0.0 to 1.0. See the kernel density weighting file in the example inputs for reference.

Transition priority:

This ordering defines the preferential order of final land allocation (e.g., crops expanding into grasslands rather than forests). See the priority allocation file in the example inputs for reference.

Treatment order:

Defines the order in which final land classes are downscaled. This will influence the results (e.g., if crops are downscaled first and overtake grassland, grassland will not be available for shrubs to overtake when processing shrub land). See the treatment order file in the example inputs for reference.
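The effect of treatment order can be seen in a toy allocation. The sketch below is a deliberately simplified stand-in for the downscaling step, written only to show why the order matters: each land class, in treatment order, takes area from donor classes in transition priority order. The function, class names, and areas are all hypothetical.

```python
def downscale_in_order(treatment_order, demand, available, priority):
    """Toy allocation: each class takes land from its donor classes in priority order."""
    available = dict(available)  # work on a copy so the caller's data is untouched
    gained = {}
    for land_class in treatment_order:
        needed = demand.get(land_class, 0.0)
        taken = 0.0
        for donor in priority.get(land_class, []):
            grab = min(needed - taken, available.get(donor, 0.0))
            available[donor] = available.get(donor, 0.0) - grab
            taken += grab
        gained[land_class] = taken
    return gained


available = {"grass": 10.0}                         # area open to conversion
demand = {"crops": 10.0, "shrub": 5.0}              # projected expansion
priority = {"crops": ["grass"], "shrub": ["grass"]}  # both expand into grass

crops_first = downscale_in_order(["crops", "shrub"], demand, available, priority)
shrub_first = downscale_in_order(["shrub", "crops"], demand, available, priority)
print(crops_first)
print(shrub_first)
```

When crops are treated first they consume all of the grassland and shrubs gain nothing; reversing the order lets shrubs meet their demand and leaves crops short.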

Observational spatial data class allocation:

This file defines how the land-use and land-cover classes in the observed spatial data (OSD) will be binned into final land classes for output. The final land classes can be defined by the user and serve to place projected land allocation data from GCAM on a common scale with the on-the-ground representation of land use and land cover in the OSD. See the observed spatial data class allocation file in the example inputs for reference.

Projected land class allocation:

This file defines how the land-use and land-cover classes in the GCAM projected land allocation data will be binned into final classes. See the projected land class allocation file in the example inputs for reference.

Constraints (not required):

As discussed earlier, constraints such as soil quality may be desirable to the user and can be prepared by assigning a weighted value from 0.0 to 1.0 to each grid cell in the OSD. Spatial maps of constraints should be provided by the user for application during the downscaling process. Users should note that constraining a grid cell to 0.0 may prevent Demeter from achieving a projected land allocation from GCAM, since land area that GCAM expects is being excluded. Each constraint file must have two fields: fid and weight. The fid field should correspond to the fid field in the OSD input, and the weight field should be the weight of the constraint for the corresponding cell in the OSD input. Each file should be a CSV with no header.
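A constraint file in the format described above (two columns, fid then weight, no header row) can be produced with Python's csv module. This is a minimal sketch; the function name and the example weights are hypothetical.

```python
import csv
import io


def write_constraint_rows(weights_by_fid, handle):
    """Write fid,weight rows with no header, validating each weight."""
    writer = csv.writer(handle, lineterminator="\n")
    for fid in sorted(weights_by_fid):
        weight = weights_by_fid[fid]
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight for fid {fid} is outside 0.0 to 1.0")
        writer.writerow([fid, weight])


# Hypothetical weights for two grid cells; an in-memory buffer stands in for a file.
buffer = io.StringIO()
write_constraint_rows({2: 1.0, 1: 0.35}, buffer)
constraint_csv = buffer.getvalue()
print(constraint_csv)
```

To write to disk, pass a file opened with open(path, "w", newline="") in place of the buffer.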

Configuration file:

Demeter’s configuration file allows the user to customize each run and to define where input files are located and where outputs will be written. The configuration options and their hierarchical levels are described in Table 3.

Level	Parameter	Description
STRUCTURE	root_dir	The full path of the root directory where the inputs and outputs directory are stored
STRUCTURE	in_dir	The name of the input directory
STRUCTURE	out_dir	The name of the output directory
INPUTS	allocation_dir	The name of the directory that holds the allocation files
INPUTS	observed_dir	The name of the directory that holds the observed spatial data file
INPUTS	constraints_dir	The name of the directory that holds the constraints files
INPUTS	projected_dir	The name of the directory that holds the GCAM projected land allocation file
INPUTS	ref_dir	The name of the directory that holds the reference files
INPUTS - ALLOCATION	spatial_allocation	The file name with extension of the observed spatial data class allocation
INPUTS - ALLOCATION	gcam_allocation	The file name with extension of the projected land class allocation
INPUTS - ALLOCATION	kernel_allocation	The file name with extension of the kernel density weighting
INPUTS - ALLOCATION	priority_allocation	The file name with extension of the priority allocation
INPUTS - ALLOCATION	treatment_order	The file name with extension of the treatment order
INPUTS - ALLOCATION	constraints	The file name with extension of the constraint weighting
INPUTS - OBSERVED	observed_lu_data	The file name with extension of the observational spatial data
INPUTS - PROJECTED	projected_lu_data	The file name with extension of the projected land allocation data from GCAM
INPUTS - REFERENCE	gcam_regnamefile	The file name with extension of the GCAM region name to region id lookup
INPUTS - REFERENCE	region_coords	A CSV file of GCAM region coordinates for each grid cell
INPUTS - REFERENCE	country_coords	A CSV file of GCAM country coordinates for each grid cell
OUTPUTS	diag_dir	The name of the directory where diagnostics outputs will be kept
OUTPUTS	log_dir	The name of the directory where the log file outputs will be kept
OUTPUTS	kernel_map_dir	The name of the directory where kernel density map outputs will be kept
OUTPUTS	transition_tabular	The name of the directory where tabular land transition outputs will be kept
OUTPUTS	transition_maps	The name of the directory where land transition map outputs will be kept
OUTPUTS	luc_intense_p1_dir	The name of the directory where the land intensification first pass map outputs will be kept
OUTPUTS	luc_intense_p2_dir	The name of the directory where the land intensification second pass map outputs will be kept
OUTPUTS	luc_expand_dir	The name of the directory where the land expansion map outputs will be kept
OUTPUTS	luc_ts_luc	The name of the directory where the land use change per time step map outputs will be kept
OUTPUTS	lc_per_step_csv	The name of the directory where the tabular land change per time step outputs will be kept
OUTPUTS	lc_per_step_nc	The name of the directory where the NetCDF land change per time step outputs will be kept
OUTPUTS	lc_per_step_shp	The name of the directory where the Shapefile land change per time step outputs will be kept
OUTPUTS - DIAGNOSTICS	harm_coeff	The file name with extension of the NumPy array that will hold the harmonization coefficient data
OUTPUTS - DIAGNOSTICS	intense_pass1_diag	The file name with extension of the CSV that will hold the land allocation per time step per functional type for the first pass of intensification
OUTPUTS - DIAGNOSTICS	intense_pass2_diag	The file name with extension of the CSV that will hold the land allocation per time step per functional type for the second pass of intensification
OUTPUTS - DIAGNOSTICS	expansion_diag	The file name with extension of the CSV that will hold the land allocation per time step per functional type for the expansion pass
PARAMS	model	The model name providing the projected land allocation data (e.g., GCAM)
PARAMS	metric	Subregion type (either AEZ or BASIN)
PARAMS	scenario	Scenario name
PARAMS	run_desc	The description of the current run
PARAMS	agg_level	1 if only by metric, 2 if by region and metric
PARAMS	observed_id_field	Observed spatial data unique field name (e.g. fid)
PARAMS	start_year	First time step to process (e.g., 2005)
PARAMS	end_year	Last time step to process (e.g., 2100)
PARAMS	use_constraints	1 to use constraints, 0 to ignore constraints
PARAMS	spatial_resolution	Spatial resolution of the observed spatial data in decimal degrees (e.g. 0.25)
PARAMS	errortol	Allowable error tolerance in square kilometres for non-accomplished change
PARAMS	timestep	Time step interval (e.g., 5 years) for the output data. This time step is the increment that Demeter will process when starting with the base year.
PARAMS	proj_factor	Factor to multiply the projected land allocation by
PARAMS	diagnostic	0 to not output diagnostics, 1 to output
PARAMS	intensification_ratio	Ideal fraction of land change that will occur during intensification. The remainder will be through expansion. Value from 0.0 to 1.0.
PARAMS	stochastic_expansion	0 to not conduct stochastic expansion of grid cells, 1 to conduct
PARAMS	selection_threshold	Threshold above which grid cells are selected to receive expansion for a target functional type from the kernel density filter. Value from 0.0 to 1.0, where 0 lets all land cells receive expansion and 1 lets only the grid cells with the maximum likelihood expand.
PARAMS	kernel_distance	Radius in grid cells used to build the kernel density convolution filter used during expansion
PARAMS	map_kernels	0 to not map kernel density, 1 to map
PARAMS	map_luc_pft	0 to not map land change per land class per time step, 1 to map
PARAMS	map_luc_steps	0 to not map land change per time step per land class for intensification and expansion, 1 to map
PARAMS	map_transitions	0 to not map land transitions, 1 to map
PARAMS	target_years_output	Years to save data for; default is ‘all’; otherwise a semicolon delimited string (e.g., 2005; 2020)
PARAMS	save_tabular	Save tabular spatial land cover as a CSV; define tabular units in tabular_units param
PARAMS	tabular_units	Units to output the spatial land cover data in; either ‘sqkm’ or 'fraction'
PARAMS	save_transitions	0 to not write CSV files for each land transitions per land type, 1 to write
PARAMS	save_shapefile	0 to not write a Shapefile for each time step containing all functional types, 1 to write; output units will be the same as the tabular data
PARAMS	save_netcdf_yr	0 to not write a NetCDF file for each year of the fraction of land cover of each land class by grid cell; 1 to write
PARAMS	save_netcdf_lc	0 to not write a NetCDF file for each land class of the fraction of land cover it takes up over all years by grid cell; 1 to write
PARAMS	save_ascii_max	0 to not create an ASCII raster of the land class with the maximum area for each grid cell; 1 to write
ENSEMBLE	permutations	If running an ensemble of configurations, this is the number of permutations to process
ENSEMBLE	limits_file	If running an ensemble of configurations, this is the full path to a CSV file containing limits to generate ensembles of certain parameters.
ENSEMBLE	n_jobs	If running an ensemble of configurations, this is the number of CPUs to spread the parallel processing over. -1 is all, -2 is all but one, 4 is four, etc.

Table 3. Configuration file hierarchy, parameters, and descriptions.
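Several of the PARAMS entries in Table 3 have stated ranges or allowed values that can be checked before launching a run. The sketch below works on a plain dictionary of strings rather than on the .ini file itself, and it is not Demeter's own validation; the function names and the example values are hypothetical.

```python
def parse_target_years(value):
    """Parse target_years_output: 'all' or a semicolon-delimited list of years."""
    text = value.strip()
    if text.lower() == "all":
        return "all"
    return [int(part) for part in text.split(";") if part.strip()]


def check_params(params):
    """Collect readable problems with a PARAMS mapping; an empty list means OK."""
    problems = []
    for name in ("intensification_ratio", "selection_threshold"):
        if not 0.0 <= float(params[name]) <= 1.0:
            problems.append(f"{name} must be between 0.0 and 1.0")
    if int(params["start_year"]) > int(params["end_year"]):
        problems.append("start_year must not be later than end_year")
    if params["metric"].upper() not in ("AEZ", "BASIN"):
        problems.append("metric must be AEZ or BASIN")
    if params["tabular_units"] not in ("sqkm", "fraction"):
        problems.append("tabular_units must be sqkm or fraction")
    for flag in ("use_constraints", "diagnostic", "stochastic_expansion"):
        if str(params[flag]) not in ("0", "1"):
            problems.append(f"{flag} must be 0 or 1")
    return problems


# Hypothetical values used only to exercise the checks.
example = {
    "intensification_ratio": "0.8",
    "selection_threshold": "0.75",
    "start_year": "2005",
    "end_year": "2100",
    "metric": "basin",
    "tabular_units": "sqkm",
    "use_constraints": "1",
    "diagnostic": "0",
    "stochastic_expansion": "1",
}
problems = check_params(example)
target_years = parse_target_years("2005; 2020")
print(problems, target_years)
```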

Workflow

Figure 1 details Demeter’s workflow once the input files have been prepared. The process can be outlined as follows:

1. The configuration file is read and input parameters are validated.
2. Input data is read and validated.
3. Grid area discrepancies between the OSD and the projected land allocation from GCAM are harmonized by adjusting the GCAM allocation areas using a correction factor (the ratio of the OSD land use data per region and metric to the projected area from GCAM).
4. Constraints are processed, if provided, and prepared for integration.
5. The convolution filter that will be used to calculate kernel density is prepared. This is applied during each time step.
6. Initial and time-step-specific arrays are prepared.
7. An initial pass to intensify existing land is conducted.
8. An expansion pass is conducted to apply land allocation projections where the kernel density probabilities meet the user-specified selection threshold.
9. A final intensification pass is conducted to allocate any land projections not yet met.
10. Output products are created.
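The harmonization step in the workflow above can be sketched in a few lines. The code assumes that observed and projected areas have already been summed per (region, metric) pair; the function names, the handling of a zero projected area, and the example numbers are assumptions made for illustration rather than Demeter's implementation.

```python
def harmonization_coefficients(observed_area, projected_area):
    """Return observed / projected area for each (region, metric) key."""
    coefficients = {}
    for key, projected in projected_area.items():
        observed = observed_area.get(key, 0.0)
        # Assumption: leave the projection unchanged where it is zero.
        coefficients[key] = observed / projected if projected else 1.0
    return coefficients


def harmonize(projected_by_class, coefficient):
    """Scale every projected land class area by the correction factor."""
    return {name: area * coefficient for name, area in projected_by_class.items()}


# Hypothetical totals for one region and basin.
observed = {("USA", 10): 90.0}
projected = {("USA", 10): 100.0}
coefficients = harmonization_coefficients(observed, projected)

adjusted = harmonize({"crops": 60.0, "forest": 40.0}, coefficients[("USA", 10)])
print(coefficients, adjusted)
```

After scaling, the projected classes sum to the area that the OSD reports for the same region and metric.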

Figure 1. Demeter workflow diagram.
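The kernel density filter and selection threshold used in the expansion pass can be illustrated with a small pure-Python sketch. It assumes a circular kernel whose radius is kernel_distance in grid cells, a density equal to the mean land class fraction inside the kernel (cells outside the grid count as zero), and scaling so that the densest cell equals 1.0. These choices and the function names are illustrative assumptions; Demeter's own filter may differ.

```python
def circular_kernel(radius):
    """Offsets (dy, dx) of all cells within `radius` grid cells of the centre."""
    return [
        (dy, dx)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
        if dy * dy + dx * dx <= radius * radius
    ]


def kernel_density(grid, radius):
    """Mean land class fraction within the kernel, scaled so the maximum is 1.0."""
    rows, cols = len(grid), len(grid[0])
    offsets = circular_kernel(radius)
    density = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0.0
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < rows and 0 <= nx < cols:
                    total += grid[ny][nx]
            density[y][x] = total / len(offsets)
    peak = max(max(row) for row in density)
    if peak > 0:
        density = [[value / peak for value in row] for row in density]
    return density


def select_cells(density, selection_threshold):
    """Grid cells whose scaled density meets the selection threshold."""
    return [
        (y, x)
        for y, row in enumerate(density)
        for x, value in enumerate(row)
        if value >= selection_threshold
    ]


# Hypothetical 3 x 3 grid of cropland fractions with a block in the top-left corner.
cropland = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
density = kernel_density(cropland, radius=1)
candidates = select_cells(density, selection_threshold=0.5)
print(candidates)
```

With a threshold of 0.0 every cell qualifies, and with 1.0 only the cells sharing the highest density do, which mirrors the description of selection_threshold in Table 3.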

Execution

Demeter has two main model-level functions available to the user: execute() and ensemble(). The execute function runs Demeter based on the parameters defined in the configuration file and the input files as set up by the user. This is the most common use of Demeter.

The ensemble function was built to allow the user to test Demeter using random configurations of the transition priorities, treatment order, intensification ratio, selection threshold, and kernel distance. The ensemble section of the configuration file, as seen in the example directory, lets the user define the number of permutations to evaluate as well as a limits file that sets the limits and the interval at which intensification ratio, selection threshold, and kernel distance will be evaluated. The initial state of the configuration file and input files is used to create a template from which every unique combination of the aforementioned parameters can be generated. A random sample of these unique configurations is then chosen based on how many permutations the user wishes to evaluate. Each Demeter run is then executed in parallel based on the number of jobs the user has set.
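The sampling idea behind the ensemble function can be sketched with the standard library. The code below enumerates every combination of three numeric parameters from (low, high, interval) limits and draws a random sample of unique combinations. The layout of the limits, the function names, and the numbers are assumptions for illustration; the actual limits file format and Demeter's ensemble code may differ, and the transition priority and treatment order permutations are left out for brevity.

```python
import itertools
import random


def stepped_values(low, high, step):
    """Inclusive list of values from low to high, built from whole steps."""
    count = int(round((high - low) / step))
    return [round(low + i * step, 10) for i in range(count + 1)]


def sample_configurations(limits, permutations, seed=None):
    """Pick `permutations` unique parameter combinations from the limits."""
    names = sorted(limits)
    grids = [stepped_values(*limits[name]) for name in names]
    combos = [dict(zip(names, values)) for values in itertools.product(*grids)]
    rng = random.Random(seed)
    # random.sample draws without replacement, so no combination repeats.
    return rng.sample(combos, min(permutations, len(combos)))


# Hypothetical limits: (low, high, interval) for each parameter.
limits = {
    "intensification_ratio": (0.0, 1.0, 0.5),
    "selection_threshold": (0.2, 0.8, 0.3),
    "kernel_distance": (10, 20, 10),
}
chosen = sample_configurations(limits, permutations=5, seed=1)
for configuration in chosen:
    print(configuration)
```

Each sampled dictionary would then be merged into a copy of the template configuration and run as its own job.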

Installation

Install Demeter as follows:

1. This repository uses the Git Large File Storage (LFS) extension (see https://git-lfs.github.com/ for details). Please run the following command before cloning if you do not already have Git LFS installed: git lfs install
2. Clone Demeter into your desired location: git clone https://github.com/IMMM-SFA/demeter.git
3. Install the Python package setuptools (https://pypi.python.org/pypi/setuptools) if the package is not already on your machine.
4. From the directory you cloned Demeter into, where the setup.py file exists, run python setup.py install, which will install Demeter as a Python package on your machine and install all of the required dependencies.

Run Preparation

Prepare Demeter for a run as follows:

1. Set up your configuration file (.ini). Examples are located in the "example" directory. Be sure to change the root directory to the directory that holds your data (use the 'demeter/example' directory as an example).
2. If running Demeter from an IDE: Be sure to include the path to your config file. See the "demeter/example/example.py" script as a reference.
3. If running Demeter from terminal: Run model.py found in demeter/demeter/model.py, passing the full path to the configuration file (using forward slashes and containing no spaces) as the first argument and either “standard” or “ensemble” as the second argument (e.g., python model.py /users//github/demeter/example/config.ini standard).
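The terminal invocation described above can also be assembled from Python, which makes it easy to check the path and mode rules first. This is a minimal sketch: the helper name is hypothetical, the paths in the example are placeholders, and the command is only built here, not executed.

```python
import sys


def build_command(model_script, config_path, mode="standard"):
    """Build the argument list for running model.py with a config file and mode."""
    if mode not in ("standard", "ensemble"):
        raise ValueError("mode must be 'standard' or 'ensemble'")
    if " " in config_path:
        raise ValueError("config path must not contain spaces")
    # Normalise Windows-style separators to forward slashes.
    config_path = config_path.replace("\\", "/")
    return [sys.executable, model_script, config_path, mode]


# Placeholder paths used only to show the resulting command.
command = build_command("demeter/demeter/model.py", "C:\\data\\demeter\\config.ini")
print(command[1:])
```

The resulting list can be passed to subprocess.run(command, check=True) to launch the run.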

Both the output directory and the log file generated during runtime are named according to the run scenario and the timestamp of when the run began. A suffix with the permutation number is also added when using the ensemble function in Demeter. All other output directories are named according to what the user specifies in the configuration file.

Quality control

All possible combinations of the configuration file and input files have been tested. Strict requirements for input files are documented. Configuration options have set limits, values, and types and are validated during runtime. An example setup for Demeter is included in the package. The example has a configuration file and an input directory containing all other necessary files. To set up the provided example, the user should simply change the file paths in the configuration file and the example.py file to point to their local directories.

Demeter outputs the following diagnostic files that help the user to evaluate the outputs based on how they were processed during runtime:

- Harmonization coefficient NumPy array
  - This file contains the correction factor that was applied to each grid cell of the projected land allocation data to ensure the projected area is the same as the area actually available as represented by the OSD for the target base year.
- Files for each intensification and expansion pass detailing the land change per year per grid cell from one land class to its transition class by region and metric
- A detailed log file
- A suite of tabular, spatial, and mapped data
