/paper-yadkin-swat-study

Data repository for paper titled: "Assessment of Hydrologic Vulnerability to Urbanization and Climate Change in a Rapidly Changing Watershed in the Southeast U.S."

paper-yadkin-swat-study

This README.md file was generated on 20180703 by Sheila Saia.

This GitHub repository was created to provide access to collected data, analysis code, and other information associated with the paper by Suttles et al. titled 'Assessment of Hydrologic Vulnerability to Urbanization and Climate Change in a Rapidly Changing Watershed in the Southeast U.S.' in Science of the Total Environment (https://doi.org/10.1016/j.scitotenv.2018.06.287).

General Information

Title of Dataset
"paper-yadkin-swat-study"

Dataset & Repo Contact Information
Name: Sheila Saia
Institution: United States Forest Service, Center for Integrated Forest Science
Address: NC State University, Partners 2, 840 Main Campus Drive, Raleigh, NC 27606
Email: ssaia at ncsu dot edu

Study Contact Information
Name: Kelly Suttles
Institution: United States Forest Service, Center for Integrated Forest Science
Address: NC State University, Partners 2, 840 Main Campus Drive, Raleigh, NC 27606
Email: ksuttle at ncsu dot edu

Date of data collection
SWAT model outputs were generated in 2016. United States Forest Service land use predictions were generated in 2015. All other data originated from publicly available sites as described in the associated paper.

Geographic location of data collection
All data is associated with the Upper Yadkin-Pee Dee Watershed in Western North Carolina, USA.

Information about funding sources that supported the collection of the data
Kelly Suttles and Sheila Saia were supported by funding through the Oak Ridge Institute for Science and Education (ORISE).

Sharing & Access Information

Licenses/restrictions placed on the data
Please use and distribute according to CC-BY v4.0. For a human-readable version of this license visit https://creativecommons.org/licenses/by/4.0/.

Links to publications that cite or use the data
SWAT simulated streamflow data was also used by Saia et al. (2019).

Links to other publicly accessible locations of the data
This dataset and associated R code are available at https://github.com/sheilasaia/paper-yadkin-swat-study and via Zenodo (https://zenodo.org/record/1312628). The associated publication is available via Science of the Total Environment (https://doi.org/10.1016/j.scitotenv.2018.06.287).

Links/relationships to ancillary data sets
All links to publicly available data are described here and in Suttles et al. (2018). With respect to simulated data and data analysis scripts, these data are also linked to the study dataset explained in Saia et al. (2019).

Data derived from another source
All links to publicly available data are described here and in Suttles et al. (2018). With respect to simulated data and data analysis scripts, this is the only source of these data.

Additional related data collected that was not included in the current data package
This directory does not include publicly available soils, digital elevation data, and reservoir data required to run SWAT. For more information on these data see Suttles et al. (2018) or contact Kelly Suttles directly.

Are there multiple versions of the dataset?
All publicly available data are described here and in Suttles et al. (2018). With respect to simulated data and data analysis scripts, there are no other versions available online.

Recommended citation for the data
Suttles, K.M., N.K. Singh, J.M. Vose, K.L. Martin, R.E. Emanuel, J.W. Coulston, S.M. Saia, and M.T. Crump. 2018. Assessment of Hydrologic Vulnerability to Urbanization and Climate Change in a Rapidly Changing Watershed in the Southeast U.S. Science of the Total Environment. 645:806-816.

Paper Availability
The paper is available online via Science of the Total Environment and Treesearch. If you do not have a subscription to the journal or are having trouble accessing it, please contact Sheila Saia or Kelly Suttles directly for a copy of the pre-print.

Data & File Overview

This repository is organized into three main directories: observed_data, simulated_data, and analysis_scripts.

1. observed_data directory

The observed_data directory contains all (historic) observed weather, land cover, and streamflow data used in this study that required pre-processing or were needed for analysis. It also includes digital elevation model (DEM) data, soil data, and reservoir data required to run SWAT. These data were all collected from public databases as explained by Suttles et al. (2018) but are included for convenience. The observed_data directory includes six subdirectories: weather, landcover_1992, streamflow, dem_mosaic, reservoirs, and soils.

1.1 weather subdirectory

Directory name: weather
Short description: This subdirectory contains the observed weather data text files required to run SWAT for the 1979-2008 period. It also includes associated .shp files (within the weather_stations_shp directory) to display each weather station spatially. [include details on sim_baseline_pcp.xlsx and swat_precip_summary_outlet_1982-2002.xlsx files] See README file inside this subdirectory for further details on its contents.

File List
Filename: *.txt files
Short description: These text files include observed daily precipitation, temperature, solar radiation, relative humidity, and wind speed for the study watershed. These data are all formatted based on SWAT requirements. See README file inside this subdirectory for further details on the naming scheme and contents of each text file.

Directory name: weather_stations_shp
Short description: This directory includes files associated with the spatial distribution of the 18 weather stations within the watershed that were used in SWAT.

Relationship Between Files
The text files listed above are all required to run SWAT. Please see README inside this subdirectory for more details on these files.

Raw Data
This subdirectory does not contain any raw data because everything in it was automatically formatted for use with SWAT when it was downloaded from publicly available sites.

1.2 landcover_1992 subdirectory

Directory name: landcover_1992
Short description: This subdirectory contains the observed landcover data text files required to run SWAT for the 1979-2008 period. It includes the nlcd_1992_raw and nlcd_1992_projected directories which contain all the files needed to open associated raw and projected 1992 land cover data in ArcGIS, respectively. See README file inside this subdirectory for further details on the source of its contents.

File List
Filename: LC9276457208.tif files
Short description: Within the nlcd_1992_raw directory, this file represents the raw (unprojected) 1992 National Land Cover Dataset (NLCD) data as described in the README within this folder. The projected NLCD data is required to run SWAT.

Filename: [projected]
Short description: Within the nlcd_1992_projected directory, this file represents the projected 1992 National Land Cover Dataset (NLCD) data described in the README within this folder.

Relationship Between Files
The files in the nlcd_1992_projected subdirectory are the projected version of the files in the nlcd_1992_raw folder.

Raw Data
Raw 1992 NLCD data can be found in the nlcd_1992_raw subdirectory.

1.3 streamflow subdirectory

Directory name: streamflow
Short description: This subdirectory contains the observed daily streamflow data for three USGS gages used in this study: (1) Yadkin River in Enon, NC (USGS gage #02115360), (2) Yadkin River at Yadkin College (USGS gage #02116500), and (3) Pee Dee River (USGS gage #02129000). Raw data for these three USGS gages are stored in the raw directory, and a column converting these data from cubic feet per second (cfs) to cubic meters per second (cms) has been added to the files in the cms_conversions directory. See README file inside this subdirectory for further details on the source of its contents.

raw Directory File List
Filename: USGS_02115360_yadkin_enon_raw.xlsx
Short description: Daily streamflow in cfs units for the USGS Yadkin River at Enon, NC station. See README within the main streamflow directory for information on the source of these data and the header of the file for metadata.

Filename: USGS_02116500_yadkin_college_raw.xlsx
Short description: Daily streamflow in cfs units for the USGS Yadkin River at Yadkin College station. See README within the main streamflow directory for information on the source of these data and the header of the file for metadata.

Filename: USGS_02129000_pee_dee_raw.xlsx
Short description: Daily streamflow in cfs units for the USGS Pee Dee River station. See README within the main streamflow directory for information on the source of these data and the header of the file for metadata.

cms_conversions Directory File List
Filename: USGS_02115360_yadkin_enon.xlsx
Short description: Originates from the USGS_02115360_yadkin_enon_raw.xlsx file but has an extra column for flow in cms units.

Filename: USGS_02116500_yadkin_college.xlsx
Short description: Originates from the USGS_02116500_yadkin_college_raw.xlsx file but has an extra column for flow in cms units.

Filename: USGS_02129000_pee_dee.xlsx
Short description: Originates from the USGS_02129000_pee_dee_raw.xlsx file but has an extra column for flow in cms units.

Relationship Between Files
The files in the cms_conversions subdirectory originate from files with a similar name in the raw subdirectory.

Raw Data
All raw streamflow data can be found in the raw subdirectory.
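
The cfs-to-cms column described above is a plain unit conversion. The sketch below shows that arithmetic in Python for illustration only; it is not the method used to build the .xlsx files (the repository's own scripts are in R and MatLab), and the flow values are hypothetical.

```python
# 1 ft = 0.3048 m exactly, so 1 ft^3 = 0.3048**3 m^3 (about 0.0283168 m^3).
CFS_TO_CMS = 0.3048 ** 3


def cfs_to_cms(flow_cfs):
    """Convert discharge from cubic feet per second to cubic meters per second."""
    return flow_cfs * CFS_TO_CMS


# Hypothetical daily flows in cfs (not taken from the USGS files).
example_cfs = [100.0, 2500.0, 0.0]
example_cms = [round(cfs_to_cms(q), 3) for q in example_cfs]
print(example_cms)
```

A quick check of this kind can be used to confirm that the cms column in a converted file agrees with its cfs column.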

1.4 dem_mosaic subdirectory

Directory name: dem_mosaic
Short description: This directory includes a digital elevation model (DEM) file called dem_mosaic.tif that is required to run SWAT. See README file inside this subdirectory for further details on the source of its contents.

Relationship Between Files
There is only one geospatial data file in this folder.

Raw Data
There is no raw data included in this subdirectory.

1.5 reservoirs subdirectory

Directory name: reservoirs
Short description: This directory includes the geospatial data files associated with the reservoirs.shp file and the reservoirs_table.xlsx file. See README file inside this subdirectory for further details on the source of its contents.

Relationship Between Files
Additional attributes required to add the reservoirs in the reservoirs.shp file to SWAT are included in the reservoirs_table.xlsx file.

Raw Data
There is no raw data included in this subdirectory.

1.6 soils subdirectory

Directory name: soils
Short description: This directory includes the geospatial data files associated with the soil.shp file. See README file inside this subdirectory for further details on the source of its contents.

Relationship Between Files
There is only one geospatial data file in this folder.

Raw Data
There is no raw data included in this subdirectory.

2. simulated_data directory

The simulated_data directory contains simulated climate data (climate subdirectory), simulated land use from 2060 (landuse_2060 subdirectory), and SWAT outputs (swat_outputs subdirectory).

2.1 climate subdirectory

Directory name: climate
Short description:

backcast_climate_1982-2002 Directory File List
Directory name: backcast_climate_1982-2002
Short description: This subdirectory includes the backcasted climate files required to run SWAT for CSIRO, Hadley, and MIROC climate model simulations. See the similarly named directory for each corresponding climate model. See Suttles et al. (2018) for the reasoning on why backcasted data are required. See README file inside this subdirectory for further details on the source of its contents.

Filename: *.txt files within each of the csiro, hadley, and miroc directories
Short description: These text files include backcasted daily precipitation, temperature, solar radiation, relative humidity, and wind speed for the study watershed. These data are all formatted based on SWAT requirements for 6 climate stations (see climate_stations_points.shp in the climate_stations_points directory). The format of these text files is the same as is described in the README file inside the observed_data > weather directory. These files were formatted for analysis in SWAT using the MatLab scripts in the pre-processing_scripts folder.

climate_2050-2070 Directory File List
Directory name: climate_2050-2070
Short description: This subdirectory includes the future climate files required to run SWAT for the CSIRO 4.5, CSIRO 8.5, Hadley 4.5, and MIROC 8.5 climate model simulations. See similarly named directory for each corresponding climate model. See README file inside this subdirectory for further details on the source of its contents.

Filename: *.txt files within each of the csiro_4.5, csiro_8.5, hadley_4.5, and miroc_8.5 directories
Short description: These text files include simulated future daily precipitation, temperature, solar radiation, relative humidity, and wind speed for the study watershed. These data are all formatted based on SWAT requirements for 6 climate stations (see climate_stations_points.shp in the climate_stations_points directory). The format of these text files is the same as is described in the README file inside the observed_data > weather directory. These files were formatted for analysis in SWAT using the MatLab scripts in the pre-processing_scripts folder.

climate_stations_30mbuffer Directory File List
Directory name: climate_stations_30mbuffer
Short description: This directory includes the files associated with the climate_stations_30mbuffer.shp file. The climate_stations_30mbuffer.shp file includes six polygons, each with a 30 m radius, that are required for selecting the climate stations used to download backcast and future climate data from https://cida.usgs.gov/gdp/. Each station is named based on its proximity to a point in observed_data > weather > weather_stations_shp > weather_stations.shp.

climate_stations_points Directory File List
Directory name: climate_stations_points
Short description: This directory includes the files associated with the climate_stations_points.shp file. The climate_stations_points.shp includes six points associated with climate_stations_30mbuffer.shp. These points were not required for downloading data but are included for convenience.

pre-proccessing_scripts Directory File List
Directory name: pre-proccessing_scripts
Short description: This directory includes MatLab scripts which are required to reformat data downloaded from https://cida.usgs.gov/gdp/ for SWAT simulations.

Filename: pcpscript.m
Short description: This MatLab script reformats daily precipitation data from https://cida.usgs.gov/gdp/ for SWAT (i.e., for pXXX-XXXX.txt files).

Filename: rhscript.m
Short description: This MatLab script reformats daily relative humidity data from https://cida.usgs.gov/gdp/ for SWAT (i.e., for rXXX-XXXX.txt files).

Filename: srscript.m
Short description: This MatLab script reformats daily solar radiation data from https://cida.usgs.gov/gdp/ for SWAT (i.e., for sXXX-XXXX.txt files).

Filename: tmpscript.m
Short description: This MatLab script reformats daily temperature data from https://cida.usgs.gov/gdp/ for SWAT (i.e., for tXXX-XXXX.txt files).

Filename: wsscript.m
Short description: This MatLab script reformats daily wind speed data from https://cida.usgs.gov/gdp/ for SWAT (i.e., for wXXX-XXXX.txt files).

Relationship Between Files
The climate_stations_30mbuffer.shp is needed to download backcast and future climate from https://cida.usgs.gov/gdp/ for SWAT simulations. Reformatting scripts for data downloaded from https://cida.usgs.gov/gdp/ are included in the pre-proccessing_scripts directory.

Raw Data
There is no raw data included in this subdirectory.
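
The script descriptions above imply a one-letter prefix for each SWAT climate file (pXXX-XXXX.txt, rXXX-XXXX.txt, and so on). A minimal Python sketch of that mapping follows, for illustration only. The helper name and the digits in the example file name are hypothetical, and the meaning of the XXX-XXXX part is not stated in this README.

```python
# Prefix letters are taken from the script descriptions in this README
# (pcpscript.m, rhscript.m, srscript.m, tmpscript.m, wsscript.m).
PREFIX_TO_VARIABLE = {
    "p": "precipitation",
    "r": "relative humidity",
    "s": "solar radiation",
    "t": "temperature",
    "w": "wind speed",
}


def variable_for(filename):
    """Return the climate variable implied by the first letter of a file name."""
    prefix = filename[:1].lower()
    if prefix not in PREFIX_TO_VARIABLE:
        raise ValueError(f"unrecognized prefix in {filename!r}")
    return PREFIX_TO_VARIABLE[prefix]


# The digits below are placeholders, not a real file from this repository.
print(variable_for("p123-4567.txt"))
```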

2.2 landuse_2060 subdirectory

Directory name: landuse_2060
Short description: This directory includes four subdirectories with .img files containing future (2060) land use data for the four scenarios (A, B, C, D) discussed in Suttles et al. (2018). For additional details on the source of these files see the README file in this directory.

Filename: luc.txt
Short description: This is the land use code text file required to run future land use data in SWAT. See further description of this file in the README file in this directory.

lu_amc_2060 Directory File List
Directory name: lu_amc_2060
Short description: This directory includes the future lu_amc_2060.img land use file that was used for scenario A in Suttles et al. (2018).

lu_bmc_2060 Directory File List
Directory name: lu_bmc_2060
Short description: This directory includes the future lu_bmc_2060.img land use file that was used for scenario B in Suttles et al. (2018).

lu_cmc_2060 Directory File List
Directory name: lu_cmc_2060
Short description: This directory includes the future lu_cmc_2060.img land use file that was used for scenario C in Suttles et al. (2018).

lu_dmc_2060 Directory File List
Directory name: lu_dmc_2060
Short description: This directory includes the future lu_dmc_2060.img land use file that was used for scenario D in Suttles et al. (2018).

Relationship Between Files
These future land use files are described in further detail in the README file in this directory.

Raw Data
There is no raw data included in this subdirectory.

2.3 swat_outputs subdirectory

Directory name: swat_outputs
Short description: This directory includes many subdirectories that all have two SWAT output files (output.rch and output.sub) that are required for data analyses of each scenario described in Suttles et al. (2018). For further details on these scenarios see Suttles et al. (2018) and the README file in this directory. The README file also describes how to interpret columns of each SWAT output file.

Relationship Between Files
For further details see the README file in this directory.

Raw Data
There is no raw data included in this subdirectory.

3. analysis_scripts directory

This directory contains three subdirectories: matlab_scripts, r_scripts, and summary_data_and_performance_metrics. Generally, these all have to do with final data analysis of SWAT outputs in the simulated_data > swat_outputs directory.

3.1 matlab_scripts subdirectory

Directory name: matlab_scripts
Short description: This subdirectory contains MatLab scripts that were used to carry out flow-duration curve analysis, KS-tests, coefficient of variation (CV) calculations, and runoff ratio calculations as described in Suttles et al. (2018). The directories included are: flow_duration_curve_calculations, ks-test_and_cv, and runoff_ratio_calculations.

flow_duration_curve_calculations Directory File List
Directory name: flow_duration_curve_calculations
Short description: Includes MatLab scripts to carry out flow-duration curve analysis for subbasins 8, 10, 18, and 28. See the README file in this directory for more information.

ks-test_and_cv Directory File List
Directory name: ks-test_and_cv
Short description: Includes MatLab scripts to carry out KS-test and CV analysis.

runoff_ratio_calculations Directory File List
Directory name: runoff_ratio_calculations
Short description: Includes MatLab scripts to carry out runoff ratio analysis for subbasins 8, 10, 18, and 28. See the README file in this directory for more information.

Relationship Between Files
For further details see the README files in this directory as well as Suttles et al. (2018).

Raw Data
There is no raw data included in this subdirectory.
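
For readers unfamiliar with the quantities named in this subdirectory, the Python sketch below shows generic textbook forms of a flow-duration curve, a coefficient of variation, and a runoff ratio. It does not reproduce the MatLab scripts. The Weibull plotting position, the use of the population standard deviation, and all input values are assumptions made for illustration.

```python
from statistics import mean, pstdev


def flow_duration_curve(flows):
    """Return (exceedance_probability, flow) pairs, highest flow first.

    Uses the Weibull plotting position rank / (n + 1). This choice is an
    assumption and may differ from the MatLab scripts.
    """
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    return [(rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


def coefficient_of_variation(flows):
    """Population standard deviation divided by the mean (an assumed form)."""
    return pstdev(flows) / mean(flows)


def runoff_ratio(runoff_mm, precip_mm):
    """Total runoff depth divided by total precipitation depth."""
    return sum(runoff_mm) / sum(precip_mm)


# Hypothetical daily flows and depths, not taken from the SWAT outputs.
flows = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
fdc = flow_duration_curve(flows)
cv = coefficient_of_variation(flows)
ratio = runoff_ratio([10.0, 20.0], [50.0, 100.0])
print(fdc[0], cv, ratio)
```

See Suttles et al. (2018) and the README files in each directory for the definitions actually used in the study.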

3.2 r_scripts subdirectory

Directory name: r_scripts
Short description: This directory includes four subdirectories: figure_scripts, high_low_frequency_calcs, script_outputs, and sevenday_calcs. The figure_scripts directory includes all R scripts associated with figures from Suttles et al. (2018), the high_low_frequency_calcs directory includes all R scripts associated with high and low flow frequency analysis in Suttles et al. (2018), the sevenday_calcs directory includes R scripts to calculate seven day max and min flow statistics, and the script_outputs directory includes some outputs and associated figures that were supplied in response to reviewer comments on Suttles et al. (2018).
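
As a rough illustration of a seven day max and min flow statistic, the Python sketch below takes the largest and smallest 7-day moving average of a daily flow series. This reading of the statistic is an assumption, the function names are hypothetical, and the R scripts in sevenday_calcs remain the authoritative implementation.

```python
def seven_day_means(daily_flows, window=7):
    """Moving average of daily flow over `window` consecutive days."""
    if len(daily_flows) < window:
        raise ValueError("need at least `window` daily values")
    return [
        sum(daily_flows[i:i + window]) / window
        for i in range(len(daily_flows) - window + 1)
    ]


def seven_day_min_max(daily_flows):
    """Smallest and largest 7-day moving average in the series."""
    means = seven_day_means(daily_flows)
    return min(means), max(means)


# Hypothetical daily flows: 1, 2, ..., 10 (not taken from the SWAT outputs).
flows = [float(q) for q in range(1, 11)]
low, high = seven_day_min_max(flows)
print(low, high)
```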

3.3 summary_data_and_performance_metrics subdirectory

Directory name: summary_data_and_performance_metrics
Short description: This subdirectory contains three Excel files that were used to carry out data analysis presented in Suttles et al. (2018).

Filename: log_streamflow_performance_metrics.xlsx
Short description: This Excel file was used to calculate SWAT model performance metrics as described in Suttles et al. (2018).

Filename: streamflow_performance_metrics.xlsx
Short description: This Excel file was used to calculate SWAT model performance metrics as described in Suttles et al. (2018) for each of the three USGS gages.

Filename: obs_sim_streamflow_summary_all_gages.xlsx
Short description: This Excel file was used to calculate SWAT model performance metrics as described in Suttles et al. (2018) for each of the three USGS gages.

Relationship Between Files
These summary files originate from the calibration and validation SWAT outputs in simulated_data > swat_outputs for each of the three USGS stream gages.

Raw Data
There is no raw data included in this subdirectory.
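
This README does not name the performance metrics held in the Excel files. As one common example only, the Python sketch below computes the Nash-Sutcliffe efficiency on raw and log-transformed flows. Treating this as one of the study's metrics is an assumption (see Suttles et al. (2018) for the metrics actually reported), and the observed and simulated values are hypothetical.

```python
import math


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    obs_mean = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - residual / variance


def log_nash_sutcliffe(observed, simulated):
    """Same metric on natural-log flows; requires strictly positive values."""
    return nash_sutcliffe(
        [math.log(o) for o in observed],
        [math.log(s) for s in simulated],
    )


# Hypothetical flows in cms, not taken from the calibration outputs.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.0, 2.0, 3.0, 5.0]
print(nash_sutcliffe(obs, sim))
```

A value of 1 indicates a perfect match between simulated and observed flows. The log form gives more weight to low flows.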

Methodological Information

Description of methods used for collection/generation of data:
See the associated Science of the Total Environment journal article for a full description of the methods used to collect and analyze these data.

Methods for processing the data:
See the R scripts in this repository as well as the associated Science of the Total Environment journal article for a full description of the methods used to collect and analyze these data.

Instrument- or software-specific information needed to interpret the data:
R (open-source, https://www.r-project.org/) is needed to run .R files, Microsoft Excel (license required, https://products.office.com/en-us/excel) is needed to open .xlsx files, and Matlab (license required, https://www.mathworks.com/products/matlab.html) is needed to run .m files. Land use and land cover data can be opened using ArcGIS (license required, desktop.arcgis.com/en/) or QGIS (open-source, https://qgis.org/en/site/).

Standards and calibration information, if appropriate:
Information on calibrations is included in the 'Raw Data' section of this README file.

Environmental/experimental conditions:
See the associated Science of the Total Environment journal article for a full description of observed and modeled data used in this study.

Describe any quality-assurance procedures performed on the data:
SWAT simulations were calibrated and validated. This is described in further detail in Suttles et al. (2018). When possible, data analysis was automated in MatLab and R to ensure consistency.

People involved with sample collection, processing, analysis and/or submission:
See the associated Science of the Total Environment journal article for a full description of author contributions and acknowledgments.

Data-Specific Information For: swat_precip_summary_outlet_1982-2002.xlsx

Variable list
'RCH' - Reach number. Reach 28 is the outlet of the watershed.
'MO' - Month
'DA' - Day
'YR' - Year
'PRECIPmm' - Precipitation in mm.

Missing data codes
No missing data codes.
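
To show how the variables listed above fit together, the Python sketch below sums PRECIPmm by month for the outlet reach. It assumes the worksheet has been exported to CSV with exactly these column headers. The function name and the sample rows are hypothetical and are not values from swat_precip_summary_outlet_1982-2002.xlsx.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Made-up rows using the column names from the variable list above.
SAMPLE = """RCH,MO,DA,YR,PRECIPmm
28,1,1,1982,0.0
28,1,2,1982,12.5
28,2,1,1982,3.0
"""


def monthly_precip(rows, reach=28):
    """Sum PRECIPmm by (year, month) for one reach; reach 28 is the outlet."""
    totals = defaultdict(float)
    for row in rows:
        if int(row["RCH"]) != reach:
            continue
        # Building a date validates the YR/MO/DA combination.
        day = date(int(row["YR"]), int(row["MO"]), int(row["DA"]))
        totals[(day.year, day.month)] += float(row["PRECIPmm"])
    return dict(totals)


totals = monthly_precip(csv.DictReader(io.StringIO(SAMPLE)))
print(totals)
```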