
Starter kit for the Bayes Hack 2016 interior prompt http://bayeshack.org/interior.html

MIT License

Bayes Hack Interior Prompt

How can data make access to the outdoors more equitable?


Research suggests that time in nature is an important component of physical, social, and mental wellbeing, but also that various barriers to access disproportionately affect urban communities [1] [2]. By making open spaces, parks, and public lands easier to reach and explore, more citizens can share in the benefits that these lands afford society.

Transform droves of data about America's public lands into tools to drive new planning and policy or integrate models into applications that break down the inequities facing socially and economically underserved communities.

Increasingly, computational tools are becoming a part of how we understand and approach our great outdoors. Through a better understanding of the interactions between individuals and their natural environment, we can inform novel data-driven, cost-effective, and citizen-centric approaches.

In this repo

  • analysis/ - Jupyter notebook files (which you can view right here on GitHub) loading the data and exploring a few things.

Getting started

Install all the Python dependencies needed for the notebook by executing pip install -r requirements.txt.

Available data sources

This section lists the datasets we identified as interesting when we took a quick look at the available sources. It is not a comprehensive list, however, and we encourage you to dig deeper into the portal.

Recreation Information Database

This is an HTTP API with extensive documentation that can be found here. Register for an API key here and check out my example notebook.

In case you don't feel like using their API, you can download the whole dataset from here.
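
As a rough illustration of what a call to the API might look like, here is a minimal sketch using only the Python standard library. The base URL, the `facilities` endpoint, the `apikey` header, and the `limit` and `state` query parameters are assumptions that should be checked against the RIDB documentation; `build_request` and `fetch_json` are hypothetical helper names, not part of this repo.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed base URL; verify it against the RIDB documentation.
BASE_URL = "https://ridb.recreation.gov/api/v1"


def build_request(endpoint, api_key, **params):
    """Build (but do not send) a GET request for an RIDB endpoint.

    The key is passed in an `apikey` header, which is an assumption
    about how the API authenticates requests.
    """
    url = f"{BASE_URL}/{endpoint.strip('/')}"
    query = urlencode(sorted(params.items()))
    if query:
        url += "?" + query
    headers = {"apikey": api_key, "Accept": "application/json"}
    return Request(url, headers=headers)


def fetch_json(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Replace YOUR_API_KEY with the key you registered for.
    req = build_request("facilities", "YOUR_API_KEY", limit=5, state="CA")
    print(req.full_url)
    # data = fetch_json(req)  # uncomment once you have a valid key
```

Separating request construction from sending makes it easy to inspect the URL before spending any of your API quota.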

Data Dictionary for RIDB and NCSU




There are three NCSU datasets available at https://cnr.ncsu.edu/geospatial/bayes-hack/ (https://velocity.ncsu.edu/dl/1uRTn6g/276112).

The NRRS_PPL_reservationdata_AllYears.csv file is the cleaned RIDB data compiled by NCSU. There are two additional datasets that are aggregates of this information. This section identifies the differences between the two datasets. Contact Stacy if you use Matlab and want the .mat file for this dataset.

This dataset includes most of the original RIDB fields along with fields newly calculated by NCSU. The additional fields are:

  1. Great circle distance - distance in km between the visitor's ZIP code centroid and the destination facility's x/y coordinates.
  2. Duration - length of stay, i.e. the difference between arrival and departure in days.
  3. Lead time - difference between the reservation order date and the vacation start date.
  4. Person nights - duration x number of people in the party. For example, two people staying for 5 nights give a total of 10 person nights.
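
To make the definitions above concrete, here is a minimal sketch of how the four derived fields could be computed. It is not NCSU's actual code: the haversine formula, the mean Earth radius of 6371 km, lead time measured in days, and the function and key names are all assumptions chosen for illustration.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * asin(sqrt(min(1.0, a)))


def derived_fields(order_date, start_date, end_date, party_size):
    """Compute duration, lead time (both in days), and person nights."""
    duration = (end_date - start_date).days
    lead_time = (start_date - order_date).days  # days is an assumed unit
    return {
        "duration": duration,
        "lead_time": lead_time,
        "person_nights": duration * party_size,
    }


if __name__ == "__main__":
    # Two people staying five nights, booked two months ahead.
    fields = derived_fields(
        order_date=date(2016, 1, 1),
        start_date=date(2016, 3, 1),
        end_date=date(2016, 3, 6),
        party_size=2,
    )
    print(fields)
    # Hypothetical coordinates: ZIP centroid vs. facility location.
    print(round(great_circle_km(35.78, -78.64, 35.61, -83.43), 1))
```

The usage example mirrors the person nights example from the list: a party of two staying 5 nights yields 10 person nights.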

The Integrated Resource Management Applications Portal (IRMA)

The portal offers a large number of datasets in different formats. Notable categories are:

The portal starts with a graphical frontend for browsing datasets for different parks (it is probably also possible to access the data programmatically). There is pretty cool data for each park, for example:

Species lists with the occurrence and status of species in more than 300 NPS national parks (example).

The U.S. Fish & Wildlife Service

Their website hosts comprehensive geospatial data describing parks and ecosystems. The highlights are:

Bureau of Land Management GeoCommunicator

The Bureau of Land Management GeoCommunicator hosts overviews of the administrative status of BLM areas and other federal lands.

  • Oil and Gas lease sale parcels
  • BLM Administrative Areas
  • Federal Lands
  • Federal National Monuments, Conservation Areas, and Wilderness Areas
  • BLM Range Allotments and Pastures
  • BLM Wild Horse and Burro Herd Areas and Herd Management Areas
  • BLM Solar Energy Study Areas
  • Public Land Survey System - PLSS (township, range, section, lots, surveys) - Downloadable
  • Rights-of-Way (ROW)