This project creates a geographic crosswalk between 2017 U.S. Department of Housing and Urban Development (HUD) Continuum of Care (CoC) boundaries and 2017 U.S. Census Bureau geographies (Census tracts and counties).
If you use any of the files from this project in published work, please cite this GitHub repository as their source.
When describing the methodology used to match counties and CoCs, please also cite the original paper that describes the basic methodology behind the county-CoC geographic crosswalk.
Please bring any errors, questions, or suggestions to the attention of this project's creator, Tom Byrne, at tbyrne@bu.edu.
Below we describe the main outputs of this project, as well as the data inputs and programs used to create them.
There are three main output files from this project:
- tract_coc_match.csv: This is a geographic crosswalk that matches each Census tract to a CoC. There is one row for each Census tract. Note that not all Census tracts match to a CoC.
- county_coc_match.csv: This is a geographic crosswalk that matches counties to CoCs. A single CoC can match to multiple counties, and a single county can match to multiple CoCs. Thus, the file has one row for each unique county-CoC combination. Note that not all counties match to a CoC.
- coc_population.csv: This file includes the total population and total population in poverty for each CoC. These estimates are based on tract-level total population and total population in poverty estimates from the U.S. Census Bureau's American Community Survey 2011-2016 5-Year Estimates.
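To illustrate how the three output files relate to one another, here is a minimal Python sketch. It is not the project's code. The column names (`tract_fips`, `coc_number`, `total_pop`, `pop_in_poverty`), the CoC identifiers, the tract IDs, and the population values are all made up for illustration. The only outside fact it relies on is that the first five digits of an 11-digit Census tract GEOID are the state and county FIPS codes. It assumes the simplest reading of the descriptions above: county-CoC rows are the unique combinations found among matched tracts, and CoC totals are sums over matched tracts.

```python
from collections import defaultdict

# Hypothetical rows shaped like a tract-level crosswalk joined to tract
# population estimates. All names and values here are illustrative only.
rows = [
    {"tract_fips": "99001000100", "coc_number": "XX-500", "total_pop": 1000, "pop_in_poverty": 100},
    {"tract_fips": "99001000200", "coc_number": "XX-500", "total_pop": 2000, "pop_in_poverty": 300},
    {"tract_fips": "99001000300", "coc_number": "XX-501", "total_pop": 1500, "pop_in_poverty": 200},
    {"tract_fips": "99003000100", "coc_number": "XX-501", "total_pop": 500, "pop_in_poverty": 50},
    {"tract_fips": "99005000100", "coc_number": None, "total_pop": 800, "pop_in_poverty": 80},
]


def county_coc_pairs(rows):
    """Return the sorted unique (county FIPS, CoC) combinations among matched tracts."""
    pairs = set()
    for row in rows:
        if row["coc_number"] is not None:  # unmatched tracts contribute no pair
            county_fips = row["tract_fips"][:5]  # state (2 digits) + county (3 digits)
            pairs.add((county_fips, row["coc_number"]))
    return sorted(pairs)


def coc_population(rows):
    """Sum total population and population in poverty over the tracts in each CoC."""
    totals = defaultdict(lambda: [0, 0])
    for row in rows:
        if row["coc_number"] is not None:
            totals[row["coc_number"]][0] += row["total_pop"]
            totals[row["coc_number"]][1] += row["pop_in_poverty"]
    return {coc: (total, poverty) for coc, (total, poverty) in totals.items()}


if __name__ == "__main__":
    for county, coc in county_coc_pairs(rows):
        print(county, coc)
    for coc, (total, poverty) in sorted(coc_population(rows).items()):
        print(coc, total, poverty)
```

In this toy data, county 99001 appears with two CoCs and CoC XX-501 appears with two counties, which is why the county-level file needs one row per combination. County 99005 has no matched tract and so does not appear at all.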
There are also two intermediary output files:
- clipped_tract.shp: This is a version of the U.S. Census Bureau TIGER/Line census tract boundary shapefile that is clipped to the HUD CoC boundary shapefile. The reason for doing this is that the CoC shapefile is clipped to the shoreline, while the tract boundary file is not. As such, our approach for matching tracts to CoCs would incorrectly omit some Census tracts (for example, coastal tracts whose centroids fall offshore) if we did not first clip the tract boundaries.
- tract_population.csv: This file includes Census tract estimates of the total population and total population in poverty from the U.S. Census Bureau's American Community Survey 2011-2016 5-Year Estimates.
The outputs described above are created from the following inputs:
- CoC_GIS_NatlTerrDC_Shapefile_2017.gdb: A shapefile of the 2017 HUD CoC boundaries. This file was obtained from this HUD website. A zipped version of this file is included in the data folder.
- tlgdb_2017_a_us_substategeo.gdb: The 2017 TIGER/Line Census tract shapefile. This file was obtained from this Census Bureau website. This file is too large to store on GitHub but can be obtained at the link.
- 2017_pit.csv: The 2017 HUD Point-in-Time (PIT) count data. This file was obtained from this HUD website.
The following programs create the outputs described above from these inputs:
- 000_clip_tract_shapefile: This program clips the TIGER/Line tract shapefile to the CoC boundary shapefile. It produces the clipped_tract.shp file.
- 100_tract_population: This program uses the tidycensus package to pull Census tract estimates of total population and population in poverty directly from the U.S. Census Bureau's API. It produces the tract_population.csv output file.
- 200_tract_coc_match: This program creates a geographic crosswalk between Census tracts and CoCs. To do this, it overlays tract centroid points (i.e. points representing the geographic center of each Census tract) onto CoC boundaries and matches each Census tract to the CoC into which its centroid falls. It produces the tract_coc_match.csv output file.
- 300_coc_population: This program creates estimates of the total population and total population in poverty in each CoC. It does this based on the tract_coc_match.csv file and produces the coc_population.csv output file.
- 400_county_coc_match: This program creates a geographic crosswalk between counties and CoCs. It does this based on the tract_coc_match.csv file and produces the county_coc_match.csv output file.
- 500_run_all_pograms: This program simply calls each of the above programs in sequence.
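The centroid-matching idea behind 200_tract_coc_match can be sketched in a few lines of plain Python. This is an illustration of the technique only, not the project's program: it uses simple polygons without holes in planar coordinates, an area-weighted centroid, and a ray-casting point-in-polygon test, whereas the actual program works with the shapefiles listed above. The function names, the toy boundaries, and the CoC and tract identifiers are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Ring = List[Point]  # polygon vertices in order; first vertex is not repeated


def polygon_centroid(ring: Ring) -> Point:
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return (cx / (3.0 * area2), cy / (3.0 * area2))


def point_in_polygon(pt: Point, ring: Ring) -> bool:
    """Ray casting: count edge crossings of a horizontal ray going right from pt."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        if (y0 > y) != (y1 > y):  # edge spans the ray's y coordinate
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def match_tracts_to_cocs(
    tracts: Dict[str, Ring], cocs: Dict[str, Ring]
) -> Dict[str, Optional[str]]:
    """Assign each tract to the CoC containing its centroid, or None if no CoC does."""
    matches: Dict[str, Optional[str]] = {}
    for tract_id, ring in tracts.items():
        centroid = polygon_centroid(ring)
        matches[tract_id] = next(
            (coc for coc, boundary in cocs.items() if point_in_polygon(centroid, boundary)),
            None,
        )
    return matches


# Toy data: two adjacent square "CoCs" and three rectangular "tracts".
cocs = {
    "XX-500": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "XX-501": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
tracts = {
    "tract_a": [(1, 1), (3, 1), (3, 3), (1, 3)],      # entirely inside XX-500
    "tract_b": [(8, 2), (14, 2), (14, 4), (8, 4)],    # straddles both; centroid in XX-501
    "tract_c": [(30, 30), (32, 30), (32, 32), (30, 32)],  # outside every CoC
}

if __name__ == "__main__":
    for tract_id, coc in match_tracts_to_cocs(tracts, cocs).items():
        print(tract_id, coc)
```

Matching on the centroid assigns every tract to at most one CoC, even when the tract's area straddles a boundary (as with `tract_b`), and leaves tracts whose centroid falls outside all CoC boundaries unmatched (as with `tract_c`).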