datadesk/census-map-downloader

`FIELD_CROSSWALK` incorrect for cartographic congressional districts

Closed · 1 comment

ghing commented

Running `censusmapdownloader --data-dir data congress-carto` produces this error:

Traceback (most recent call last):
  File "/home/codespace/.local/bin/censusmapdownloader", line 11, in <module>
    load_entry_point('census-map-downloader', 'console_scripts', 'censusmapdownloader')()
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/workspaces/census-map-downloader/census_map_downloader/cli.py", line 77, in congress_carto
    obj.run()
  File "/workspaces/census-map-downloader/census_map_downloader/base.py", line 53, in run
    self.process()
  File "/workspaces/census-map-downloader/census_map_downloader/base.py", line 138, in process
    trimmed = gdf[list(self.FIELD_CROSSWALK.keys())]
  File "/home/codespace/.local/lib/python3.8/site-packages/geopandas/geodataframe.py", line 1299, in __getitem__
    result = super(GeoDataFrame, self).__getitem__(key)
  File "/home/codespace/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 3030, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/home/codespace/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1266, in _get_listlike_indexer
    self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
  File "/home/codespace/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1316, in _validate_read_indexer
    raise KeyError(f"{not_found} not in index")
KeyError: "['NAMELSAD'] not in index"

The fields in `FIELD_CROSSWALK` don't seem to match the fields listed in the documentation PDF linked in the code comments (https://www2.census.gov/geo/tiger/GENZ2018/2018_file_name_def.pdf).
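
For reference, a rough sketch of a check that would surface this mismatch with a clearer message than the pandas `KeyError`. This is not the project's code: `missing_fields` is a hypothetical helper, and the crosswalk below is made up for illustration (the only key known from the traceback to be missing is `NAMELSAD`). The column list is the attribute list `ogrinfo` reports for `cb_2018_us_cd116_500k.shp`. In the real `process()` method the columns would presumably come from `gdf.columns`.

```python
def missing_fields(crosswalk, columns):
    """Return the crosswalk keys that are absent from columns, in crosswalk order."""
    available = set(columns)
    return [field for field in crosswalk if field not in available]


# Hypothetical crosswalk for illustration only. The traceback tells us that
# NAMELSAD is a key; the other key and the output names are assumptions.
FIELD_CROSSWALK = {
    "GEOID": "geoid",
    "NAMELSAD": "name",
}

# Attribute fields of cb_2018_us_cd116_500k.shp as reported by ogrinfo.
SHAPEFILE_COLUMNS = [
    "STATEFP",
    "CD116FP",
    "AFFGEOID",
    "GEOID",
    "LSAD",
    "CDSESSN",
    "ALAND",
    "AWATER",
]

missing = missing_fields(FIELD_CROSSWALK, SHAPEFILE_COLUMNS)
if missing:
    print(f"FIELD_CROSSWALK keys not found in shapefile: {missing}")
```

Running a check like this before `gdf[list(self.FIELD_CROSSWALK.keys())]` would name the offending keys up front instead of failing deep inside the pandas indexing code.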

I'll fix this once I get through a bit more testing on adding support for different vintages as part of #8, but wanted to document this somewhere.

ghing commented

These are the fields that are actually in the shapefile:

ogrinfo -so data/raw/cb_2018_us_cd116_500k.shp cb_2018_us_cd116_500k
INFO: Open of `data/raw/cb_2018_us_cd116_500k.shp'
      using driver `ESRI Shapefile' successful.

Layer name: cb_2018_us_cd116_500k
Metadata:
  DBF_DATE_LAST_UPDATE=2019-04-15
Geometry: Polygon
Feature Count: 441
Extent: (-179.148909, -14.548699) - (179.778470, 71.365162)
Layer SRS WKT:
GEOGCRS["NAD83",
    DATUM["North American Datum 1983",
        ELLIPSOID["GRS 1980",6378137,298.257222101,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["latitude",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["longitude",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4269]]
Data axis to CRS axis mapping: 2,1
STATEFP: String (2.0)
CD116FP: String (2.0)
AFFGEOID: String (13.0)
GEOID: String (4.0)
LSAD: String (2.0)
CDSESSN: String (3.0)
ALAND: Integer64 (14.0)
AWATER: Integer64 (14.0)
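
If it helps when comparing vintages, here is a small sketch that pulls the field names out of an `ogrinfo -so` summary like the one above. `parse_fields` and the regular expression are my own assumptions about the line format (`NAME: Type (width.precision)`), not part of GDAL or of this project. The field lines are copied from the output above.

```python
import re

# Field lines copied from the ogrinfo -so summary above.
OGRINFO_FIELDS = """\
STATEFP: String (2.0)
CD116FP: String (2.0)
AFFGEOID: String (13.0)
GEOID: String (4.0)
LSAD: String (2.0)
CDSESSN: String (3.0)
ALAND: Integer64 (14.0)
AWATER: Integer64 (14.0)
"""

# Matches lines such as "GEOID: String (4.0)": name, type, width.precision.
FIELD_LINE = re.compile(r"^(\w+): (\w+) \((\d+)\.(\d+)\)$")


def parse_fields(summary):
    """Map each field name to its OGR type, skipping lines that are not field definitions."""
    fields = {}
    for line in summary.splitlines():
        match = FIELD_LINE.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


fields = parse_fields(OGRINFO_FIELDS)
print(list(fields))
print("NAMELSAD" in fields)
```

The second `print` shows whether `NAMELSAD` is among the parsed fields, which is the key the `KeyError` complains about.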