cfpb/grasshopper

Fix CensusGeocoder address number range queries

As mentioned on #211 (comment), I commented out houseQuery in CensusGeocoder.searchAddress() (see: 6dd5d58#diff-6b3a25548e48dfe8c1891ebf6afef9efR87) in order to get it working after the ES 2.2 upgrade.

While working through the upgrade, I discovered two other bugs related to houseQuery.

  1. We're using the string datatype for the address number range fields (LFROMHN, RFROMHN, LTOHN and RTOHN). Unfortunately, this does not behave as expected with an ES Range Query, since string ranges are compared lexicographically, not numerically. The result is "1000" < "200" < "30".

    The simple fix is to just change the datatype in grasshopper-loader's censusType. However, this is complicated by the fact that not all address numbers are purely numeric. Some address ranges contain letters and other characters, such as -, as well.

    There are probably a few approaches for fixing this, but they'll most likely involve changes to both the loader and CensusGeocoder.

  2. We're assuming that the FROM values will always be less than the TO values. Unfortunately, that is not the case, as seen in the example below, where LFROMHN is "398" and LTOHN is "320".

            {
               "type": "Feature",
               "properties": {
                  "TLID": 84836894,
                  "TFIDL": 214102038,
                  "TFIDR": 214101073,
                  "ARIDL": "4003966015873",
                  "ARIDR": null,
                  "LINEARID": "1103732644476",
                  "FULLNAME": "Rockefeller",
                  "LFROMHN": "398",
                  "LTOHN": "320",
                  "RFROMHN": null,
                  "RTOHN": null,
                  "ZIPL": "71832",
                  "ZIPR": null,
                  "EDGE_MTFCC": "S1400",
                  "ROAD_MTFCC": "S1400",
                  "PARITYL": "E",
                  "PARITYR": null,
                  "PLUS4L": null,
                  "PLUS4R": null,
                  "LFROMTYP": null,
                  "LTOTYP": null,
                  "RFROMTYP": null,
                  "RTOTYP": null,
                  "OFFSETL": "N",
                  "OFFSETR": "N",
                  "STATE": "AR"
               },
               "geometry": {
                  "type": "LineString",
                  "coordinates": [
                     [
                        -94.343073,
                        34.033726
                     ],
                     [
                        -94.34307,
                        34.034189
                     ]
                  ]
               }
            }
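
To make both bugs concrete, here is a rough Python sketch (not the actual Scala code in CensusGeocoder or grasshopper-loader). It shows the lexicographic ordering problem from item 1 and one possible way to handle a reversed range like the one in item 2. The helper names `leading_number`, `normalized_range` and `in_range` are made up for illustration, and taking only the leading digits of values such as "100A" or "12-34" is just an assumption about how non-numeric address numbers could be handled; it is lossy and may not be the right choice for the loader.

```python
import re
from typing import Optional, Tuple


def leading_number(value: Optional[str]) -> Optional[int]:
    """Return the leading run of digits as an int, or None if there is none.

    Assumption: "100A" and "12-34" are reduced to 100 and 12 respectively.
    """
    if value is None:
        return None
    match = re.match(r"\s*(\d+)", value)
    return int(match.group(1)) if match else None


def normalized_range(from_hn: Optional[str],
                     to_hn: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (low, high) regardless of which of FROM/TO is larger."""
    start, end = leading_number(from_hn), leading_number(to_hn)
    if start is None or end is None:
        return None
    return (min(start, end), max(start, end))


def in_range(number: int, from_hn: Optional[str], to_hn: Optional[str]) -> bool:
    """True if number falls inside the normalized FROM/TO range."""
    bounds = normalized_range(from_hn, to_hn)
    return bounds is not None and bounds[0] <= number <= bounds[1]


# Bug 1: strings sort lexicographically, integers sort numerically.
print(sorted(["30", "200", "1000"]))           # ['1000', '200', '30']
print(sorted(["30", "200", "1000"], key=int))  # ['30', '200', '1000']

# Bug 2: the sample feature has LFROMHN "398" and LTOHN "320".
print(normalized_range("398", "320"))          # (320, 398)
print(in_range(350, "398", "320"))             # True
```

If something along these lines were done at load time, the index could store numeric low/high fields alongside the original strings, so the range query would never need to know which end of the segment the numbering starts from. That is only one of the possible approaches mentioned above.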