chris-prener/censusxy

Inconsistent geocoding (single vs batch)

samika27 opened this issue · 1 comment

I've run into an issue with the package. I have a large dataset to geocode. When I use cxy_geocode, many rows come back with NA for location. But when I run cxy_single on a random sample of those same addresses, locations are returned. Any idea why this might be?

Here's one example of an address where this is happening.

g <- cxy_single('3256 n halsted st', 'chicago', 'il', return = 'geographies', vintage = 'Current_Current')

Hi @samika27 - the most likely explanation is that at least one address in your data is not properly formatted per USPS guidelines. The batch function breaks your data frame into chunks and sends those chunks to the Census geocoder. When one chunk has an issue, the Census geocoder is set up to return that entire chunk without geocodes, including the addresses in it that are fine. This is why an address can geocode successfully on its own but come back NA in batch.

We don't have any control over this, unfortunately. My general recommendation in this case is to break up your data frame into smaller bits, and geocode those separately to try and isolate where the malformed addresses are.
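A rough sketch of that approach, in case it helps. The helpers split_into_chunks, find_failed_chunks, and geocode_chunk are hypothetical names, not part of censusxy, and the column names "street", "city", and "state" are assumptions about your data. It also assumes the geocoded output has a cxy_lat column that is NA for rows without a match.

```r
# Hypothetical helper: split a data frame into chunks of at most
# `chunk_size` rows. The result is a named list ("1", "2", ...).
split_into_chunks <- function(df, chunk_size = 100) {
  chunk_id <- ceiling(seq_len(nrow(df)) / chunk_size)
  split(df, chunk_id)
}

# Hypothetical helper: return the names of chunks in which every row is
# missing a latitude, i.e. the chunks that likely hold a malformed address.
# Assumes the latitude column is called `cxy_lat`.
find_failed_chunks <- function(geocoded_chunks, lat_col = "cxy_lat") {
  failed <- vapply(
    geocoded_chunks,
    function(chunk) all(is.na(chunk[[lat_col]])),
    logical(1)
  )
  names(geocoded_chunks)[failed]
}

# Thin wrapper around the batch geocoder. The column names passed here
# are assumptions; replace them with the ones in your data frame.
geocode_chunk <- function(chunk) {
  censusxy::cxy_geocode(
    chunk,
    street = "street", city = "city", state = "state",
    return = "geographies", vintage = "Current_Current"
  )
}

# Usage (needs network access; `addresses` is a hypothetical data frame):
# chunks   <- split_into_chunks(addresses, chunk_size = 50)
# geocoded <- lapply(chunks, geocode_chunk)
# find_failed_chunks(geocoded)
```

Any chunk that gets flagged can be split again with a smaller chunk_size, or its rows can be checked one at a time with cxy_single, until the problem address turns up.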