openaddresses/pyesridump

`fields` parameter never works with pagination

Opened this issue · 1 comments

Overview

Hey, thanks for the great package!

I'm trying to grab data off an esri mapserver API using the EsriDumper class, and only want to grab a few fields (hopefully speeding up download speed and decreasing file size).

What I expected to happen

When I use list(EsriDumper(url, fields=['OBJECTID'])), I get back geometries with only the OBJECTID properties.

What actually happens

I get all of the properties regardless of the fields setting.

Explanation

I noticed that even when I set fields=['OBJECTID'], I was getting back every field anyways. The cause is line 127 in dumper.py:

return data.get('error') and data['error']['message'] != "Failed to execute query."

I think the intention is to return False when the response has an error field and it doesn't have that exact message, and return True in all other cases.

What actually happens is that when there is no error, data.get('error') evaluates to None, and this is then the return value of can_handle_pagination. This gets coerced into a boolean later on in the __iter__ method:

if query_fields and not self.can_handle_pagination(query_fields):

So, the actual logic is that it will only think it can handle pagination when it receives an error other than "Failed to execute query".

Solution

I think this achieves the intended logic in can_handle_pagination:

if 'error' in data and 'message' in data['error']:
    return data['error']['message'] != "Failed to execute query."
return True

As a follow-up: esri mapservers seem to require OBJECTID in the outFields parameter, so this should probably be added by default.