xmunoz/sodapy

SoQL query support for get_all()

kuirolo opened this issue · 3 comments

While I was trying to submit a SoQL query using the get_all() method, it kept throwing an error:

HTTPError: 400 Client Error: Bad Request.
        If $query is used, all options - [$offset] should not be specified in $query.

The official documentation isn't super detailed It looks like the problem is that all options must be in the SoQL query if you use a SoQL query.

My code using get_all is below. But you can reproduce this error with the get() method by including a query parameter with any other SoQL clause.

I came up with a functional solution, but wasn't sure how to update the test or documentation.

However, I thought I would open an issue before making a pull request. For most cases, filtering with the other API parameters is ok. Perhaps the simplest solution is changing the get_all() docstring to note SoQL queries aren't supported.

import json
from sodapy import Socrata

client_params = {'domain': 'health.data.ny.gov',
                'app_token': None,
                'timeout': 10} 

request_params = {'dataset_identifier': 'vn5v-hh5r',
                  'content_type': 'json',
                  'query': """SELECT facility_name, fac_desc_short
                              WHERE fac_desc_short like '%HOSP%'"""}

with Socrata(**client_params) as client:
    response = client.get_all(**request_params)

# Iterating over the generator throws HTTPError error
resp_list = [r for r in response]

# If it actually works, put data here
with open('query_test.json', 'w') as f:
    json.dump(resp_list, f, indent=4)

Thanks for opening this issue! You are absolutely correct, get_all does not support SoQL queries, but if you look at the get implementation and tests, you can see how an example of how to get that working.

Of course! Thanks for maintaining this package, it's the foundation of some important code for my work.

Ok, I'll look at get again.

In the meantime, should I submit a pull request for my version of get_all that supports SoQL queries? Sorry, I should have been more clear in my initial message.

My "functional solution" is an expanded version of get_all that can paginate using either $limit and $offset or a $query string. The trade off is its significantly more complex than the current implementation, so I understand if its not something you want to support in the main branch.

Right now, for all of the endpoints, pagination is up to the user. I think that I would like to preserve this consistency. If you want to add some docs about how to do pagination using $limit and $offset, I wouldn't be opposed to that.