Unable to dump source
Trying to dump https://gismaps.sedgwickcounty.org/arcgis/rest/services/Map/Op_SiteAddress_Dynamic_SP/MapServer/0
returns this error:
2017-03-01 11:44:36,011 - cli.esridump - ERROR - Could not parse response from https://gismaps.sedgwickcounty.org/arcgis/rest/services/Map/Op_SiteAddress_Dynamic_SP/MapServer/0/query?returnCountOnly=true&where=1%3D1&f=json as JSON:
<html><head><title>Request Rejected</title></head><body>The requested URL was rejected. Please consult with your administrator.<br><br>Your support ID is: 11833783245905056836</body></html>
How does one go about dumping a source that is locked down tightly like this one? Is it possible to do with pyesridump as-is? If so, could documentation be added so people who aren't well-versed in arcgis/esri/whatever-term can try different approaches to dealing with problematic servers (this is my situation)?
It looks like their firewall/proxy is misconfigured. Their webmap has query functionality (which makes a request similar to ours) that is failing right now with the same error you're seeing from pyesridump.
Okay, regardless of the misconfigured firewall/proxy on their part, is there any way right now that pyesridump can scrape the data? I tried passing a custom WHERE but got the same thing:
$ esri2geojson -p "WHERE=OBJECTID > 1" https://gismaps.sedgwickcounty.org/arcgis/rest/services/Map/Op_SiteAddress_Dynamic_SP/MapServer/0 us-ks-sedgwick.geojson
2017-03-01 12:51:57,608 - cli.esridump - ERROR - Could not parse response from https://gismaps.sedgwickcounty.org/arcgis/rest/services/Map/Op_SiteAddress_Dynamic_SP/MapServer/0?WHERE=OBJECTID+%3E+1&f=json as JSON:
<html><head><title>Request Rejected</title></head><body>The requested URL was rejected. Please consult with your administrator.<br><br>Your support ID is: 11833783245905262104</body></html>
No, I don't think there's a way pyesridump can handle this server if it won't respond to the /query endpoint.
But you mentioned a brute force method... Did you try it on this layer, and did it work?
I have seen this behavior before on Esri servers. It's pretty rare but does happen.
I can make this request as-is in both wget and chrome and get a JSON response:
https://gismaps.sedgwickcounty.org/arcgis/rest/services/Map/Op_SiteAddress_Dynamic_SP/MapServer/0/1?f=json
So I assume that by starting at 1 (or wherever) and just incrementing until the server returns a 400/404, the data can be scraped.
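The incrementing idea above could be sketched roughly as follows. This is only an illustration, not something pyesridump does today: `fetch_feature` and `dump_by_incrementing` are hypothetical names, and treating an `"error"` key in an HTTP 200 JSON body as "no such feature" is an assumption about how some servers report a missing OID.

```python
import json
import urllib.error
import urllib.request


def fetch_feature(layer_url, oid, timeout=30):
    """Fetch <layer_url>/<oid>?f=json; return parsed JSON, or None if missing."""
    url = "{}/{}?f=json".format(layer_url.rstrip("/"), oid)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        # A 400/404 is taken to mean "no feature with this OID".
        if exc.code in (400, 404):
            return None
        raise
    # Assumption: a server may answer HTTP 200 with an {"error": ...} body
    # instead of an HTTP error status, so treat that as missing too.
    if isinstance(body, dict) and "error" in body:
        return None
    return body


def dump_by_incrementing(layer_url, start=1, fetch=fetch_feature):
    """Yield per-OID responses, counting up from `start` until the first miss."""
    oid = start
    while True:
        result = fetch(layer_url, oid)
        if result is None:
            break  # first missing OID ends the scrape
        yield result
        oid += 1
```

Something like `for item in dump_by_incrementing(layer_url): ...` would then walk the layer one request per feature. It stops at the first missing OID, so it only covers layers whose OIDs start at `start` and have no gaps.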
That will work for this layer because the first OID is 1 and the next one is 2, etc., but in lots of other layers I look at, the first OID is not 1 and there are gaps between the OIDs (this is why I do the OID enumeration method), so this won't be repeatable.
The other annoying thing is that when you fetch by OID, the Esri server won't do the reprojection for you the way we have it do via /query.
I've thought about ways to handle ID gaps/offsets, but until a problematic source comes up, simple incrementing will do for now.
Hi, just passing by: it is also possible to dump with identify.