ahartikainen/pyinfraformat

GTK WFS parse fails

Closed this issue · 2 comments

In some cases owslib fails to parse the GTK WFS output. This might be because GTK's output is invalid, but I'm not sure.

from pyinfraformat import from_gtk_wfs
bbox = [60.1719393074526, 24.938902856956705, 60.18047661573734, 24.980959894310224]
holes = from_gtk_wfs(bbox, coord_system="WGS84", maxholes=200)

Traceback (most recent call last):
  File "C:\Users\user\Miniconda3\envs\spatial\lib\site-packages\IPython\core\interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 3, in
    holes = from_gtk_wfs(bbox, coord_system="WGS84", maxholes=200)
  File "C:\Users\user\Miniconda3\envs\spatial\lib\site-packages\pyinfraformat\core\io.py", line 146, in from_gtk_wfs
    wfs_io = wfs.getfeature(
  File "C:\Users\user\Miniconda3\envs\spatial\lib\site-packages\owslib\feature\wfs100.py", line 288, in getfeature
    u = openURL(base_url, data, method, timeout=self.timeout,
  File "C:\Users\user\Miniconda3\envs\spatial\lib\site-packages\owslib\util.py", line 215, in openURL
    se_tree = etree.fromstring(req.content)
  File "src/lxml/etree.pyx", line 3237, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1896, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1784, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1141, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
  File "", line 10132
XMLSyntaxError: PCDATA invalid Char value 26, line 10132, column 6

With an overlapping area, it works just fine:

bbox = [60.16689651020829, 24.922189711360264, 60.1839715287751, 25.006303786067292]
holes = from_gtk_wfs(bbox, coord_system="WGS84", maxholes=200)

There is an invalid character "\x1a" in the GTK output (decimal 26, the "Char value 26" reported in the traceback), which is why the XML parse fails.

import requests

url = "http://gtkdata.gtk.fi/arcgis/services/Rajapinnat/GTK_Pohjatutkimukset_WFS/MapServer/WFSServer?service=WFS&version=1.0.0&request=GetFeature&bbox=385646.3002644322%2C6672344.5539304055%2C388008.2847590853%2C6673222.970126985&srsname=EPSG%3A3067&typename=Pohjatutkimukset&propertyname=%2A&maxfeatures=1000"

r = requests.get(url)
txt = r.content.decode("UTF-8")
# Show 100 characters around the end of the affected record
txt[txt.find("    3.20   88.00  Ka")-50:][:100]

' Ka\r\n 2.80 83.00 Ka\r\n 3.00 83.00 Ka\r\n 3.20 88.00 Ka\r\n-1 KA\x1a</Rajapinnat_GTK_Pohja'
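One possible workaround, as a minimal sketch only: strip every character that XML 1.0 does not allow before handing the text to a parser. This assumes the raw response is fetched directly (as with requests above) rather than through owslib, which parses the response internally. The helper name `strip_invalid_xml_chars` and the sample string are made up for illustration, and the standard-library parser stands in for lxml here.

```python
import re
import xml.etree.ElementTree as ET

# Characters outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Hypothetical helper: drop characters that are illegal in XML 1.0."""
    return _INVALID_XML_CHARS.sub("", text)


# Made-up sample imitating the tail of the GTK record shown above.
sample = "<root><hole> 3.20 88.00 Ka\r\n-1 KA\x1a</hole></root>"

try:
    ET.fromstring(sample)  # the raw text is not well-formed XML
except ET.ParseError as err:
    print("raw text failed to parse:", err)

cleaned = strip_invalid_xml_chars(sample)
root = ET.fromstring(cleaned)  # parses once "\x1a" is removed
print(repr(root.find("hole").text))
```

This silently drops data, so it is only reasonable if the stray "\x1a" carries no meaning, which seems likely here since it sits at the very end of the record.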

Can we extract JSON from there?