geopython/pywps

How to handle CDATA in xml template?

cehbrecht opened this issue · 2 comments

Description

Currently the CDATA tag is added to the JSON dump/load of a WPSRequest, but it is only needed by the XML templates.
Maybe there is a better way to implement this.

See #555 and #444.

Environment

  • operating system:
  • Python version:
  • PyWPS version: 4.2.9
  • source/distribution
      • git clone
      • Debian
      • PyPI
      • zip/tar.gz
      • other (please specify):
  • web server
      • Apache/mod_wsgi
      • CGI
      • other (please specify):

Steps to Reproduce

Additional Information

Hello,

I am currently working on this topic, and my opinion is that we can remove CDATA everywhere. Here is my current analysis of the situation. On the input side we can remove it without trouble; on the output side it is a little more difficult because of misleading code. The current usage of the json property is misleading because it serves two different purposes: for inputs it serializes the data to be stored in JSON format in the database, while for outputs it is used in the templates to generate the output XML. Moreover, complex data can be text or binary, and JSON cannot serialize binary data, so the data has to be converted to a string. The current solution is to use base64. To summarize:

python object -> json representation -> database store
python object -> json representation -> XML output.
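To make the shared first step of both pipelines concrete, here is a minimal sketch of a JSON-safe representation for text or binary complex data. The helper names `to_json_repr` and `from_json_repr` and the `encoding`/`data` keys are hypothetical and are not taken from the PyWPS code base; only the `base64` and `json` standard library calls are real.

```python
import base64
import json


def to_json_repr(data):
    """Hypothetical helper: build a JSON-safe dict from str or bytes."""
    if isinstance(data, bytes):
        # Binary data cannot be stored in JSON directly, so base64-encode it.
        return {"encoding": "base64",
                "data": base64.b64encode(data).decode("ascii")}
    # Text is stored as-is: no XML escaping and no CDATA wrapper here.
    return {"encoding": "utf-8", "data": data}


def from_json_repr(rep):
    """Hypothetical helper: inverse of to_json_repr."""
    if rep["encoding"] == "base64":
        return base64.b64decode(rep["data"])
    return rep["data"]


text = "<a>1 & 2</a>"
blob = b"\x00\xff\x10"

# The same JSON string could go to the database or to a template.
stored = json.dumps([to_json_repr(text), to_json_repr(blob)])
restored = [from_json_repr(rep) for rep in json.loads(stored)]
```

The point of the sketch is that the JSON string round-trips both values without knowing whether it will end up in the database or in an XML response.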

In my opinion the JSON representation should be the same in both cases and should not depend on how it will finally be used. This means that we should not care about escaping data within the JSON representation; we only need to ensure that it is a valid JSON structure, i.e. that its object tree consists only of dictionaries, lists, numbers and strings. Escaping the data is the responsibility of the XML template. So I am currently pushing the escaping into the template, and the most convenient way is to escape the forbidden characters <, > and & instead of using the more complicated CDATA escape.
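As an illustration of escaping at render time, here is a minimal sketch using only the Python standard library. The function `render_complex_data` and the `<ComplexData>` element are stand-ins for a real template and are assumptions, not the PyWPS templates themselves; `xml.sax.saxutils.escape` is the real standard library function that replaces `&`, `<` and `>`.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def render_complex_data(value):
    """Hypothetical stand-in for an XML template: escape when rendering."""
    # escape() replaces &, < and > with &amp;, &lt; and &gt;.
    return "<ComplexData>{}</ComplexData>".format(escape(value))


# This payload contains "]]>", which cannot appear inside a single
# CDATA section, but plain character escaping handles it without
# any special case.
payload = "<a>1 & 2</a> ]]>"
xml_text = render_complex_data(payload)

# Parsing the rendered XML gives back the original payload.
round_tripped = ET.fromstring(xml_text).text
```

With this approach the value stored in JSON stays unescaped, and only the rendering step knows about XML.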

As a side remark, when etree parses the XML document it removes all escaping, including CDATA, and when it serializes back to an XML string it escapes only the forbidden characters.
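This behaviour can be checked with a short sketch. It uses the standard library `xml.etree.ElementTree` as an assumption about which etree is meant; the `<Data>` element is made up for the example.

```python
import xml.etree.ElementTree as ET

# Input document that wraps its payload in a CDATA section.
doc = "<Data><![CDATA[<a>1 & 2</a>]]></Data>"

elem = ET.fromstring(doc)

# After parsing, the CDATA wrapper is gone and only the raw text remains.
parsed_text = elem.text

# Serializing again escapes the forbidden characters instead of
# restoring the CDATA section.
serialized = ET.tostring(elem, encoding="unicode")
```

Here `parsed_text` is the plain string `<a>1 & 2</a>`, and `serialized` contains `&lt;`, `&gt;` and `&amp;` rather than a CDATA section.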

Best regards

Hello,

I have completely removed the CDATA pattern and moved all escaping/encoding for XML into the templates. This makes JSON serialization and deserialization simpler and saner.

gschwind/PyWPS@3893268