Python error when using UTF8 characters
dixego opened this issue · 2 comments
dixego commented
It seems that coquille can't parse UTF8 characters (or maybe non-ascii characters?); when trying to use a file with spanish accented characters I get the following error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/dixego/.vim/bundle/coquille/autoload/coquille.py", line 129, in coq_next
send_until_fail()
File "/home/dixego/.vim/bundle/coquille/autoload/coquille.py", line 321, in send_until_fail
response = CT.advance(message, encoding)
File "/home/dixego/.vim/bundle/coquille/autoload/coqtop.py", line 247, in advance
r = call('Add', ((cmd, -1), (cur_state(), True)), encoding)
File "/home/dixego/.vim/bundle/coquille/autoload/coqtop.py", line 195, in call
msg = ET.tostring(xml, encoding)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1126, in tostring
ElementTree(element).write(file, encoding, method=method)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 820, in write
serialize(write, self._root, encoding, qnames, namespaces)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 939, in _serialize_xml
_serialize_xml(write, e, encoding, qnames, None)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 939, in _serialize_xml
_serialize_xml(write, e, encoding, qnames, None)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 939, in _serialize_xml
_serialize_xml(write, e, encoding, qnames, None)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 937, in _serialize_xml
write(_escape_cdata(text, encoding))
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1073, in _escape_cdata
return text.encode(encoding, "xmlcharrefreplace")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7: ordinal not in range(128)
however if I remove all accented characters there's no problem anymore. I should note that all accented characters were within comments, not as part of the code.
anderslundstedt commented
Here is a minimal working example in Coq:
Definition n := 0.
Notation "∀" := n.
It seems like Python's xml.etree.ElementTree.tostring
does not handle Unicode correctly. Here is a minimal working example triggering the error in Python:
# This Python file uses the following encoding: utf-8
import xml.etree.ElementTree
element = xml.etree.ElementTree.Element("")
element.text = "∀"
print(xml.etree.ElementTree.tostring(element, "utf-8"))