XML comments cause error in parserator train
az0 commented
The error message can be seen in Travis build 99:
Traceback (most recent call last):
  File "/home/travis/virtualenv/python2.7.14/bin/parserator", line 11, in <module>
    sys.exit(dispatch())
  File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/parserator/main.py", line 58, in dispatch
    args.func(args)
  File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/parserator/main.py", line 85, in train
    training.train(module, train_file_list, modelfile)
  File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/parserator/training.py", line 83, in train
    training_data = list(readTrainingData(train_file_list, module.GROUP_LABEL))
  File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/parserator/training.py", line 62, in readTrainingData
    sequence_xml = etree.fromstring(component_string)
  File "src/lxml/etree.pyx", line 3212, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1876, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1764, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1126, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 600, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 710, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 639, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Document is empty, line 1, column 1
make: *** [probablepeople/generic_learned_settings.crfsuite] Error 1
Would you please allow XML comments? They could help organize the training and test data sets.
fgregg commented
It should be possible (see https://stackoverflow.com/questions/18313818/how-to-not-load-the-comments-while-parsing-xml-in-lxml), but it will need to be changed in parserator.
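
For illustration, here is a minimal sketch of the approach from that answer: build an lxml parser with `remove_comments=True` so comment nodes never reach the code that walks the tree. The sample data and the helper name `read_sequences` are hypothetical, and this is not parserator's actual code, only a guess at the shape of the change.

```python
from lxml import etree

# Hypothetical training data: one labeled sequence plus an organizing comment.
TRAINING_XML = b"""<Collection>
  <!-- company names -->
  <Name><CorporationName>Acme</CorporationName> <CorporationLegalType>Inc</CorporationLegalType></Name>
</Collection>"""


def read_sequences(xml_bytes):
    """Return the root's child elements, with comments dropped at parse time."""
    # remove_comments=True tells lxml to discard comment nodes while parsing.
    parser = etree.XMLParser(remove_comments=True)
    root = etree.fromstring(xml_bytes, parser)
    return list(root)


sequences = read_sequences(TRAINING_XML)
tags = [seq.tag for seq in sequences]
print(tags)
```

With the default parser, the comment would show up as an extra child of the root, so any code that serializes and re-parses each child would also have to handle it.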
az0 commented
Today, XML comments are working fine.