datamade/probablepeople

XML comments cause error in parserator train

Opened this issue · 2 comments

az0 commented

The error message can be seen in Travis build 99:

Traceback (most recent call last):
  File "/home/travis/virtualenv/python2.7.14/bin/parserator", line 11, in <module>
    sys.exit(dispatch())
  File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/parserator/main.py", line 58, in dispatch
    args.func(args)
  File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/parserator/main.py", line 85, in train
    training.train(module, train_file_list, modelfile)
  File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/parserator/training.py", line 83, in train
    training_data = list(readTrainingData(train_file_list, module.GROUP_LABEL))
  File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/parserator/training.py", line 62, in readTrainingData
    sequence_xml = etree.fromstring(component_string)
  File "src/lxml/etree.pyx", line 3212, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1876, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1764, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1126, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 600, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 710, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 639, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Document is empty, line 1, column 1
make: *** [probablepeople/generic_learned_settings.crfsuite] Error 1

Would you please allow XML comments? They could help organize the training and test data sets.
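
Until comments are accepted, one possible workaround is to strip them from the training file before running the train step. The following is only a minimal sketch under stated assumptions: it uses the standard library's `xml.etree.ElementTree` rather than lxml (which the traceback shows parserator using), and the sample data, tag names, and `strip_comments` helper are hypothetical, not part of parserator or probablepeople.

```python
import xml.etree.ElementTree as ET

# Hypothetical training snippet; the tag names are illustrative only.
RAW = """<Collection>
  <!-- household names -->
  <Name><GivenName>Jane</GivenName> <Surname>Doe</Surname></Name>
  <!-- corporations -->
  <Name><CorporationName>Acme</CorporationName></Name>
</Collection>"""


def strip_comments(xml_text):
    """Return xml_text re-serialized without XML comments (hypothetical helper)."""
    # The stdlib parser's default tree builder discards comments,
    # so a parse / serialize round trip removes them.
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


cleaned = strip_comments(RAW)
```

The cleaned text could then be written to a temporary file and passed to the train step in place of the commented original, so the comments stay in the version kept under source control.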

az0 commented

As of today, XML comments are working fine.