Frog output on pretokenised FoLiA input does not make it into the FoLiA output
proycon opened this issue · 2 comments
proycon commented
Something is still wrong when processing pre-tokenised FoLiA: the annotations are not making it into the FoLiA output (but they DO make it into the columned stdout output!)
Input: https://lst.science.ru.nl/~proycon/issue72_a.xml
Command: $ frog --skip=tmncpa --language=nld issue72_a.xml -X test.xml
The same thing also occurs when the tokeniser is not skipped explicitly, and when --language=nld is omitted.
kosloot commented
Hmm, interesting...
So the work is done and then forgotten. Will look into it.
kosloot commented
OK, so the problem was that when the tokenizer was skipped, the text was not assigned a language at all, not even "default". Because of the resulting language mismatch, part of the processing was skipped.
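To make the failure mode concrete, here is a toy Python model of it. This is not Frog's actual code (Frog is written in C++), and every name in it (`should_annotate_buggy`, `should_annotate_fixed`, `DEFAULT_LANGUAGE`) is hypothetical. It only illustrates how an unset language can fail a language check, and how falling back to a default would avoid that; the real fix may well look different.

```python
from typing import Optional

# Hypothetical marker for "no specific language"; not taken from Frog's source.
DEFAULT_LANGUAGE = "default"


def should_annotate_buggy(text_language: Optional[str], configured: str) -> bool:
    """Toy version of the bug: an unset language never matches, so the step is skipped."""
    return text_language == configured


def should_annotate_fixed(text_language: Optional[str], configured: str) -> bool:
    """Toy version of a fix: treat an unset language as the default, which is accepted."""
    effective = text_language if text_language is not None else DEFAULT_LANGUAGE
    return effective in (configured, DEFAULT_LANGUAGE)


if __name__ == "__main__":
    # With the tokenizer skipped, nothing assigned a language to the text.
    print(should_annotate_buggy(None, "nld"))
    print(should_annotate_fixed(None, "nld"))
```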
This should be fixed now. Please test.
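One possible way to test this is to check the FoLiA output for words that lack annotations. The sketch below uses only the Python standard library and the FoLiA namespace `http://ilk.uvt.nl/folia`. The function name `words_missing_annotations` and the inline sample document are made up for illustration, and it assumes that the command above (`--skip=tmncpa`) leaves the POS tagger and lemmatiser active, so that every `<w>` element should carry a `<pos>` and a `<lemma>` child.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "{http://ilk.uvt.nl/folia}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Made-up miniature FoLiA document: the second word has no annotations.
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" xml:id="example">
  <text xml:id="example.text">
    <s xml:id="example.s.1">
      <w xml:id="example.s.1.w.1"><t>Dit</t><pos class="VNW"/><lemma class="dit"/></w>
      <w xml:id="example.s.1.w.2"><t>werkt</t></w>
    </s>
  </text>
</FoLiA>"""


def words_missing_annotations(xml_text, required=("pos", "lemma")):
    """Return (word id, missing annotation tags) for each <w> lacking a required child."""
    root = ET.fromstring(xml_text)
    missing = []
    for word in root.iter(FOLIA_NS + "w"):
        absent = [tag for tag in required if word.find(FOLIA_NS + tag) is None]
        if absent:
            missing.append((word.get(XML_ID), absent))
    return missing


if __name__ == "__main__":
    for word_id, absent in words_missing_annotations(SAMPLE):
        print(word_id, "is missing:", ", ".join(absent))
```

To run it against real output, read the contents of test.xml and pass them to `words_missing_annotations` instead of `SAMPLE`; an empty result would suggest the annotations now reach the FoLiA output.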