Regression: Frog no longer outputs both FoLiA XML and columned output when requested
proycon opened this issue · 2 comments
Invoking frog as follows:
frog --language=nld -t test.txt -n --skip=npc --id=test -X test.xml -o test.out
This used to produce both a test.out file and a test.xml file. Now it only produces columned output and disregards the FoLiA.
The frog webservice relies on this behaviour to produce both files in a single pass, and it now breaks as a result.
Well, I cannot reproduce this:
$ cat > test.txt
Dit is een test.
^D
$ frog --language=nld -t test.txt -n --skip=npc --id=test -X test.xml -o test.out
frog 0.32 (c) CLST, ILK 1998 - 2023
CLST - Centre for Language and Speech Technology,Radboud University
ILK - Induction of Linguistic Knowledge Research Group,Tilburg University
based on [ucto 0.31, libfolia 2.18, timbl 6.10, ticcutils 0.35, mbt 3.11]
...
...
frog-:Frogging in total took: 0 seconds, 6 milliseconds and 822 microseconds
frog-:FoLiA stored in test.xml
frog-:results stored in test.out
frog-:Sat Nov 25 09:02:45 2023 Frog finished
$ more test.out
1 Dit dit [dit] VNW(aanw,pron,stan,vol,3o,ev) 0.777085
2 is zijn [zijn] WW(pv,tgw,ev) 0.999891
3 een een [een] LID(onbep,stan,agr) 0.999113
4 test test [test] N(soort,ev,basis,zijd,stan) 0.903055
5 . . [.] LET() 1.000000
$ more test.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="folia.xsl"?>
<FoLiA xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://ilk.uvt.nl/folia" xml:id="test" generator="libfolia-v2.18" version="2.5.4">
<metadata type="native">
<annotations>
So both output formats are produced!
What version of Frog and other tools do you use?
OK, there are some problems in the MBMA data files.
I implemented a quick fix, but it would be better to fix the data (which is a tedious job).
NOTE: strictly speaking this is not a bug. Frog is unable to handle (part of) an input and continues with the next file.
But this comes as a surprise to the user: no FoLiA can be created, although partial tabbed output is possible (up to the point where the problem occurs).
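Since a run can end with tabbed output but no FoLiA, a caller that needs both files (such as a webservice wrapper) may want to check for them explicitly rather than trust the run. Below is a minimal sketch of such a check. The helper names `missing_outputs` and `run_and_check` are hypothetical and not part of Frog, and the sketch assumes nothing about Frog's exit codes; it only looks at which files exist afterwards.

```python
import subprocess
from pathlib import Path


def missing_outputs(paths):
    """Return the paths that do not exist or are empty (hypothetical helper)."""
    missing = []
    for p in map(Path, paths):
        if not p.is_file() or p.stat().st_size == 0:
            missing.append(str(p))
    return missing


def run_and_check(cmd, expected_outputs):
    """Run cmd and return (exit code, list of missing or empty outputs)."""
    # Remove stale outputs first, so a file left over from an earlier run
    # cannot hide the fact that this run did not produce it.
    for p in map(Path, expected_outputs):
        p.unlink(missing_ok=True)
    result = subprocess.run(cmd, check=False)
    return result.returncode, missing_outputs(expected_outputs)
```

With the invocation from this issue, one could call `run_and_check(["frog", "--language=nld", "-t", "test.txt", "-n", "--skip=npc", "--id=test", "-X", "test.xml", "-o", "test.out"], ["test.xml", "test.out"])` and treat a non-empty second element as a failed run, even when test.out was written.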