Doesn't work for simple example
<dictionary type="sequential">
<sdefs>
<sdef n="det"/>
<sdef n="abl"/>
<sdef n="dem"/>
<sdef n="n"/>
<sdef n="cnjadv"/>
</sdefs>
<section id="main" type="standard">
<e><p><l>bu<s n="det"/><s n="dem"/><j/>yüz<s n="n"/><s n="abl"/></l>
<r>bu<b/>yüzden<s n="cnjadv"/></r></p></e>
</section>
</dictionary>
Then compile:
$ lsx-comp lr apertium-tur-uzb.tur-uzb.lsx tur-uzb.autosep.bin
main@standard 11 10
Show the transducer:
$ lt-print tur-uzb.autosep.bin
0 1 b b 0.000000
1 2 u u 0.000000
2 3 <det> 0.000000
3 4 <dem> y 0.000000
4 5 <$> ü 0.000000
5 6 y z 0.000000
6 7 ü d 0.000000
7 8 z e 0.000000
8 9 <n> n 0.000000
9 10 <abl> <cnjadv> 0.000000
10 0.000000
But it doesn't work:
$ echo "^bu<det><dem>$ ^yüz<n><abl>$" | lsx-proc tur-uzb.autosep.bin
^bu<det><dem>$ ^yüz<n><abl>$
Expected output is:
^bu yüzden<cnjadv>$
@jonorthwash @itang1 @unhammer any ideas?
This is exactly the sort of problem we were having with apertium/apertium-eng-deu#4. We never really reported it officially.
I hope this issue gets solved.
You might need <t> and/or <g> somewhere. See examples at #2 (comment).
It shouldn't need <t> because the tags are fixed. <g> is only needed if there is a #, right? Or does it have another meaning?
@xavivars, @hectoralos, any thoughts about what @ftyers is doing wrong here?
I don't really know much about apertium-separable, beyond having fixed the null-flushing (I hope!). Honestly, I know very little about the format.
On the issue you link to, I think I just played with Hector's rule until it worked...
Unfortunately, my knowledge of the module (which needs better documentation) is just as limited. I just replicated some constructions Fran wrote in fra-cat. I've been comparing what is used in apertium-fra-cat with what is in this example, and in fra-cat there are a couple of <t/>. I tried adding <t/> to see whether it would help, but I still couldn't match ^bu<det><dem>$ ^yüz<n><abl>$.
@ftyers, try the code I committed in apertium/apertium-tur-uzb@5081938. It works for me now.
Basically I just added <j/>.
Great, that should definitely go in the documentation, or alternatively the compiler should be updated to automatically add <j/> at the end of every entry.
I would file a new issue about entries not working without a final <j/>, suggesting that solution.
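For anyone hitting the same thing, here is a rough way to see the missing boundary in the lt-print dump from the original report. It is only a sketch and rests on two assumptions: that <j/> is what compiles to the <$> symbol (the dump suggests this, since <$> sits exactly where the <j/> is in the entry), and that the arcs form a single linear path listed in order, which holds for this dump but not in general. The helper name input_side is made up for illustration.

```python
# lt-print output copied from the report above (whitespace-separated fields).
LT_PRINT = """\
0 1 b b 0.000000
1 2 u u 0.000000
2 3 <det> 0.000000
3 4 <dem> y 0.000000
4 5 <$> ü 0.000000
5 6 y z 0.000000
6 7 ü d 0.000000
7 8 z e 0.000000
8 9 <n> n 0.000000
9 10 <abl> <cnjadv> 0.000000
10 0.000000
"""


def input_side(dump: str) -> list[str]:
    """Collect the input symbol of every arc, in the order the arcs are listed.

    Assumes a single linear path, as in the dump above.
    """
    symbols = []
    for line in dump.splitlines():
        fields = line.split()
        # Arc lines carry source, target, input symbol, (output symbol), weight.
        # Final-state lines such as "10 0.000000" have two fields and are skipped.
        if len(fields) >= 4:
            symbols.append(fields[2])
    return symbols


symbols = input_side(LT_PRINT)
print("".join(symbols))
print("ends with <$>:", symbols[-1] == "<$>")
```

Only the <$> between the two words shows up on the input side; there is none after <abl>. That would be consistent with the closing $ of ^yüz<n><abl>$ having nothing to match until a final <j/> is added to the entry, though I haven't checked this against the lsx-proc source.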