coli-saar/alto

Incorrect derivations on "larger" grammars

akoehn opened this issue · 12 comments

I have a curious case where I grew my grammar with 46 rules to 50 and derivations only needing the initial 46 rules suddenly changed (i.e., the derivation produced by viterbi is correct but not optimal)

The derivations I am looking at (both the correct one and the incorrect one) do not use any of the four new rules.

Commenting out any(!) two unrelated rules (two of the new ones or two of the old ones) fixes the derivation. Therefore, this seems to be an alto bug to me.

This is just an initial issue description, will update later with more info once I had some time to look into this myself.

Does the grammar have two rules with the same terminal symbol?

No. And as I said, I can comment out two of the new rules and everything works or I keep all four of the new rules and comment out two of the old rules (which worked fine before) and everything works -- except derivations with the two commented-out rules of course.

And we should really have a check for the same terminal symbol case while reading an irtg -- this is a bad trap people can fall into.

Is this problem still there? Could you post the grammar so I can reproduce the behavior?

Yes, steps to reproduce:

git clone git@github.com:minecraft-saar/minecraft-nlg.git
cd minecraft-nlg
./gradlew test
# observe that all tests are successful
# now re-enable three rules I commented out. For the lazy:
sed -i '190,190 d; 200,200 d' src/main/resources/de/saar/coli/minecraft/minecraft.irtg
./gradlew test
# observe that the last test fails even though the new rules have nothing to do with the generated text

It seems that merely adding two rules that are not even part of the now resulting tree prohibits the correct (shortest) tree to be found:
AssertionFailedError: expected: <build a row in front of the previous row> but was: <build a row to the right of length four in front of the previous row>
Notice that the re-enabled rules are for describing height and neither the expected derivation nor the now incorrectly built derivation use rules abouth height.

Edit in case this cannot be reproduced later on: the minecraft-nlg commit I tested this with is 4da39a42cdfbf84f1c835e074fb0d99facbf67ec

I checked the derivation and the original tree is still derivable with the extended irtg, it still has the correct interpretations and it has the better score. So it seems like something weird is going on in the viterbi decoding.

This is the relevant snippet from my code:

var target = this.getTreeForInstruction(List.of("a", "row",  "in front of",  "the", "previous", "row"));
// ta is the tree automaton I use.  All rules have a weight of 0.9
bestTree = ta.viterbi();
// bestTree should be the tree with the highest possible score, but ...
System.out.print("accepts target: ");
System.out.println(ta.accepts(target)); // true
System.out.print("viterbi result tree weight: ");
System.out.println(ta.getWeight(bestTree)); // 0.313
System.out.print("target tree weight: ");
System.out.println(ta.getWeight(target)); // 0.38742

i.e. viterbi does not find the correct tree. I spoke with @jgroschwitz and he asked me whether a concrete ta works. And indeed, adding ta = ta.asConcreteTreeAutomaton(); at the top fixes the problem. There seems to be a problem with lazy evaluation somewhere.

I think there's something funky with the way rules are stored in lazy automata. On the one hand, the RuleStore class doesn't always work as I expect it to (I have more a general feeling than a concrete example though, unfortunately). On the other hand, whenever one implements a new TreeAutomaton class, I think one has to make sure that getRulesBottomUp and getRulesTopDown store the rules properly; but I am not aware of this being documented or really what the best practice should be.

Hi, I can't reproduce this; the test succeeds for me even after removing the comment lines, with both the current Git revision and with 4da3.

Could you write a self-contained unit test for Alto that exhibits the problem? (The grammar can live in its own file, we'll put it in the resources directory.)

@alexanderkoller test is now in a new branch: viterbi-derivation-test

Great, thanks. Now I can reproduce it.

Some preliminary observations:

  • One core problem is that the decomp automaton of SubsetAlgebra does not support top-down queries. This caused an error when InvHomAutomaton# called its makeAllRulesExplicit. Because it was apparently too easy to just ignore this error message, I changed it into an exception (on the branch).
  • The best tree of the concrete automaton is accepted by the original automaton, with the correct weight. Thus the problem is somewhere in the way viterbi accesses the rule store.
  • The issue uses a lot of badly maintained classes; not just InverseHomAutomaton (see #70), but also IntersectionAutomaton. If the original grammar could use nondeleting homomorphisms, everything would become neater and much faster.
  • The rule stores of both automata contain the same rules for bottom-up and top-down, so that can't really be the source of the discrepancy.
  • Calling makeAllRulesExplicit on the original automaton before calling viterbi does not help.