mynlp/ccg2lambda

Selective Jigg kbest parsing


When Jigg is run with a k-best option like -ccg.kBest 10 -file, the system tries to generate a *.sem.xml file covering all of the k-best parses. If any of those parses is not a single tree (that is, its root attribute lists more than one span), the system fails to generate the *.sem.xml file at all. For instance, if you run

echo "誰かがパンを食べた。" > test.txt
./ja/rte_ja.sh test.txt ja/semantic_templates_ja_event.yaml

you get the following error:

$ cat ja_parsed/test.txt.sem.err
Traceback (most recent call last):
  File "scripts/semparse.py", line 126, in <module>
    main()
  File "scripts/semparse.py", line 86, in main
    sentence, semantic_index, tree_index)
  File "/Users/kojimineshima/ccg2lambda/scripts/ccg2lambda_tools.py", line 98, in assign_semantics_to_ccg
    ccg_tree = build_ccg_tree(ccg_flat_tree)
  File "/Users/kojimineshima/ccg2lambda/scripts/ccg2lambda_tools.py", line 40, in build_ccg_tree
    root_span = copy.deepcopy(semantic_index.find_node_by_id(root_id, ccg_xml))
  File "/Users/kojimineshima/ccg2lambda/scripts/semantic_index.py", line 133, in find_node_by_id
    raise(ValueError('It should have found a span for id {0}'.format(node_id)))
ValueError: It should have found a span for id s0_sp75 s0_sp80 s0_sp87

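This happens even though the top five parses (s0_ccg0 through s0_ccg4) are well-formed trees. Only the sixth parse, s0_ccg5, has more than one root span: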

<ccg score="954.9721231460571" id="s0_ccg0" root="s0_sp0"></ccg>
<ccg score="831.0679993629456" id="s0_ccg1" root="s0_sp15"></ccg>
<ccg score="801.8002367019653" id="s0_ccg2" root="s0_sp30"></ccg>
<ccg score="691.1770849227905" id="s0_ccg3" root="s0_sp45"></ccg>
<ccg score="681.8826441764832" id="s0_ccg4" root="s0_sp60"></ccg>
<ccg score="679.694173336029" id="s0_ccg5" root="s0_sp75 s0_sp80 s0_sp87">

The system should skip non-tree parses such as id="s0_ccg5" (root="s0_sp75 s0_sp80 s0_sp87") when generating sem files, and process only the parses that have a single root span.
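To illustrate the requested behaviour, here is a minimal sketch of such a filter. It is not the fix from any ccg2lambda branch: the helper names is_single_tree and keep_well_formed_parses are hypothetical, the enclosing <sentence> element is an assumption about the Jigg output layout, and the standard-library xml.etree.ElementTree is used only to keep the example self-contained.

```python
import xml.etree.ElementTree as ET


def is_single_tree(ccg_node):
    """Return True if the <ccg> node names exactly one root span."""
    # A fragmented parse lists several span ids separated by whitespace,
    # e.g. root="s0_sp75 s0_sp80 s0_sp87".
    root_ids = ccg_node.get('root', '').split()
    return len(root_ids) == 1


def keep_well_formed_parses(sentence_node):
    """Remove, in place, every <ccg> child that is not a single tree."""
    # Iterate over a copy of the list because we remove while looping.
    for ccg_node in list(sentence_node.findall('ccg')):
        if not is_single_tree(ccg_node):
            sentence_node.remove(ccg_node)
    return sentence_node


# Hypothetical, abbreviated input modelled on the k-best output above.
EXAMPLE = """
<sentence id="s0">
  <ccg score="954.9721231460571" id="s0_ccg0" root="s0_sp0"></ccg>
  <ccg score="681.8826441764832" id="s0_ccg4" root="s0_sp60"></ccg>
  <ccg score="679.694173336029" id="s0_ccg5" root="s0_sp75 s0_sp80 s0_sp87"></ccg>
</sentence>
"""

sentence = keep_well_formed_parses(ET.fromstring(EXAMPLE.strip()))
kept_ids = [ccg.get('id') for ccg in sentence.findall('ccg')]
print(kept_ids)  # ['s0_ccg0', 's0_ccg4']
```

Filtering before semantic assignment means one fragmented parse no longer prevents the sem file from being written for the remaining well-formed parses.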

Thank you very much for reporting this issue.

I think I have fixed that issue in a different branch. Can you:

git checkout theorem

and see if the problem is solved according to your expectations?

Yes, it has already been fixed in the theorem branch. Thank you very much!
When are you going to merge it into the master branch?

Thank you very much for checking the theorem branch! I will merge as soon as we confirm that there are no unexpected behaviours in master. Thank you!