ratishsp/data2text-plan-py

File missing during evaluation

CarinaXZZ opened this issue · 9 comments

I am trying to run the automatic evaluation with the IE metrics, but it fails with an IOError: [Errno 2] No such file or directory: 'roto-ie.dict'.
Where can I obtain this file?

Also, is there any code provided to calculate the BLEU score?

Thanks.

The IE files can be obtained from the links in the README of https://github.com/ratishsp/data2text-1.
For the BLEU score, we use the Moses script: https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/multi-bleu.perl
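For anyone who prefers to call the script from Python, here is a minimal sketch. It relies on the script's documented usage (reference file as an argument, hypotheses on stdin, optional -lc for lowercased scoring). The helper names build_bleu_command and run_bleu, and any paths passed to them, are hypothetical and not part of this repository.

```python
import subprocess


def build_bleu_command(script_path, gold_path, lowercase=False):
    """Build the argument list for multi-bleu.perl (reference file as argument)."""
    cmd = ["perl", script_path]
    if lowercase:
        cmd.append("-lc")  # score on lowercased text
    cmd.append(gold_path)
    return cmd


def run_bleu(script_path, gold_path, pred_path, lowercase=False):
    """Run multi-bleu.perl with the predictions fed on stdin; return its stdout."""
    with open(pred_path, "r", encoding="utf-8") as pred_file:
        result = subprocess.run(
            build_bleu_command(script_path, gold_path, lowercase),
            stdin=pred_file,
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.strip()
```

Both files are expected to hold one summary per line, in the same order.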

Thanks for the information.
For the BLEU score, I tried the following:
perl tools/multi-bleu.perl gold_generation_fi < pred_generation_fi
What is the gold file?

The gold file contains the human-written summaries from https://github.com/harvardnlp/boxscore-data
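A rough sketch of how such a gold file could be written, one summary per line. It assumes each split in the boxscore-data RotoWire JSON is a list of game records whose "summary" field is a list of tokens; please check this against the downloaded data. The function names and file paths are hypothetical.

```python
import json


def summaries_to_lines(entries):
    """Join each record's tokenized summary into a single space-separated line."""
    # Assumption: entry["summary"] is a list of token strings.
    return [" ".join(entry["summary"]) for entry in entries]


def write_gold_file(json_path, out_path):
    """Read a split's JSON file and write one human-written summary per line."""
    with open(json_path, encoding="utf-8") as f:
        entries = json.load(f)
    with open(out_path, "w", encoding="utf-8") as out:
        for line in summaries_to_lines(entries):
            out.write(line + "\n")
```

The resulting file can then be passed to multi-bleu.perl as the reference, provided its line order matches the order of the generated summaries.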

How can I generate the test files? The dataset provided doesn't contain the test content plan, target, inf_src_test, pointers file, etc.

If I try to use scripts/create_dataset.py, it requires ORACLE_IE_OUTPUT (roto_$MODE-beam5_gens.h5-tuples.txt) for the test and valid datasets, which I can't find. Are these the "gold" files (roto-gold-val.h5-tuples.txt)?

The test input file is src_test.txt.
The test content plan will be generated by the model as part of the stage 1 output.
The target test file contains the human-written summaries from https://github.com/harvardnlp/boxscore-data
The pointers file is only needed for training.

In that case, @ratishsp, can you share the script to create src_test.txt/src_valid.txt from the JSON files?

Hi @shubhamagarwal92, I have now slightly reorganized the folder structure:
The test files are in the rotowire/test folder: https://drive.google.com/drive/folders/1ODRFKxpsvzSw086pY1dysTj6SaG265VD
The validation files to be used during inference are available in the rotowire folder itself: https://drive.google.com/drive/folders/1R_82ifGiybHKuXnVnC8JhBTW8BAkdwek
These files have the prefix 'inf': inf_src_valid.txt and inf_tgt_valid.txt

Thanks for this. Can you also share the corresponding script (or point me to it) that creates these files (src_test.txt/inf_src_valid.txt)?

Essentially, the method

def box_preproc2(entry):

creates src_test.txt and inf_src_valid.txt.