Adding example inputs and expected outputs
danieldeutsch opened this issue · 7 comments
Hi Ratish,
Sebastian Gehrmann mentioned supporting your dataset-specific evaluation in the GEM metrics library. To do that, I will convert this codebase to a Docker container, add it to my Repro library, and then add it to the GEM library.
Could you provide some example inputs and expected outputs for this code? It will make it much easier for me to make sure that I've faithfully Dockerized your code.
Thanks!
I also don't know much about Lua, so any other specific information about the runtime would be very helpful (e.g., which versions of Lua and Python, specific Python libraries, etc.).
Hi Daniel, thanks for helping!
The Torch version is 7. I followed the steps at http://torch.ch/docs/getting-started.html to install Lua Torch.
The list of instructions for evaluation is at https://github.com/ratishsp/data2text-macro-plan-py/blob/main/README_MLB.md#evaluation
The input to Step 2,
python add_segment_marker.py -input_file $GEN/$IDENTIFIER-beam5_gens.txt -output_file \
$GEN/$IDENTIFIER-segment-beam5_gens.txt
is test_gold.txt.
For the command
python mlb_data_utils.py -mode prep_gen_data -gen_fi $GEN/$IDENTIFIER-segment-beam5_gens.txt \
-dict_pfx "$IE_ROOT/data/mlb-ie" -output_fi $DOC_GEN/transform_gen/$IDENTIFIER-beam5_gens.h5 \
-input_path "$IE_ROOT/json" \
-ordinal_inning_map_file $GEN/$IDENTIFIER-inning-map-beam5_gens.txt \
-test
the dict_pfx files and the input_path JSON files are at https://drive.google.com/drive/folders/1q9xpjIBkF7YOerXE6eSiSDq158kjw8Nn
Note that this Python command requires Python 2.7.
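If it helps with the Docker setup, the prep_gen_data command above could be assembled from Python roughly like this. This is only a sketch: the function name `build_prep_gen_command` and the interpreter name `python2.7` are assumptions for illustration, and the flags are copied from the command above.

```python
def build_prep_gen_command(gen, identifier, ie_root, doc_gen, interpreter="python2.7"):
    """Return the argument list for the prep_gen_data step.

    gen, identifier, ie_root and doc_gen stand in for the shell variables
    $GEN, $IDENTIFIER, $IE_ROOT and $DOC_GEN. The interpreter name is an
    assumption; use whatever name Python 2.7 has in the container.
    """
    return [
        interpreter, "mlb_data_utils.py",
        "-mode", "prep_gen_data",
        "-gen_fi", "{}/{}-segment-beam5_gens.txt".format(gen, identifier),
        "-dict_pfx", "{}/data/mlb-ie".format(ie_root),
        "-output_fi", "{}/transform_gen/{}-beam5_gens.h5".format(doc_gen, identifier),
        "-input_path", "{}/json".format(ie_root),
        "-ordinal_inning_map_file", "{}/{}-inning-map-beam5_gens.txt".format(gen, identifier),
        "-test",
    ]
```

The returned list can be passed to `subprocess.run(command, check=True)`, so that a non-zero exit from the Python 2.7 step fails the build instead of passing silently.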
The output of the command
th extractor.lua -gpuid 0 -datafile $IE_ROOT/data/mlb-ie.h5 \
-preddata $DOC_GEN/transform_gen/$IDENTIFIER-beam5_gens.h5 -dict_pfx \
"$IE_ROOT/data/mlb-ie" -just_eval -ignore_idx 14 -test
is https://github.com/ratishsp/mlb-ie/blob/master/test_mlb-beam5_gens.h5-tuples.txt
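One way to check that the Dockerized code reproduces this output is to compare the tuples file it writes against the reference file line by line. The sketch below assumes both files are plain UTF-8 text and that an exact, order-sensitive match is the right test; the function names are made up for illustration.

```python
from itertools import zip_longest


def read_lines(path):
    """Read a text file and return its lines without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def diff_outputs(produced_path, expected_path):
    """Return (line_number, produced, expected) for every differing line.

    A line that exists in only one of the two files is reported with
    None on the side where it is absent.
    """
    mismatches = []
    pairs = zip_longest(read_lines(produced_path), read_lines(expected_path))
    for number, (produced, expected) in enumerate(pairs, start=1):
        if produced != expected:
            mismatches.append((number, produced, expected))
    return mismatches
```

An empty list from `diff_outputs` would mean the container output matches the reference file exactly; any entries point to the first places to look.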
Hope it helps!