
WikiTableT

Code, data, and pretrained models for the paper "Generating Wikipedia Article Sections from Diverse Data Sources"

Note: we refer to the section data as hyperlink data in both the processed json files and the codebase.

Resources

Dependencies

Usage

To train a new model, you may use a command similar to scripts/train_large_copy_cyc.sh.
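A minimal launch sketch, assuming you run it from the repository root; the training hyperparameters themselves are configured inside the script, and the guard below is only illustrative:

```shell
# Hypothetical launcher: the script path comes from the README,
# everything else (the existence check, the message) is illustrative.
TRAIN_SCRIPT=scripts/train_large_copy_cyc.sh

if [ -f "$TRAIN_SCRIPT" ]; then
    bash "$TRAIN_SCRIPT"
else
    echo "$TRAIN_SCRIPT not found; run this from the repository root"
fi
```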

To perform beam search generation with a trained model, you may use a command similar to scripts/generate_beam_search.sh. The process generates 4 files, including references; 2 of them are tokenized with NLTK for use in the later evaluation steps.

If you want to generate your own version of the reference data for computing PARENT scores, use a command similar to scripts/convert2parent_dev.sh.

Once you have the generated file, you can evaluate it against the reference with the command scripts/eval_dev.sh REF_FILE_PATH GEN_FILE_PATH. Make sure you pass the tokenized files.
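A sketch of the evaluation call with its two positional arguments; the script name and argument order come from the README, but the file names below are hypothetical placeholders for the NLTK-tokenized outputs of the generation step:

```shell
# Hypothetical file names -- substitute the tokenized reference and
# generation files actually produced by your generation run.
REF_FILE_PATH=outputs/dev.ref.tok
GEN_FILE_PATH=outputs/dev.gen.tok

# Evaluate only if the helper script is present (i.e. we are in the
# repository root); the guard itself is illustrative.
if [ -f scripts/eval_dev.sh ]; then
    bash scripts/eval_dev.sh "$REF_FILE_PATH" "$GEN_FILE_PATH"
else
    echo "scripts/eval_dev.sh not found; run this from the repository root"
fi
```

Note that the reference file comes first and the generated file second; swapping them can silently change precision/recall-style metrics.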

Acknowledgement

Part of the code in this repository is adapted from the following repositories: