


Dataset & evaluation script for ADL 2022 homework 3


download link


git clone https://github.com/zqyuan-tw/ADL22-HW3.git
cd ADL22-HW3
pip install -e tw_rouge



The training script is a lightly modified version of the Hugging Face summarization example (run_summarization.py), so the input .jsonl files must first be converted to the .csv format it expects.

python jsonl_to_csv.py <-i json_file> <-o csv_file>
[CUDA_VISIBLE_DEVICES=0,1,...] python run_summarization.py <--model_name_or_path model> [<--do_train> <--train_file csv_file>] [<--do_eval> <--validation_file public_csv>] [--text_column text] [--summary_column summary] <--output_dir dir> [--warmup_ratio ratio] [--num_train_epochs epoch]
  • -i, --input_jsonl: Path to the jsonl file input.
  • -o, --output_csv: Path to the csv file output.
  • CUDA_VISIBLE_DEVICES=0,1,...: Comma-separated IDs of the GPUs to train on (optional).
  • --model_name_or_path: Path to pretrained model or model identifier from huggingface.co/models (default: None)
  • --do_train: Whether to run training. (default: False)
  • --train_file: The input training data file (a json or csv file). (default: None)
  • --do_eval: Whether to run eval on the dev set. (default: False)
  • --validation_file: The input validation data file (a json or csv file). (default: None)
  • --text_column: The name of the column in the datasets containing the full texts (for summarization). (default: None)
  • --summary_column: The name of the column in the datasets containing the summaries (for summarization). (default: None)
  • --output_dir: The output directory where the model predictions and checkpoints will be written. (default: None)
  • --warmup_ratio: Linear warmup over warmup_ratio fraction of total steps. (default: 0.0)
  • --num_train_epochs: Total number of training epochs to perform. (default: 3.0)
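
The conversion step can be sketched as follows. This is only an illustration of the idea, not the repository's jsonl_to_csv.py: the input field names maintext and title are assumptions made for the example, while the output column names text and summary match the --text_column and --summary_column values shown in the usage line.

```python
import csv
import json


def jsonl_to_csv(jsonl_path, csv_path, text_key="maintext", summary_key="title"):
    """Convert a .jsonl file into a two-column .csv with a header row.

    text_key and summary_key are hypothetical field names; adjust them to
    whatever the actual .jsonl records contain.
    Returns the number of data rows written.
    """
    rows = 0
    with open(jsonl_path, encoding="utf-8") as src, \
            open(csv_path, "w", encoding="utf-8", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=["text", "summary"])
        writer.writeheader()
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            record = json.loads(line)
            writer.writerow({
                "text": record[text_key],
                # Test files may have no reference summary, so fall back to "".
                "summary": record.get(summary_key, ""),
            })
            rows += 1
    return rows
```

Using the csv module rather than joining strings by hand keeps commas, quotes, and newlines inside an article from breaking the column layout.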


python predict.py <-a test_file> <-s result_file> <-m model> [--max_length length] [--greedy] [--beam num] [--top_k k] [--top_p p] [--temperature t]
  • -a, --article: Path to the article jsonl file.
  • -s, --summary: Path to the summary jsonl file.
  • -m, --model: Path to the model folder.
  • --max_length: Max output length. (default: 128)
  • --greedy: Use greedy search. (default: None)
  • --beam: Number of beams for beam search. (default: None)
  • --top_k: Value of k for top-k sampling. (default: None)
  • --top_p: Value of p for top-p sampling. (default: None)
  • --temperature: Sampling temperature. (default: None)
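
One plausible way to turn these decoding flags into keyword arguments for the Hugging Face generate method is sketched below. How predict.py actually wires them up is not shown here, so the helper name build_generation_kwargs and the precedence rule (greedy overrides everything else) are assumptions; the keys max_length, num_beams, do_sample, top_k, top_p, and temperature are standard generate arguments.

```python
def build_generation_kwargs(max_length=128, greedy=False, beam=None,
                            top_k=None, top_p=None, temperature=None):
    """Map command-line decoding options to generate() keyword arguments.

    Hypothetical helper: the real predict.py may combine the flags differently.
    """
    kwargs = {"max_length": max_length}

    if greedy:
        # Greedy search: a single beam and no sampling.
        kwargs.update(do_sample=False, num_beams=1)
        return kwargs

    if beam is not None:
        kwargs["num_beams"] = beam

    # Any sampling-related option switches sampling on.
    sampling = {"top_k": top_k, "top_p": top_p, "temperature": temperature}
    sampling = {name: value for name, value in sampling.items() if value is not None}
    if sampling:
        kwargs["do_sample"] = True
        kwargs.update(sampling)

    return kwargs
```

The resulting dictionary could then be passed as model.generate(**inputs, **build_generation_kwargs(beam=5)), where inputs is the tokenized batch.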


Use the Script

usage: eval.py [-h] [-r REFERENCE] [-s SUBMISSION]

optional arguments:
  -h, --help            show this help message and exit
  -r REFERENCE, --reference REFERENCE
  -s SUBMISSION, --submission SUBMISSION


python eval.py -r public.jsonl -s submission.jsonl
{
  "rouge-1": {
    "f": 0.21999419163162043,
    "p": 0.2446195813913345,
    "r": 0.2137398792982201
  },
  "rouge-2": {
    "f": 0.0847583291303246,
    "p": 0.09419044877345074,
    "r": 0.08287844474014894
  },
  "rouge-l": {
    "f": 0.21017939117006337,
    "p": 0.25157090570020846,
    "r": 0.19404349000921203
  }
}

Use Python Library

>>> from tw_rouge import get_rouge
>>> get_rouge('我是人', '我是一個人')
{'rouge-1': {'f': 0.7499999953125, 'p': 1.0, 'r': 0.6}, 'rouge-2': {'f': 0.33333332888888895, 'p': 0.5, 'r': 0.25}, 'rouge-l': {'f': 0.7499999953125, 'p': 1.0, 'r': 0.6}}
>>> get_rouge(['我是人'], [ '我是一個人'])
{'rouge-1': {'f': 0.7499999953125, 'p': 1.0, 'r': 0.6}, 'rouge-2': {'f': 0.33333332888888895, 'p': 0.5, 'r': 0.25}, 'rouge-l': {'f': 0.7499999953125, 'p': 1.0, 'r': 0.6}}
>>> get_rouge(['我是人'], ['我是一個人'], avg=False)
[{'rouge-1': {'f': 0.7499999953125, 'p': 1.0, 'r': 0.6}, 'rouge-2': {'f': 0.33333332888888895, 'p': 0.5, 'r': 0.25}, 'rouge-l': {'f': 0.7499999953125, 'p': 1.0, 'r': 0.6}}]
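
To see where the numbers in this example come from, the scores can be recomputed by hand. The sketch below is not the tw_rouge implementation: it assumes the first argument is the prediction, the second is the reference, and that each character ends up as one token for these two short strings (the library's own tokenizer may segment longer text differently). Under those assumptions it gives p = 1.0 and r = 0.6 for ROUGE-1 and ROUGE-L, p = 0.5 and r = 0.25 for ROUGE-2, and f values that agree with the output above up to rounding.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def prf(overlap, pred_total, ref_total):
    """Precision, recall, and F1 from an overlap count."""
    p = overlap / pred_total if pred_total else 0.0
    r = overlap / ref_total if ref_total else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"f": f, "p": p, "r": r}


def rouge_n(pred, ref, n):
    """ROUGE-N using clipped n-gram overlap."""
    pred_counts, ref_counts = ngram_counts(pred, n), ngram_counts(ref, n)
    overlap = sum((pred_counts & ref_counts).values())
    return prf(overlap, sum(pred_counts.values()), sum(ref_counts.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]


def rouge_l(pred, ref):
    """ROUGE-L based on the longest common subsequence."""
    return prf(lcs_length(pred, ref), len(pred), len(ref))


# Assumption: one token per character for this example.
pred, ref = list("我是人"), list("我是一個人")
print("rouge-1", rouge_n(pred, ref, 1))
print("rouge-2", rouge_n(pred, ref, 2))
print("rouge-l", rouge_l(pred, ref))
```

All three prediction characters appear in the five-character reference, hence 3/3 precision and 3/5 recall; only one of the two prediction bigrams (我是) is among the four reference bigrams, hence 1/2 and 1/4.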

