Hannibal046/SDDS

Reproduction results are inconsistent.

Closed this issue · 8 comments

Hi, I am interested in your paper, but I ran into a problem when trying to reproduce your experiment. My replicated results are inconsistent with the results reported in the paper: they are lower by about six ROUGE points. I am confused and would like to know whether there are any details I may have missed.
results

Hi @Hz7even,
Sorry for the late reply, I have been at a conference these past few days. The results you report are not expected. Could you please also provide the log file and the hardware you used? Thanks so much!

logfile.log
package.txt
all_results.json
CUDA
Thanks for your reply! I have attached the log file, the list of packages I used, the training results, and information about my CUDA setup.

Hi, could you please attach the training loss and the validation ROUGE scores from the log file? We ran the model for only 0.5 epoch and already got a ROUGE-1 of around 50.

{
  "best_metric": 50.6692,
  "best_model_checkpoint": "../results/bartlarge/samsum/running/checkpoint-1000",
  "epoch": 0.543072432285656,
  "global_step": 1000,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
    {
      "epoch": 0.0,
      "learning_rate": 0.0,
      "loss": 14.5177,
      "step": 1
    },
    {
      "epoch": 0.05,
      "learning_rate": 1.2666666666666667e-05,
      "loss": 8.6063,
      "step": 100
    },
    {
      "epoch": 0.11,
      "learning_rate": 1.996723102129984e-05,
      "loss": 3.961,
      "step": 200
    },
    {
      "epoch": 0.16,
      "learning_rate": 1.9894411068632807e-05,
      "loss": 3.5681,
      "step": 300
    },
    {
      "epoch": 0.22,
      "learning_rate": 1.982159111596578e-05,
      "loss": 3.4018,
      "step": 400
    },
    {
      "epoch": 0.27,
      "learning_rate": 1.9748771163298747e-05,
      "loss": 3.3263,
      "step": 500
    },
    {
      "epoch": 0.27,
      "eval_gen_len": 14.3472,
      "eval_loss": 3.1043131351470947,
      "eval_rouge-1": 43.4403,
      "eval_rouge-2": 22.8002,
      "eval_rouge-l": 46.4587,
      "eval_runtime": 84.6682,
      "eval_samples_per_second": 9.661,
      "eval_steps_per_second": 0.815,
      "step": 500
    },
    {
      "epoch": 0.33,
      "learning_rate": 1.9675951210631715e-05,
      "loss": 3.2717,
      "step": 600
    },
    {
      "epoch": 0.38,
      "learning_rate": 1.9603131257964684e-05,
      "loss": 3.1921,
      "step": 700
    },
    {
      "epoch": 0.43,
      "learning_rate": 1.9530311305297652e-05,
      "loss": 3.1275,
      "step": 800
    },
    {
      "epoch": 0.49,
      "learning_rate": 1.945749135263062e-05,
      "loss": 3.1457,
      "step": 900
    },
    {
      "epoch": 0.54,
      "learning_rate": 1.9384671399963593e-05,
      "loss": 3.0904,
      "step": 1000
    },
    {
      "epoch": 0.54,
      "eval_gen_len": 22.7787,
      "eval_loss": 2.95542311668396,
      "eval_rouge-1": 50.6692,
      "eval_rouge-2": 26.9816,
      "eval_rouge-l": 52.4963,
      "eval_runtime": 128.4048,
      "eval_samples_per_second": 6.37,
      "eval_steps_per_second": 0.537,
      "step": 1000
    }
  ],
  "max_steps": 27615,
  "num_train_epochs": 15,
  "total_flos": 1.2293455883523072e+16,
  "trial_name": null,
  "trial_params": null
}
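In case it is useful, the eval entries above can be pulled out of trainer_state.json with a short script; this is just a sketch, and the path is the checkpoint folder from the state shown above:

import json

# Load the trainer state written by the Hugging Face Trainer
with open("../results/bartlarge/samsum/running/checkpoint-1000/trainer_state.json") as f:
    state = json.load(f)

# Print only the evaluation entries from the log history
for entry in state["log_history"]:
    if "eval_rouge-1" in entry:
        print(entry["step"], entry["eval_rouge-1"], entry["eval_rouge-2"], entry["eval_rouge-l"])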

Sorry to bother you, but where is this log file? Or do I need to write my own code to output it?
I ask because I couldn't find the corresponding file in the result folder.

Our trainer is built on the Hugging Face Trainer, so the log file is created automatically by the Trainer in the result folder.
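If it is hard to locate, a minimal sketch like this will list every trainer_state.json under the result folder (the folder name here is just an example based on the paths above):

from pathlib import Path

# Recursively search the result folder for trainer_state.json files
for path in Path("../results").rglob("trainer_state.json"):
    print(path)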

Sorry for my carelessness, I finally found it in the result folder. I hope it is the correct file, and thank you sincerely for your time!
trainer_state.json

How many GPUs did you use for training? And could you please also confirm that the base model you used is bart-large?

One GPU. And your words made me realize that the base model I used was bart-base.
I reran it and now the results are consistent with those reported in the paper.
Thanks again for your time and patience!
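For anyone who runs into the same mix-up, a quick way to check which BART variant you are actually loading is to compare the configs (a sketch assuming the transformers library and the standard Hugging Face checkpoints; bart-base uses d_model 768 with 6 encoder layers, bart-large uses 1024 with 12):

from transformers import AutoConfig

# bart-base: 768-dim, 6 encoder / 6 decoder layers; bart-large: 1024-dim, 12 / 12
for name in ["facebook/bart-base", "facebook/bart-large"]:
    config = AutoConfig.from_pretrained(name)
    print(name, config.d_model, config.encoder_layers, config.decoder_layers)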