huggingface/trl

`LogCompletionsCallback` can't find the tokenizer

qgallouedec opened this issue · 0 comments

System Info

  • Platform: Linux-5.15.0-1048-aws-x86_64-with-glibc2.31
  • Python version: 3.11.9
  • PyTorch version: 2.4.1
  • CUDA device(s): NVIDIA H100 80GB HBM3
  • Transformers version: 4.46.0.dev0
  • Accelerate version: 1.0.0
  • Accelerate config: not found
  • Datasets version: 3.0.1
  • HF Hub version: 0.24.7
  • TRL version: 0.12.0.dev0+96c814e
  • bitsandbytes version: 0.41.1
  • DeepSpeed version: 0.15.2
  • Diffusers version: 0.30.3
  • Liger-Kernel version: 0.3.0
  • LLM-Blender version: 0.0.2
  • OpenAI version: 1.46.0
  • PEFT version: 0.13.2

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

accelerate launch examples/scripts/dpo_online.py \
    --model_name_or_path trl-lib/pythia-1b-deduped-tldr-sft \
    --reward_model_path trl-lib/pythia-1b-deduped-tldr-rm \
    --dataset_name trl-lib/tldr \
    --learning_rate 5.0e-7 \
    --output_dir pythia-1b-tldr-online-dpo-reward \
    --warmup_ratio 0.1
The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_processes` was set to a value of `1`
        `--num_machines` was set to a value of `1`
        `--mixed_precision` was set to a value of `'no'`
        `--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
[2024-10-21 15:00:47,661] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
config.json: 100%|████████████████████████████████████████████████████████████████████████████████| 818/818 [00:00<00:00, 8.98MB/s]
pytorch_model.bin: 100%|██████████████████████████████████████████████████████████████████████▉| 3.64G/3.64G [00:07<00:00, 469MB/s]
wandb: WARNING The `run_name` is currently set to the same value as `TrainingArguments.output_dir`. If this was not intended, please specify a different run name by setting the `TrainingArguments.run_name` parameter.
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
wandb: Currently logged in as: qgallouedec (huggingface). Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.18.0
wandb: Run data is saved locally in /fsx/qgallouedec/trl/wandb/run-20241021_150106-rq5w2xvm
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run pythia-1b-tldr-online-dpo-reward
wandb: ⭐️ View project at https://wandb.ai/huggingface/huggingface
wandb: 🚀 View run at https://wandb.ai/huggingface/huggingface/runs/rq5w2xvm
  0%|                                             | 0/43773 [00:00<?, ?it/s]
Could not estimate the number of tokens of the input, floating-point operations will not be computed
model.safetensors: 100%|██████████████████▉| 3.64G/3.64G [00:12<00:00, 302MB/s]
  1%|▌                                           | 500/43773 [23:23<33:36:16,  2.80s/it]
Traceback (most recent call last):
  File "/fsx/qgallouedec/trl/examples/scripts/dpo_online.py", line 128, in <module>
    trainer.train()
    ^^^^^^^^^^^^^^^
  File "/fsx/qgallouedec/transformers/src/transformers/trainer.py", line 2112, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/fsx/qgallouedec/transformers/src/transformers/trainer.py", line 2533, in _inner_training_loop
    self.control = self.callback_handler.on_step_end(args, self.state, self.control)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/qgallouedec/transformers/src/transformers/trainer_callback.py", line 496, in on_step_end
    return self.call_event("on_step_end", args, state, control)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/qgallouedec/transformers/src/transformers/trainer_callback.py", line 518, in call_event
    result = getattr(callback, event)(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/qgallouedec/trl/trl/trainer/callbacks.py", line 404, in on_step_end
    tokenizer = kwargs["tokenizer"]
                ~~~~~~^^^^^^^^^^^^^
KeyError: 'tokenizer'

Expected behavior

Training should not fail: `LogCompletionsCallback.on_step_end` should be able to retrieve the tokenizer from the callback kwargs.
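
For context, a minimal workaround sketch. The assumption here (not confirmed in the logs above) is that recent transformers dev builds pass the tokenizer to callbacks under the key `processing_class` rather than `tokenizer`, so the callback could look up either key instead of indexing `kwargs["tokenizer"]` directly:

```python
# Version-tolerant lookup sketch for LogCompletionsCallback.on_step_end.
# Assumption: newer transformers (>= 4.46 dev) name the callback kwarg
# "processing_class", while older releases used "tokenizer".

def resolve_tokenizer(kwargs):
    """Return the tokenizer from callback kwargs under either key."""
    tokenizer = kwargs.get("tokenizer") or kwargs.get("processing_class")
    if tokenizer is None:
        raise KeyError(
            "expected 'tokenizer' or 'processing_class' in callback kwargs"
        )
    return tokenizer
```

Inside the callback, `tokenizer = kwargs["tokenizer"]` would then become `tokenizer = resolve_tokenizer(kwargs)`, which keeps the callback working across both kwarg names.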