Issues
- [Question] `add_generation_prompt=True` on prompt (#2346, opened by Galaxy-Husky, 1 comment)
- RLOO checkpoint issue (#2342, opened by asparius, 0 comments)
- Several problems in RLOOTrainer (#2316, opened by serendipity800, 1 comment)
- Is chatglm3-6b supported by TRL? (#2299, opened by fjy01, 4 comments)
- RLOOTrainer bug when using DeepSpeed (#2329, opened by macheng6, 2 comments)
- Support for MiniCPM-V reinforcement learning with Direct Preference Optimization (DPO) (#2326, opened by DarioPTWR, 0 comments)
- `accelerate` package version problem (#2335, opened by littleshutong, 2 comments)
- DPO training DataLoader is not shuffled (#2337, opened by kaiwenw, 0 comments)
- Difference between SFTTrainer and Seq2SeqTrainer (#2339, opened by Hyfred, 7 comments)
- TRL DPO: AttributeError: 'generator' object has no attribute 'generate' (#2292, opened by MonolithFoundation, 3 comments)
- OOM when using unwrap_model_for_generation (#2250, opened by hlnchen, 3 comments)
- KTOTrainer memory leak (#2268, opened by Isaaclgz, 0 comments)
- Set gradient_checkpointing_kwargs in the YAML (#2334, opened by Galaxy-Husky, 1 comment)
- Wrong tensor index for roll and truncate in DPOTrainer's concatenated_forward() (#2330, opened by yanghh2000, 3 comments)
- AttributeError: 'TrainingArguments' object has no attribute 'model_init_kwargs' (#2291, opened by MonolithFoundation, 0 comments)
- OOM when fine-tuning Llama-3.2-90B on 8x A100 80GB (#2294, opened by maximilianmordig, 1 comment)
- Tries to iterate over modules of value_model even when it is NoneType (the default) (#2312, opened by ColinG03, 1 comment)
- Using a different `ref_model` from `model` leads to incorrect results (#2307, opened by DarshanDeshpande, 0 comments)
- Gather metrics on all GPUs (#2315, opened by ziyuwan, 2 comments)
- Save `training_args` in another file format (#2313, opened by not-lain, 1 comment)
- IndexError: pop from an empty deque (#2304, opened by c3ianwu, 1 comment)
- Code migration suggestions (#2296, opened by MonolithFoundation, 1 comment)
- Conflict between the latest version of transformers.Trainer and DPOTrainer.get_batch_samples (#2275, opened by lucasdegeorge, 1 comment)
- DataCollatorForCompletionOnlyLM not working (#2293, opened by SwayamInSync, 0 comments)
- examples/scripts/sft.py doesn't work (#2253, opened by Reductionreaction, 1 comment)
- How much data is needed to train an 11B model (Llama-3.2-11B-Vision-Instruct)? (#2269, opened by cutecharmingkid, 6 comments)
- Add model merging callback (#2241, opened by lewtun, 0 comments)
- Wrong objective/entropy in RLOOTrainer (#2281, opened by serendipity800, 3 comments)
- During XPO execution, a 'tokenizer' KeyError occurs in callbacks.py (#2264, opened by ArcherShirou, 1 comment)
- A problem with PPOTrainer (#2267, opened by serendipity800, 0 comments)
- Helper function for getting reward model and judge (#2271, opened by qgallouedec, 1 comment)
- Multi-GPU training (#2256, opened by innat, 4 comments)
- trl/examples/scripts/sft_vlm.py evaluation (#2254, opened by saxenarohit, 3 comments)
- `LogCompletionsCallback` can't find the tokenizer (#2260, opened by qgallouedec, 2 comments)
- Usage of `padding_free` (#2242, opened by zwhe99, 2 comments)
- TypeError: XPOTrainer.training_step() takes 3 positional arguments but 4 were given (#2247, opened by ArcherShirou)