ValueError: You should supply an encoding or a list of encodings to this method that includes input_ids, but you provided []
gtanya89 opened this issue · 4 comments
gtanya89 commented
System Info
Using Trainer with PyTorch DDP on a single node with multiple GPUs. torch.distributed.init_process_group() is set up correctly. It seems that Trainer's _get_train_sampler() does not use DistributedSampler but rather RandomSampler? Or is there another issue I am missing? Any input appreciated! Thanks!
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
The error originates in the data collator. The same code works on a single GPU.
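For context, the ValueError quoted in the title is raised by the tokenizer's pad() input validation when the data collator hands it an empty batch (`[]`), which is what happens when a process ends up with no samples. The following is a simplified pure-Python sketch of that validation, not the actual transformers implementation:

```python
def pad(encoded_inputs):
    # Simplified stand-in for the input check inside tokenizer.pad():
    # a batch must be a non-empty list of dicts containing "input_ids".
    if isinstance(encoded_inputs, (list, tuple)):
        if not encoded_inputs or "input_ids" not in encoded_inputs[0]:
            raise ValueError(
                "You should supply an encoding or a list of encodings to this "
                f"method that includes input_ids, but you provided {encoded_inputs}"
            )
    # The real method would pad the sequences here; this sketch just
    # returns the validated batch unchanged.
    return encoded_inputs
```

So an empty per-process batch, not a malformed dataset, is enough to trigger exactly this message under DDP.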
Expected behavior
I expect to train in a distributed fashion on multiple GPUs using the Trainer API.
amyeroberts commented
yuyemin commented
I'm also curious why DistributedSampler was removed from _get_train_sampler(), as I remember older versions had it implemented for the multi-GPU training case.
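For reference, the point of DistributedSampler is that each rank draws a disjoint, equally sized slice of the dataset indices. A minimal pure-Python sketch of that sharding logic (assuming no shuffling and drop_last=False; the real torch.utils.data.DistributedSampler also handles shuffling with a per-epoch seed):

```python
import math

def distributed_indices(dataset_len, num_replicas, rank):
    """Sketch of how DistributedSampler assigns indices to one rank."""
    num_samples = math.ceil(dataset_len / num_replicas)
    total_size = num_samples * num_replicas
    indices = list(range(dataset_len))
    # Pad by repeating leading indices so every rank gets the same count.
    indices += indices[: total_size - dataset_len]
    # Each rank takes a strided slice: rank, rank + num_replicas, ...
    return indices[rank:total_size:num_replicas]
```

With 10 samples and 4 ranks, each rank receives 3 indices and together they cover the whole dataset. If the Trainer instead returns a plain RandomSampler, this per-rank sharding has to happen somewhere else in the stack (or every GPU sees the full dataset).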