qdrant/quaterion

Training error using multiple GPUs


Problem

Training fails when the pl.Trainer is configured with more than one GPU.
Running the example available at https://github.com/qdrant/quaterion/blob/master/examples/train_cifar100.py with the devices parameter of pl.Trainer set to a value greater than 1 raises the following error:

File "/home/stefano/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/quaterion/dataset/similarity_data_loader.py", line 271, in <listcomp>
    labels = {"groups": torch.LongTensor([record.group for record in batch])}
AttributeError: 'tuple' object has no attribute 'group'

The same error occurs with PairSimilarityDataLoader and with different training strategies (dp or ddp).
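
The traceback shows the list comprehension reading `record.group` from every item in `batch`, so the exception implies that at least one item reaching that line was a plain tuple rather than a sample object. The sketch below reproduces only that failure mode in isolation. It does not use quaterion, torch, or pytorch-lightning; `FakeGroupSample` and `collect_groups` are hypothetical stand-ins invented for illustration, and why tuples would arrive there in the multi-GPU case is an open question that this sketch does not answer.

```python
from dataclasses import dataclass


@dataclass
class FakeGroupSample:
    """Hypothetical stand-in for a sample object that exposes a `group` attribute."""
    obj: str
    group: int


def collect_groups(batch):
    # Same attribute access as the list comprehension in the traceback,
    # without the torch.LongTensor wrapper.
    return [record.group for record in batch]


samples = [FakeGroupSample("img0", 3), FakeGroupSample("img1", 7)]

# Items with a `group` attribute work as expected.
print(collect_groups(samples))

# Assumption for illustration: each item arrives wrapped in a tuple.
wrapped = [(index, sample) for index, sample in enumerate(samples)]
try:
    collect_groups(wrapped)
except AttributeError as err:
    # Same exception type and message as in the report.
    print(f"AttributeError: {err}")
```

Printing `type(batch[0])` just before the failing line in each worker process would show whether the items are already tuples when they reach that code.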