Training error after batch 616/5763 epoch 0/300
anisghaoui opened this issue · 1 comment
anisghaoui commented
Hi,
I am trying to train the model as described in the README, and for some reason it crashes:
Traceback (most recent call last):
  File "train.py", line 99, in <module>
    loss, outputs = model(imgs, targets)
  File "/home/anis/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anis/Complex-YOLOv3/models.py", line 266, in forward
    x, layer_loss = module[0](x, targets, img_dim)
  File "/home/anis/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anis/Complex-YOLOv3/models.py", line 190, in forward
    ignore_thres=self.ignore_thres,
  File "/home/anis/Complex-YOLOv3/utils/utils.py", line 375, in build_targets
    best_ious, best_n = ious.max(0)
RuntimeError: cannot perform reduction function max on tensor with no elements because the operation does not have an identity
After searching the web for this error, I found that it might be caused by a missing or bad input. Any idea why this would happen?
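Since the traceback ends at `ious.max(0)` inside `build_targets`, one plausible reading is that a batch reached the loss computation with no target rows at all. A minimal sketch of a guard for that case follows. The helper name `non_empty_batches` and the `(imgs, targets)` batch layout are assumptions made for illustration, not taken from the repository's `train.py`.

```python
def non_empty_batches(batches):
    """Yield only (imgs, targets) pairs that contain at least one target row.

    Hypothetical helper: the (imgs, targets) layout is an assumption.
    """
    for imgs, targets in batches:
        # len() works for lists and for tensors (size of the first dimension).
        if targets is None or len(targets) == 0:
            continue  # no targets to match against the anchors; skip this batch
        yield imgs, targets


# Toy stand-in for a dataloader: the second batch has no labelled objects.
toy_batches = [
    ("imgs_0", [[0, 1, 0.5, 0.5, 0.1, 0.2]]),
    ("imgs_1", []),
    ("imgs_2", [[0, 0, 0.3, 0.7, 0.2, 0.2]]),
]
kept = [imgs for imgs, _ in non_empty_batches(toy_batches)]
print(kept)  # ['imgs_0', 'imgs_2']
```

Skipping such batches only hides the symptom. If empty batches appear only with certain settings, the labels themselves are worth inspecting.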
anisghaoui commented
OK, I went through the code and found that the data augmentation performed by the dataset loader may be faulty for some reason.
In train.py:
# Get dataloader
dataset = KittiYOLODataset(
    cnf.root_dir,
    split='train',
    mode='TRAIN',
    folder='training',
    data_aug=False,  # problems occur if set to True
    multiscale=opt.multiscale_training
)
This suggests that the augmentation transforms the data but, somehow, may fail to apply the same transform to the bounding boxes or key points, or may simply distort them.
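One way to test this hypothesis would be to sanity-check the boxes that come out of the augmentation step. The sketch below is illustrative only: `find_bad_boxes` is a hypothetical helper, and the `(x_center, y_center, width, height)` layout normalised to [0, 1] is an assumption rather than the actual target format used by `KittiYOLODataset`.

```python
import math


def find_bad_boxes(boxes):
    """Return indices of boxes that are non-finite, off-image, or degenerate.

    Assumes each box is (x_center, y_center, width, height) in [0, 1];
    this layout is an assumption, not the repository's confirmed format.
    """
    bad = []
    for i, (x, y, w, h) in enumerate(boxes):
        if not all(math.isfinite(v) for v in (x, y, w, h)):
            bad.append(i)  # NaN or inf produced by the transform
        elif not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
            bad.append(i)  # centre moved outside the image
        elif w <= 0.0 or h <= 0.0:
            bad.append(i)  # box collapsed to zero or negative size
    return bad


# Made-up boxes standing in for the output of an augmentation step.
boxes_after_aug = [
    (0.5, 0.5, 0.1, 0.2),
    (1.3, 0.4, 0.1, 0.1),
    (0.2, 0.2, 0.0, 0.1),
    (float("nan"), 0.1, 0.1, 0.1),
]
print(find_bad_boxes(boxes_after_aug))  # [1, 2, 3]
```

If every box of a sample is flagged and then dropped, the sample ends up with no targets, which would be consistent with the empty-tensor error above.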
Anyway, the training is now working.
Edit: added script name
Edit 2: added reference links:
eriklindernoren/PyTorch-YOLOv3#110
feiyuhuahuo/Yolact_minimal#1