Training problem: after a few iterations, training fails with NaN values, which leads to an AssertionError
C:\Users\bxf\anaconda3\envs\transt\python.exe C:/PyCharmProjects/TransT-main/ltr/run_training.py
Training: transt transt
WARNING: You are using tensorboardX instead sis you have a too old pytorch version.
loading annotations into memory...
Done (t=13.20s)
creating index...
index created!
number of params: 23016006
No matching checkpoint file found
[train: 1, 1 / 1000] FPS: 0.0 (0.0) , Loss/total: 12.99988 , Loss/ce: 0.69430 , Loss/bbox: 0.97997 , Loss/giou: 1.15687 , iou: 0.03106
[train: 1, 2 / 1000] FPS: 0.0 (5.1) , Loss/total: 13.18990 , Loss/ce: 0.67882 , Loss/bbox: 1.01086 , Loss/giou: 1.23913 , iou: 0.01553
[train: 1, 3 / 1000] FPS: 0.0 (5.1) , Loss/total: 13.00681 , Loss/ce: 0.69773 , Loss/bbox: 0.93112 , Loss/giou: 1.26818 , iou: 0.01083
[train: 1, 4 / 1000] FPS: 0.0 (5.3) , Loss/total: 12.93164 , Loss/ce: 0.69913 , Loss/bbox: 0.91258 , Loss/giou: 1.27109 , iou: 0.01094
[train: 1, 5 / 1000] FPS: 0.0 (4.9) , Loss/total: 12.94410 , Loss/ce: 0.69589 , Loss/bbox: 0.91288 , Loss/giou: 1.29008 , iou: 0.00936
[train: 1, 6 / 1000] FPS: 0.0 (5.1) , Loss/total: 12.90344 , Loss/ce: 0.69371 , Loss/bbox: 0.90170 , Loss/giou: 1.30679 , iou: 0.00780
Training crashed at epoch 1
Traceback for the error!
Traceback (most recent call last):
File "C:\PyCharmProjects\TransT-main\ltr\trainers\base_trainer.py", line 70, in train
self.train_epoch() # calls the train_epoch method defined in ltr/trainers/ltr_trainer.py
File "C:\PyCharmProjects\TransT-main\ltr\trainers\ltr_trainer.py", line 79, in train_epoch
self.cycle_dataset(loader) # calls the trainer's own cycle_dataset method
File "C:\PyCharmProjects\TransT-main\ltr\trainers\ltr_trainer.py", line 60, in cycle_dataset
loss, stats = self.actor(data) # jumps into ltr/actors/tracking.py
File "C:\PyCharmProjects\TransT-main\ltr\actors\tracking.py", line 44, in __call__
loss_dict = self.objective(outputs, targets) # jumps to the forward method at line 182 of ltr/models/tracking/transt.py, which computes the loss
File "C:\Users\bxf\anaconda3\envs\transt\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "C:\PyCharmProjects\TransT-main\ltr\models\tracking\transt.py", line 204, in forward
losses.update(self.get_loss(loss, outputs, targets, indices, num_boxes_pos))
File "C:\PyCharmProjects\TransT-main\ltr\models\tracking\transt.py", line 180, in get_loss
return loss_map[loss](outputs, targets, indices, num_boxes)
File "C:\PyCharmProjects\TransT-main\ltr\models\tracking\transt.py", line 153, in loss_boxes
box_ops.box_cxcywh_to_xyxy(target_boxes))
File "C:\PyCharmProjects\TransT-main\util\box_ops.py", line 52, in generalized_box_iou
assert (boxes1[:, 2:] >= boxes1[:, :2]).all()
AssertionError
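The failing assertion requires x2 >= x1 and y2 >= y1 for every box in `boxes1` (presumably the predicted boxes, since `target_boxes` is passed as the second argument). Any comparison involving NaN evaluates to false, so a single NaN coordinate is enough to trip it, as would a negative width or height. Below is a minimal pure-Python sketch of that behaviour. It is not TransT's code: `boxes_are_valid` and `find_bad_boxes` are hypothetical helper names, and `box_cxcywh_to_xyxy` is reimplemented here for a single box, assuming the (cx, cy, w, h) layout its name suggests.

```python
import math

def box_cxcywh_to_xyxy(box):
    """Convert one (cx, cy, w, h) box to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)

def boxes_are_valid(boxes):
    """Same condition as the failing assert: x2 >= x1 and y2 >= y1 for all boxes."""
    return all(x2 >= x1 and y2 >= y1 for x1, y1, x2, y2 in boxes)

def find_bad_boxes(boxes):
    """Return indices of boxes that are non-finite (NaN/inf) or have negative extent."""
    bad = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        finite = all(math.isfinite(v) for v in (x1, y1, x2, y2))
        if not finite or x2 < x1 or y2 < y1:
            bad.append(i)
    return bad

good = box_cxcywh_to_xyxy((0.5, 0.5, 0.2, 0.2))
nan_box = box_cxcywh_to_xyxy((float("nan"), 0.5, 0.2, 0.2))

print(boxes_are_valid([good]))           # True
print(boxes_are_valid([good, nan_box]))  # False: NaN >= NaN is False
print(find_bad_boxes([good, nan_box]))   # [1]
```

Running an equivalent check on the model outputs just before the GIoU call should show whether the boxes already contain NaN at that point, or whether a degenerate box (negative width or height) is responsible instead.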
Has this been solved? Also, if I want to train the model myself, where should the datasets be placed and what format should they be in?