Can't train in colab

Question

HiImBug opened this issue 3 years ago · 2 comments

Hi, thank you for sharing this project.
I'm trying to train in Colab but I have a problem. Can somebody help me, please?
I'm sure my folder structure is the same as the author's.
I have tested in Colab with the following versions:
CUDA: 11.2
PyTorch 1.6.0
Python 3.8.5

!CUDA_VISIBLE_DEVICES="0" python tools/trainval.py configs/trainval/tinaface/tinaface_r50_fpn_bn.py

vedadet - WARNING - EvalHook is not in modes ['train']
vedadet - INFO - Loading weights from torchvision://resnet50
vedadet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: backbone.fc.weight, backbone.fc.bias

missing keys in source state_dict: neck.0.lateral_convs.0.conv.weight, neck.0.lateral_convs.0.bn.weight, neck.0.lateral_convs.0.bn.bias, neck.0.lateral_convs.0.bn.running_mean, neck.0.lateral_convs.0.bn.running_var, neck.0.lateral_convs.1.conv.weight, neck.0.lateral_convs.1.bn.weight, neck.0.lateral_convs.1.bn.bias, neck.0.lateral_convs.1.bn.running_mean, neck.0.lateral_convs.1.bn.running_var, neck.0.lateral_convs.2.conv.weight, neck.0.lateral_convs.2.bn.weight, neck.0.lateral_convs.2.bn.bias, neck.0.lateral_convs.2.bn.running_mean, neck.0.lateral_convs.2.bn.running_var, neck.0.lateral_convs.3.conv.weight, neck.0.lateral_convs.3.bn.weight, neck.0.lateral_convs.3.bn.bias, neck.0.lateral_convs.3.bn.running_mean, neck.0.lateral_convs.3.bn.running_var, neck.0.fpn_convs.0.conv.weight, neck.0.fpn_convs.0.bn.weight, neck.0.fpn_convs.0.bn.bias, neck.0.fpn_convs.0.bn.running_mean, neck.0.fpn_convs.0.bn.running_var, neck.0.fpn_convs.1.conv.weight, neck.0.fpn_convs.1.bn.weight, neck.0.fpn_convs.1.bn.bias, neck.0.fpn_convs.1.bn.running_mean, neck.0.fpn_convs.1.bn.running_var, neck.0.fpn_convs.2.conv.weight, neck.0.fpn_convs.2.bn.weight, neck.0.fpn_convs.2.bn.bias, neck.0.fpn_convs.2.bn.running_mean, neck.0.fpn_convs.2.bn.running_var, neck.0.fpn_convs.3.conv.weight, neck.0.fpn_convs.3.bn.weight, neck.0.fpn_convs.3.bn.bias, neck.0.fpn_convs.3.bn.running_mean, neck.0.fpn_convs.3.bn.running_var, neck.0.fpn_convs.4.conv.weight, neck.0.fpn_convs.4.bn.weight, neck.0.fpn_convs.4.bn.bias, neck.0.fpn_convs.4.bn.running_mean, neck.0.fpn_convs.4.bn.running_var, neck.0.fpn_convs.5.conv.weight, neck.0.fpn_convs.5.bn.weight, neck.0.fpn_convs.5.bn.bias, neck.0.fpn_convs.5.bn.running_mean, neck.0.fpn_convs.5.bn.running_var, neck.1.level_convs.0.0.conv.weight, neck.1.level_convs.0.0.bn.weight, neck.1.level_convs.0.0.bn.bias, neck.1.level_convs.0.0.bn.running_mean, neck.1.level_convs.0.0.bn.running_var, neck.1.level_convs.0.1.conv.weight, neck.1.level_convs.0.1.bn.weight, 
neck.1.level_convs.0.1.bn.bias, neck.1.level_convs.0.1.bn.running_mean, neck.1.level_convs.0.1.bn.running_var, neck.1.level_convs.0.2.conv.weight, neck.1.level_convs.0.2.bn.weight, neck.1.level_convs.0.2.bn.bias, neck.1.level_convs.0.2.bn.running_mean, neck.1.level_convs.0.2.bn.running_var, neck.1.level_convs.0.3.conv.weight, neck.1.level_convs.0.3.bn.weight, neck.1.level_convs.0.3.bn.bias, neck.1.level_convs.0.3.bn.running_mean, neck.1.level_convs.0.3.bn.running_var, neck.1.level_convs.0.4.conv.weight, neck.1.level_convs.0.4.bn.weight, neck.1.level_convs.0.4.bn.bias, neck.1.level_convs.0.4.bn.running_mean, neck.1.level_convs.0.4.bn.running_var, bbox_head.cls_convs.0.conv.weight, bbox_head.cls_convs.0.bn.weight, bbox_head.cls_convs.0.bn.bias, bbox_head.cls_convs.0.bn.running_mean, bbox_head.cls_convs.0.bn.running_var, bbox_head.cls_convs.1.conv.weight, bbox_head.cls_convs.1.bn.weight, bbox_head.cls_convs.1.bn.bias, bbox_head.cls_convs.1.bn.running_mean, bbox_head.cls_convs.1.bn.running_var, bbox_head.cls_convs.2.conv.weight, bbox_head.cls_convs.2.bn.weight, bbox_head.cls_convs.2.bn.bias, bbox_head.cls_convs.2.bn.running_mean, bbox_head.cls_convs.2.bn.running_var, bbox_head.cls_convs.3.conv.weight, bbox_head.cls_convs.3.bn.weight, bbox_head.cls_convs.3.bn.bias, bbox_head.cls_convs.3.bn.running_mean, bbox_head.cls_convs.3.bn.running_var, bbox_head.reg_convs.0.conv.weight, bbox_head.reg_convs.0.bn.weight, bbox_head.reg_convs.0.bn.bias, bbox_head.reg_convs.0.bn.running_mean, bbox_head.reg_convs.0.bn.running_var, bbox_head.reg_convs.1.conv.weight, bbox_head.reg_convs.1.bn.weight, bbox_head.reg_convs.1.bn.bias, bbox_head.reg_convs.1.bn.running_mean, bbox_head.reg_convs.1.bn.running_var, bbox_head.reg_convs.2.conv.weight, bbox_head.reg_convs.2.bn.weight, bbox_head.reg_convs.2.bn.bias, bbox_head.reg_convs.2.bn.running_mean, bbox_head.reg_convs.2.bn.running_var, bbox_head.reg_convs.3.conv.weight, bbox_head.reg_convs.3.bn.weight, bbox_head.reg_convs.3.bn.bias, 
bbox_head.reg_convs.3.bn.running_mean, bbox_head.reg_convs.3.bn.running_var, bbox_head.retina_cls.weight, bbox_head.retina_cls.bias, bbox_head.retina_reg.weight, bbox_head.retina_reg.bias, bbox_head.retina_iou.weight, bbox_head.retina_iou.bias

Traceback (most recent call last):
  File "tools/trainval.py", line 65, in <module>
    main()
  File "tools/trainval.py", line 61, in main
    trainval(cfg, distributed, logger)
  File "/content/drive/My Drive/Colab Notebooks/vedadet/vedadet/assembler/trainval.py", line 86, in trainval
    looper.start(cfg.max_epochs)
  File "/content/drive/My Drive/Colab Notebooks/vedadet/vedacore/loopers/epoch_based_looper.py", line 29, in start
    self.epoch_loop(mode)
  File "/content/drive/My Drive/Colab Notebooks/vedadet/vedacore/loopers/epoch_based_looper.py", line 15, in epoch_loop
    for idx, data in enumerate(dataloader):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 359, in __iter__
    return self._get_iterator()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 305, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 944, in __init__
    self._reset(loader, first_iter=True)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 975, in _reset
    self._try_put_index()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1209, in _try_put_index
    index = self._next_index()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 512, in _next_index
    return next(self._sampler_iter) # may raise StopIteration
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/sampler.py", line 226, in __iter__
    for idx in self.sampler:
  File "/content/drive/My Drive/Colab Notebooks/vedadet/vedadet/datasets/samplers/group_sampler.py", line 39, in __iter__
    indices = np.concatenate(indices)
  File "<array_function internals>", line 6, in concatenate
ValueError: need at least one array to concatenate
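
For anyone hitting the same traceback: the last frame suggests the sampler built an empty list of index arrays before calling np.concatenate. Below is a minimal sketch of how that can happen, assuming the sampler groups sample indices by a per-sample integer flag. The function grouped_indices and its samples_per_gpu parameter are hypothetical stand-ins for illustration, not the actual vedadet code.

```python
import numpy as np


def grouped_indices(flag, samples_per_gpu=1):
    """Hypothetical, simplified stand-in for a group sampler's __iter__."""
    flag = np.asarray(flag, dtype=np.int64)
    # One count per group id; this is an empty array when there are no samples.
    group_sizes = np.bincount(flag)
    indices = []
    for group_id, size in enumerate(group_sizes):
        if size == 0:
            continue
        members = np.where(flag == group_id)[0]
        np.random.shuffle(members)
        # Pad each group so its length is a multiple of samples_per_gpu.
        target = int(np.ceil(size / samples_per_gpu)) * samples_per_gpu
        extra = np.random.choice(members, target - len(members))
        indices.append(np.concatenate([members, extra]))
    # With zero samples, indices is [] and this line raises
    # "ValueError: need at least one array to concatenate".
    return np.concatenate(indices)


if __name__ == "__main__":
    print(len(grouped_indices([0, 1, 0, 1, 1], samples_per_gpu=2)))
    try:
        grouped_indices([])
    except ValueError as err:
        print("empty dataset:", err)
```

If this assumption holds, the error points to a dataset that loaded zero samples rather than to a problem in the sampler itself, so checking the dataset length before training would be a reasonable first step.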

Answer 1 · 2021-10-05T14:44:01.000Z

For this test, I'm using just one folder of images and one of annotations.

Answer 2 · 2021-10-05T14:55:24.000Z

Debug output from the sampler:
self.group_sizes = []
('self.flag', array([], dtype=int64))
What is self.flag = self.dataset.flag at line 91 of /vedadet/vedadet/datasets/samplers/group_sampler.py?
I can't find any flag field in the config files tinaface_r50_fpn_bn.py or tinaface_r50_fpn_gn_dcn.py.
The error is raised in File "/content/drive/My Drive/Colab Notebooks/vedadet/vedadet/datasets/samplers/group_sampler.py", line 41, in __iter__
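
On the flag question: since the sampler reads it from self.dataset, flag is probably an attribute computed by the dataset object at load time, not a config field. In mmdetection-style codebases (which vedadet appears to follow, though I have not confirmed this against its source) it holds one entry per loaded image, marking wide versus tall images. The sketch below assumes that convention; aspect_ratio_flags and the width/height keys are hypothetical names used only for illustration.

```python
import numpy as np


def aspect_ratio_flags(img_infos):
    """Hypothetical helper: 1 for wide images (width / height > 1), else 0."""
    flags = np.zeros(len(img_infos), dtype=np.int64)
    for i, info in enumerate(img_infos):
        if info["width"] / info["height"] > 1:
            flags[i] = 1
    return flags


if __name__ == "__main__":
    loaded = [{"width": 1024, "height": 768}, {"width": 600, "height": 800}]
    print(aspect_ratio_flags(loaded))  # one flag per loaded image
    print(aspect_ratio_flags([]))      # no images loaded, so no flags
```

Under that assumption, an empty flag array like the one printed above would mean the dataset found no usable images or annotations, so the annotation paths and image folder in the config are worth double-checking.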