microsoft/SoftTeacher

list index out of range

xiangtaowong opened this issue · 5 comments

Training with the full dataset runs normally for the first 4000 iterations. After iteration 4000, during the evaluation that follows the checkpoint, it raises an error.

2022-05-10 17:35:33,142 - mmdet.ssod - INFO - Iter [3900/72000]	lr: 1.000e-04, eta: 8:14:16, time: 0.430, data_time: 0.008, memory: 1425, ema_momentum: 0.9990, unsup_loss_rpn_cls: 0.0486, unsup_loss_rpn_bbox: 0.3902, unsup_loss_cls: 0.2229, unsup_acc: 86.5625, unsup_loss_bbox: 0.0210, loss: 0.6827
2022-05-10 17:35:54,585 - mmdet.ssod - INFO - Iter [3950/72000]	lr: 1.000e-04, eta: 8:13:48, time: 0.429, data_time: 0.008, memory: 1425, ema_momentum: 0.9990, unsup_loss_rpn_cls: 0.0489, unsup_loss_rpn_bbox: 0.3866, unsup_loss_cls: 0.2291, unsup_acc: 86.8594, unsup_loss_bbox: 0.0126, loss: 0.6772
2022-05-10 17:36:16,040 - mmdet.ssod - INFO - Saving checkpoint at 4000 iterations
[                                                  ] 0/130, elapsed: 0s, ETA:
2022-05-10 17:36:17,310 - mmdet.ssod - INFO - Exp name: soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py


[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 130/130, 13.4 task/s, elapsed: 10s, ETA:     0s
Traceback (most recent call last):
  File "tools/train.py", line 199, in <module>
    main()
  File "tools/train.py", line 194, in main
    meta=meta,
  File "/home/wangxiangtao/code/SoftTeacher-main/ssod/apis/train.py", line 206, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/wangxiangtao/anaconda3/envs/soft/lib/python3.6/site-packages/mmcv/runner/iter_based_runner.py", line 133, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/wangxiangtao/anaconda3/envs/soft/lib/python3.6/site-packages/mmcv/runner/iter_based_runner.py", line 66, in train
    self.call_hook('after_train_iter')
  File "/home/wangxiangtao/anaconda3/envs/soft/lib/python3.6/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/wangxiangtao/code/SoftTeacher-main/ssod/utils/hooks/submodules_evaluation.py", line 37, in after_train_iter
    self._do_evaluate(runner)
  File "/home/wangxiangtao/code/SoftTeacher-main/ssod/utils/hooks/submodules_evaluation.py", line 82, in _do_evaluate
    key_score = self.evaluate(runner, results, prefix=submodule)
  File "/home/wangxiangtao/code/SoftTeacher-main/ssod/utils/hooks/submodules_evaluation.py", line 110, in evaluate
    results, logger=runner.logger, **self.eval_kwargs
  File "/home/wangxiangtao/code/SoftTeacher-main/thirdparty/mmdetection/mmdet/datasets/coco.py", line 414, in evaluate
    result_files, tmp_dir = self.format_results(results, jsonfile_prefix)
  File "/home/wangxiangtao/code/SoftTeacher-main/thirdparty/mmdetection/mmdet/datasets/coco.py", line 359, in format_results
    result_files = self.results2json(results, jsonfile_prefix)
  File "/home/wangxiangtao/code/SoftTeacher-main/thirdparty/mmdetection/mmdet/datasets/coco.py", line 291, in results2json
    json_results = self._det2json(results)
  File "/home/wangxiangtao/code/SoftTeacher-main/thirdparty/mmdetection/mmdet/datasets/coco.py", line 228, in _det2json
    data['category_id'] = self.cat_ids[label]
IndexError: list index out of range

Here is my config, "soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py":

_base_="base.py"

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        sup=dict(
            ann_file="data/coco/annotations/instances_train2017.json",
            img_prefix="data/coco/train2017/",
            classes=['0','1'],
        ),
        unsup=dict(
            ann_file="data/coco/annotations/instances_unlabeled2017.json",
            img_prefix="data/coco/unlabeled2017/",
            classes=['0','1'],
        ),
    ),
    # wxt: add val dataset
    val=dict(
        ann_file="data/coco/annotations/instances_val2017.json",
        img_prefix="data/coco/val2017/",
        classes=['0','1'],
    ),
    sampler=dict(
        train=dict(
            sample_ratio=[1, 1],
        )
    ),
)

semi_wrapper = dict(
    train_cfg=dict(
        unsup_weight=2.0,
    )
)

lr_config = dict(step=[12000 * 4, 160000 * 4])
runner = dict(_delete_=True, type="IterBasedRunner", max_iters=18000 * 4)


I guess there is something wrong with the val dataset?

Is something wrong with the val dataset?

If I only use data for training and testing, what will happen?

Hello,
did you solve this problem?
I've run into the same one.

If you are using a COCO-format JSON to train on your custom dataset and hit this error, I have a solution that worked for me.
In your work_dir you can find the complete version of your training config. Change num_of_class (the `num_classes` field in mmdetection configs) to the number of classes in your custom dataset, then use this config file to start a new training run, and the problem should be solved.
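
To make that suggestion a bit more concrete, here is a minimal, mmdet-free sketch of what seems to be going on. The traceback fails at `self.cat_ids[label]`, which is consistent with the detection head emitting results for more classes than the dataset's `classes` list defines. Everything below is an assumption for illustration: that the base config still uses COCO's 80 classes, the helper `det2json_like`, and the category ids `[1, 2]`. The `roi_head.bbox_head.num_classes` path is where Faster R-CNN configs in mmdetection usually keep the class count, but check it against the dumped config in your own work_dir.

```python
# Minimal sketch of the class-count mismatch (all names here are illustrative).
NUM_CLASSES_IN_HEAD = 80   # assumed: base config still set to COCO's 80 classes
CLASSES = ['0', '1']       # from the config posted in this issue
cat_ids = [1, 2]           # hypothetical category ids, one per entry in CLASSES


def det2json_like(per_class_boxes, cat_ids):
    """Mimic the per-class loop of CocoDataset._det2json for one image."""
    records = []
    for label, boxes in enumerate(per_class_boxes):
        for box in boxes:
            # Raises IndexError once label >= len(cat_ids) and that class has a box.
            records.append({"category_id": cat_ids[label], "bbox": box})
    return records


# The head returns one list of boxes per class it was built with.
results = [[] for _ in range(NUM_CLASSES_IN_HEAD)]
results[0].append([0, 0, 10, 10, 0.9])
results[5].append([1, 1, 5, 5, 0.3])  # detection for a class the dataset lacks

try:
    det2json_like(results, cat_ids)
    mismatch_detected = False
except IndexError:
    mismatch_detected = True

# Sketch of the override: make the head's class count match the dataset.
model_override = dict(
    roi_head=dict(bbox_head=dict(num_classes=len(CLASSES))),
)

# Stand-in for a head rebuilt with 2 classes: it would emit only two per-class lists.
n = model_override["roi_head"]["bbox_head"]["num_classes"]
fixed_records = det2json_like(results[:n], cat_ids)
```

In an actual config the override would be a `model = dict(roi_head=dict(bbox_head=dict(num_classes=2)))` entry, or the equivalent edit in the dumped config. Training has to be restarted afterwards, since the classification layer changes shape.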