init prefetcher, this might take one minute or less...

Question

Opened this issue a year ago · 1 comment

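Hello author,

I followed the steps in https://zhuanlan.zhihu.com/p/430850089 up to the model training step, but once the training script starts it stays stuck at "init prefetcher, this might take one minute or less...". After going through the source code, I found that the program hangs in
yolox/data/data_prefetcher.py
at line 26:
self.next_input, self.next_target, _, _ = next(self.loader)
I am not sure whether my configuration parameters are the problem. Could you please give me some advice or help?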

Here is the output printed when the script runs:
(yolox_obb) zibai@eng2:~/opt/YOLOX_OBBG$ bash my_exps/train.sh MEF-G exps/example/yolox_obb/yolox_s_MFE-G.py 0 1 16 --fp16

activate env yolox_obb

Current dir is /home/zibai/opt/YOLOX_OBBG
exp is exps/example/yolox_obb/yolox_s_MFE-G.py
cuda_device is cuda: 0
num_device is 1
batch_size is 16
pth is
other args: --fp16
ready train ....
2023-07-19 09:06:31 | INFO | yolox.core.trainer:131 - args: Namespace(batch_size=16, cache=False, ckpt=None, devices=1, dist_backend='nccl', dist_url=None, exp_file='exps/example/yolox_obb/yolox_s_MFE-G.py', experiment_name='MEF-G', fp16=True, machine_rank=0, name=None, num_machines=1, occupy=False, options=None, resume=False, start_epoch=None)
2023-07-19 09:06:31 | INFO | yolox.core.trainer:132 - exp value:
╒═════════════════════╤═══════════════════════════════════════════════════════╕
│ keys │ values │
╞═════════════════════╪═══════════════════════════════════════════════════════╡
│ seed │ None │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ output_dir │ 'YOLOX_outputs' │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ print_interval │ 10 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ eval_interval │ 10 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ modules_config │ 'configs/modules/yoloxs_obb.yaml' │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ losses_config │ 'configs/losses/yolox_losses_obb.yaml' │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ dataset_config │ 'configs/datasets/MFE-G.yaml' │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ data_num_workers │ 4 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ input_size │ (1024, 1024) │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ multiscale_range │ 5 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ mosaic_prob │ 1.0 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ mixup_prob │ 0.0 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ hsv_prob │ 1.0 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ flip_prob │ 0.5 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ degrees │ 10.0 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ translate │ 0.1 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ mosaic_scale │ (0.4, 1.2) │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ mixup_scale │ (0.4, 1.2) │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ shear │ 2.0 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ enable_mixup │ True │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ warmup_epochs │ 1 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ max_epoch │ 500 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ warmup_lr │ 0 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ basic_lr_per_img │ 0.00015625 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ scheduler │ 'yoloxwarmcos' │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ no_aug_epochs │ 20 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ min_lr_ratio │ 0.05 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ ema │ True │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ no_eval │ False │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ weight_decay │ 0.0005 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ momentum │ 0.9 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ exp_name │ 'MEF-G' │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ test_size │ (1024, 1024) │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ postprocess_cfg │ {'conf_thre': 0.05, 'nms_thre': 0.1} │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ copy_paste_prob │ 1.0 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ enable_debug │ False │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ enable_resample │ True │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ aug_ignore │ None │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ empty_ignore │ True │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ long_wh_thre │ 6 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ short_wh_thre │ 3 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ overlaps_thre │ 0.6 │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ evaluate_cfg │ {'is_merge': False, 'is_submiss': False, 'nproc': 10} │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ export_input_names │ ['input'] │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ export_output_names │ ['boxes', 'scores', 'class'] │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ include_post │ True │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ data_dir │ 'datasets/MFE-G/Bbox' │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ train_ann │ 'train' │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ val_ann │ 'val' │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ test_ann │ 'test' │
├─────────────────────┼───────────────────────────────────────────────────────┤
│ num_classes │ 8 │
╘═════════════════════╧═══════════════════════════════════════════════════════╛
2023-07-19 09:06:31 | INFO | yolox.models.parse_model:18 - overriding modules.yaml num_classes=80 with num_classes=8
/home/zibai/opt/anaconda3/envs/yolox_obb/lib/python3.7/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1639180594101/work/aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
2023-07-19 09:06:31 | INFO | yolox.core.trainer:138 - Model Summary: Params: 8.05M, Gflops: 55.81
2023-07-19 09:06:33 | INFO | yolox.core.trainer:156 - init prefetcher, this might take one minute or less...
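
A hang like the one at `next(self.loader)` can be narrowed down with Python's standard `faulthandler` module, which periodically dumps the stack of every thread while the process is stuck. The sketch below is only a diagnostic aid, not part of the project: `watch_for_hang` and `stop_watching` are hypothetical helper names, and the `time.sleep` call stands in for the blocking call. It also assumes that `data_num_workers` is passed to the PyTorch DataLoader as `num_workers`; if so, temporarily setting it to 0 makes the dataset code run in the main process, so the dumped stack shows the dataset line that is looping or blocking.

```python
import faulthandler
import sys
import time


def watch_for_hang(timeout_s=60.0, file=None):
    """Dump the stack of every thread each time timeout_s elapses.

    Hypothetical helper: call it just before the code that may hang.
    """
    target = file if file is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target)


def stop_watching():
    """Cancel the pending dumps once the suspicious call has returned."""
    faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    watch_for_hang(timeout_s=1.0)
    time.sleep(0.1)  # stand-in for the call that may hang, e.g. next(loader)
    stop_watching()
    print("finished without hanging")
```

In a real run, the call to `watch_for_hang` would go before the prefetcher is created; if the process stalls, the tracebacks printed to stderr point at the frame that never returns.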

Answer 1 · 2023-07-19T10:13:16.000Z

Change empty_ignore from True to False:
│ empty_ignore │ True │ -> empty_ignore False
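
The suggested change amounts to overriding one experiment value. A minimal sketch of what that might look like follows, assuming the experiment file (here `exps/example/yolox_obb/yolox_s_MFE-G.py`) sets its values as attributes in an `Exp` class, as YOLOX-style experiment files usually do. `BaseExp` is a hypothetical stand-in for the project's real base class, whose import path does not appear in this thread, and its defaults are copied from the "exp value" table in the question.

```python
class BaseExp:
    """Hypothetical stand-in for the project's experiment base class."""

    def __init__(self):
        # Values taken from the "exp value" table printed in the question.
        self.empty_ignore = True
        self.data_num_workers = 4


class Exp(BaseExp):
    """Experiment that overrides only the value named in the answer."""

    def __init__(self):
        super().__init__()
        self.empty_ignore = False  # was True in the hanging run


if __name__ == "__main__":
    exp = Exp()
    print("empty_ignore =", exp.empty_ignore)
```

After the change, the "exp value" table printed at startup should show `empty_ignore` as False, which is a quick way to confirm that the override was picked up.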
