In get_bev_features, the variable bev_embed is assigned a string type, while ret_dict is actually a tensor. What should I do?

Question

In the get_bev_features function, the variable bev_embed is assigned a string type, while ret_dict is actually a tensor. What should I do?

Opened this issue 4 months ago · 1 comments

yuanryann commented 4 months ago

2024-06-14 09:46:00,707 - mmdet - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) CosineAnnealingLrUpdaterHook
(ABOVE_NORMAL) Fp16OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

before_train_epoch:
(VERY_HIGH ) CosineAnnealingLrUpdaterHook
(NORMAL ) DistSamplerSeedHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

before_train_iter:
(VERY_HIGH ) CosineAnnealingLrUpdaterHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook

after_train_iter:
(ABOVE_NORMAL) Fp16OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

after_train_epoch:
(NORMAL ) CheckpointHook
(NORMAL ) CustomDistEvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

before_val_epoch:
(NORMAL ) DistSamplerSeedHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

before_val_iter:
(LOW ) IterTimerHook

after_val_iter:
(LOW ) IterTimerHook

after_val_epoch:
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

after_run:
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook

2024-06-14 09:46:00,708 - mmdet - INFO - workflow: [('train', 1)], max: 24 epochs
2024-06-14 09:46:00,709 - mmdet - INFO - Checkpoints will be saved to /home/com0179/AI/MapTR/work_dirs/maptr_tiny_r50_24e by HardDiskBackend.
/home/com0179/AI/MapTR/projects/mmdet3d_plugin/models/utils/grid_mask.py:114: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:180.)
mask = torch.from_numpy(mask).to(x.dtype).cuda()
ret_dict: tensor([[[ 0.1889, -0.1615, 2.1812, ..., -1.0996, -1.4039, 0.6895],
[ 1.4819, -0.6015, 0.8437, ..., -0.6222, -0.7078, 0.7483],
[ 0.4329, 0.2124, 1.4304, ..., -1.9791, -0.9732, 0.6492],
...,
[-0.3204, -0.4688, 0.5317, ..., -1.9080, -0.5561, 0.6536],
[ 0.4254, -0.1113, 1.2542, ..., -1.9874, -0.6516, 1.0486],
[-0.2349, 0.8355, 0.9105, ..., -1.3129, 0.1006, 1.3759]],

    [[-0.2733,  0.0749,  0.9204,  ...,  0.9150, -0.3261,  0.0139],
     [ 1.3868, -0.3957,  0.8588,  ..., -1.4051, -0.0948,  0.3878],
     [ 0.8097,  0.7675,  0.6791,  ..., -0.4050, -0.3664, -0.3884],
     ...,
     [-1.0428, -0.7296,  0.3283,  ..., -2.0839, -0.6283,  1.3728],
     [-0.5850, -0.4228,  0.1651,  ..., -1.4061, -0.2002,  0.2984],
     [-0.8431,  1.0897,  0.4802,  ..., -1.9049, -0.2679,  1.8028]],

    [[ 0.7818, -0.6220,  1.4299,  ..., -1.4584, -2.0435,  0.2221],
     [ 1.0930, -0.2832,  0.5768,  ..., -0.3528, -0.5643,  0.1527],
     [ 0.7040, -0.0652,  1.5784,  ..., -1.1005, -0.4832, -0.1628],
     ...,
     [-0.7733, -1.2431,  0.6865,  ..., -2.4375, -0.8437,  1.2103],
     [-0.0844, -0.8666,  1.0173,  ..., -1.3839, -0.5428,  0.8602],
     [-0.2918,  0.1805,  0.2343,  ..., -0.1657, -0.3963,  1.7632]],

    [[ 0.8106,  0.2636,  1.1491,  ..., -0.6950, -0.6393,  0.6001],
     [ 1.6005, -0.2310,  1.1513,  ..., -0.4952, -0.2108,  0.5619],
     [ 0.4873,  0.1370,  0.7079,  ..., -0.9651, -0.5468,  0.6746],
     ...,
     [-0.8568, -1.1599,  0.2693,  ..., -2.6332, -1.6124,  1.2802],
     [ 0.1471,  0.2384,  0.8299,  ..., -1.7544, -0.6352,  1.3663],
     [ 0.3371,  1.3895,  0.4540,  ..., -1.4025, -0.7343,  1.7416]]],
   device='cuda:0', grad_fn=<NativeLayerNormBackward>)

Traceback (most recent call last):
File "./tools/train.py", line 259, in <module>
main()
File "./tools/train.py", line 248, in main
custom_train_model(
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/bevformer/apis/train.py", line 27, in custom_train_model
custom_train_detector(
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/bevformer/apis/mmdet_train.py", line 199, in custom_train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True, **kwargs)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
outputs = self.model.train_step(data_batch, self.optimizer,
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/parallel/distributed.py", line 52, in train_step
output = self.module.train_step(*inputs[0], **kwargs[0])
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 237, in train_step
losses = self(**data)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/detectors/maptr.py", line 162, in forward
return self.forward_train(**kwargs)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
output = old_func(*new_args, **new_kwargs)
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/detectors/maptr.py", line 277, in forward_train
losses_pts = self.forward_pts_train(img_feats, lidar_feat, gt_bboxes_3d,
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/detectors/maptr.py", line 141, in forward_pts_train
outs = self.pts_bbox_head(
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
output = old_func(*new_args, **new_kwargs)
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/dense_heads/maptr_head.py", line 254, in forward
outputs = self.transformer(
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/modules/transformer.py", line 372, in forward
ouput_dic = self.get_bev_features(
File "/home/com0179/AI/MapTR/projects/mmdet3d_plugin/maptr/modules/transformer.py", line 267, in get_bev_features
if 'bev' in ret_dict:
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/_tensor.py", line 670, in __contains__
raise RuntimeError(
RuntimeError: Tensor.__contains__ only supports Tensor or scalar, but you passed in a <class 'str'>.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1712135) of binary: /home/com0179/anaconda3/envs/MapTR/bin/python3
Traceback (most recent call last):
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/run.py", line 689, in run
elastic_launch(
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/com0179/anaconda3/envs/MapTR/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 244, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

    ./tools/train.py FAILED

=======================================

Answer 1 · 2024-06-20T07:38:42.000Z

I ran into a similar problem. Make sure the MapTR versions match: v2 has to be run with the v2 scripts.
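For anyone trying to see why the check in the traceback fails: `'bev' in ret_dict` only works when `ret_dict` is a dict-like object. On a tensor, the `in` operator does not accept a string and raises the RuntimeError shown above. Below is a minimal sketch of that behavior and of a guard that tolerates both return types. It assumes the encoder returns either a bare tensor or a mapping with a `'bev'` key. `FakeTensor` and `extract_bev` are hypothetical names used only for illustration (not part of MapTR or PyTorch), and `FakeTensor` is a stand-in so the snippet runs without a GPU environment.

```python
from collections.abc import Mapping


class FakeTensor:
    """Hypothetical stand-in for a tensor; it only mimics the failing `in` check."""

    def __contains__(self, item):
        # Mirrors the behavior seen in the traceback: a str is rejected.
        if isinstance(item, str):
            raise RuntimeError(
                "Tensor.__contains__ only supports Tensor or scalar, "
                f"but you passed in a {type(item)}."
            )
        return False


def extract_bev(encoder_output):
    """Return the BEV feature for either a mapping or a bare tensor-like object."""
    if isinstance(encoder_output, Mapping):
        # Dict-style output: the BEV feature is assumed to live under 'bev'.
        return encoder_output["bev"]
    # Bare tensor-style output: the object itself is treated as the BEV feature.
    return encoder_output


bev = FakeTensor()
print(extract_bev(bev) is bev)                           # bare tensor path
print(extract_bev({"bev": bev, "depth": None}) is bev)   # dict path
```

A guard like this only hides the mismatch. If the tensor shows up because the code and the training script come from different MapTR versions, aligning them as described above is the actual fix.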
