ViTAE-Transformer/ViTPose

Assertion Error During Testing

dianchia opened this issue · 1 comment

Did you check docs and existing issues?

  • I have read all the docs
  • I have searched the existing issues

Version Information

>>> python -V
Python 3.7.12
mmpose 0.24.0
mmcv-full 1.3.9
Full version info
addict                   2.4.0
certifi                  2023.11.17
charset-normalizer       3.3.2
chumpy                   0.70
cycler                   0.11.0
Cython                   3.0.8
einops                   0.6.1
fonttools                4.38.0
idna                     3.6
importlib-metadata       6.7.0
json-tricks              3.17.3
kiwisolver               1.4.5
matplotlib               3.5.3
mmcv-full                1.3.9      $HOME/projects/pose_estimation/ViTPose/mmcv
mmpose                   0.24.0     $HOME/projects/pose_estimation/ViTPose/ViTPose
munkres                  1.1.4
numpy                    1.21.6
nvidia-cublas-cu11       11.10.3.66
nvidia-cuda-nvrtc-cu11   11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11        8.5.0.96
opencv-python            4.9.0.80
packaging                23.2
Pillow                   9.5.0
pip                      23.3.2
platformdirs             4.0.0
pyparsing                3.1.1
python-dateutil          2.8.2
PyYAML                   6.0.1
requests                 2.31.0
scipy                    1.7.3
setuptools               69.0.3
six                      1.16.0
timm                     0.4.9
tomli                    2.0.1
torch                    1.13.1
torchvision              0.14.1
typing_extensions        4.7.1
urllib3                  2.0.7
wheel                    0.42.0
xtcocotools              1.14.3
yapf                     0.40.2
zipp                     3.15.0

Operating System

Ubuntu

Describe the bug

An AssertionError was raised when testing with the script tools/dist_test.sh. A shortened version of the error is included below.

File "tools/test.py", line 184, in <module>
    main()
  File "tools/test.py", line 167, in main
    args.gpu_collect)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/apis/test.py", line 70, in multi_gpu_test
    result = model(return_loss=False, **data)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/mmcv/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 141, in forward
    img, img_metas, return_heatmap=return_heatmap, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 165, in forward_test
    assert img.size(0) == len(img_metas)
AssertionError
Full error message
$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py:188: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

  FutureWarning,
apex is not installed
apex is not installed
apex is not installed
$HOME/projects/pose_estimation/ViTPose/mmcv/mmcv/cnn/bricks/transformer.py:27: UserWarning: Fail to import ``MultiScaleDeformableAttention`` from ``mmcv.ops.multi_scale_deform_attn``, You should install ``mmcv-full`` if you need this module.
  warnings.warn('Fail to import ``MultiScaleDeformableAttention`` from '
$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/utils/setup_env.py:33: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  f'Setting OMP_NUM_THREADS environment variable for each process '
$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/utils/setup_env.py:43: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  f'Setting MKL_NUM_THREADS environment variable for each process '
loading annotations into memory...
Done (t=1.00s)
creating index...
index created!
=> Total boxes: 104125
=> Total boxes after filter low score@0.0: 104125
=> num_images: 5000
=> load 104125 samples
Use load_from_local loader
The model and loaded state dict do not match exactly

unexpected key in source state_dict: backbone.blocks.0.mlp.experts.0.weight, backbone.blocks.0.mlp.experts.0.bias, backbone.blocks.0.mlp.experts.1.weight, backbone.blocks.0.mlp.experts.1.bias, backbone.blocks.0.mlp.experts.2.weight, backbone.blocks.0.mlp.experts.2.bias, backbone.blocks.0.mlp.experts.3.weight, backbone.blocks.0.mlp.experts.3.bias, backbone.blocks.0.mlp.experts.4.weight, backbone.blocks.0.mlp.experts.4.bias, backbone.blocks.0.mlp.experts.5.weight, backbone.blocks.0.mlp.experts.5.bias, backbone.blocks.1.mlp.experts.0.weight, backbone.blocks.1.mlp.experts.0.bias, backbone.blocks.1.mlp.experts.1.weight, backbone.blocks.1.mlp.experts.1.bias, backbone.blocks.1.mlp.experts.2.weight, backbone.blocks.1.mlp.experts.2.bias, backbone.blocks.1.mlp.experts.3.weight, backbone.blocks.1.mlp.experts.3.bias, backbone.blocks.1.mlp.experts.4.weight, backbone.blocks.1.mlp.experts.4.bias, backbone.blocks.1.mlp.experts.5.weight, backbone.blocks.1.mlp.experts.5.bias, backbone.blocks.2.mlp.experts.0.weight, backbone.blocks.2.mlp.experts.0.bias, backbone.blocks.2.mlp.experts.1.weight, backbone.blocks.2.mlp.experts.1.bias, backbone.blocks.2.mlp.experts.2.weight, backbone.blocks.2.mlp.experts.2.bias, backbone.blocks.2.mlp.experts.3.weight, backbone.blocks.2.mlp.experts.3.bias, backbone.blocks.2.mlp.experts.4.weight, backbone.blocks.2.mlp.experts.4.bias, backbone.blocks.2.mlp.experts.5.weight, backbone.blocks.2.mlp.experts.5.bias, backbone.blocks.3.mlp.experts.0.weight, backbone.blocks.3.mlp.experts.0.bias, backbone.blocks.3.mlp.experts.1.weight, backbone.blocks.3.mlp.experts.1.bias, backbone.blocks.3.mlp.experts.2.weight, backbone.blocks.3.mlp.experts.2.bias, backbone.blocks.3.mlp.experts.3.weight, backbone.blocks.3.mlp.experts.3.bias, backbone.blocks.3.mlp.experts.4.weight, backbone.blocks.3.mlp.experts.4.bias, backbone.blocks.3.mlp.experts.5.weight, backbone.blocks.3.mlp.experts.5.bias, backbone.blocks.4.mlp.experts.0.weight, backbone.blocks.4.mlp.experts.0.bias, backbone.blocks.4.mlp.experts.1.weight, backbone.blocks.4.mlp.experts.1.bias, backbone.blocks.4.mlp.experts.2.weight, backbone.blocks.4.mlp.experts.2.bias, backbone.blocks.4.mlp.experts.3.weight, backbone.blocks.4.mlp.experts.3.bias, backbone.blocks.4.mlp.experts.4.weight, backbone.blocks.4.mlp.experts.4.bias, backbone.blocks.4.mlp.experts.5.weight, backbone.blocks.4.mlp.experts.5.bias, backbone.blocks.5.mlp.experts.0.weight, backbone.blocks.5.mlp.experts.0.bias, backbone.blocks.5.mlp.experts.1.weight, backbone.blocks.5.mlp.experts.1.bias, backbone.blocks.5.mlp.experts.2.weight, backbone.blocks.5.mlp.experts.2.bias, backbone.blocks.5.mlp.experts.3.weight, backbone.blocks.5.mlp.experts.3.bias, backbone.blocks.5.mlp.experts.4.weight, backbone.blocks.5.mlp.experts.4.bias, backbone.blocks.5.mlp.experts.5.weight, backbone.blocks.5.mlp.experts.5.bias, backbone.blocks.6.mlp.experts.0.weight, backbone.blocks.6.mlp.experts.0.bias, backbone.blocks.6.mlp.experts.1.weight, backbone.blocks.6.mlp.experts.1.bias, backbone.blocks.6.mlp.experts.2.weight, backbone.blocks.6.mlp.experts.2.bias, backbone.blocks.6.mlp.experts.3.weight, backbone.blocks.6.mlp.experts.3.bias, backbone.blocks.6.mlp.experts.4.weight, backbone.blocks.6.mlp.experts.4.bias, backbone.blocks.6.mlp.experts.5.weight, backbone.blocks.6.mlp.experts.5.bias, backbone.blocks.7.mlp.experts.0.weight, backbone.blocks.7.mlp.experts.0.bias, backbone.blocks.7.mlp.experts.1.weight, backbone.blocks.7.mlp.experts.1.bias, backbone.blocks.7.mlp.experts.2.weight, backbone.blocks.7.mlp.experts.2.bias, 
backbone.blocks.7.mlp.experts.3.weight, backbone.blocks.7.mlp.experts.3.bias, backbone.blocks.7.mlp.experts.4.weight, backbone.blocks.7.mlp.experts.4.bias, backbone.blocks.7.mlp.experts.5.weight, backbone.blocks.7.mlp.experts.5.bias, backbone.blocks.8.mlp.experts.0.weight, backbone.blocks.8.mlp.experts.0.bias, backbone.blocks.8.mlp.experts.1.weight, backbone.blocks.8.mlp.experts.1.bias, backbone.blocks.8.mlp.experts.2.weight, backbone.blocks.8.mlp.experts.2.bias, backbone.blocks.8.mlp.experts.3.weight, backbone.blocks.8.mlp.experts.3.bias, backbone.blocks.8.mlp.experts.4.weight, backbone.blocks.8.mlp.experts.4.bias, backbone.blocks.8.mlp.experts.5.weight, backbone.blocks.8.mlp.experts.5.bias, backbone.blocks.9.mlp.experts.0.weight, backbone.blocks.9.mlp.experts.0.bias, backbone.blocks.9.mlp.experts.1.weight, backbone.blocks.9.mlp.experts.1.bias, backbone.blocks.9.mlp.experts.2.weight, backbone.blocks.9.mlp.experts.2.bias, backbone.blocks.9.mlp.experts.3.weight, backbone.blocks.9.mlp.experts.3.bias, backbone.blocks.9.mlp.experts.4.weight, backbone.blocks.9.mlp.experts.4.bias, backbone.blocks.9.mlp.experts.5.weight, backbone.blocks.9.mlp.experts.5.bias, backbone.blocks.10.mlp.experts.0.weight, backbone.blocks.10.mlp.experts.0.bias, backbone.blocks.10.mlp.experts.1.weight, backbone.blocks.10.mlp.experts.1.bias, backbone.blocks.10.mlp.experts.2.weight, backbone.blocks.10.mlp.experts.2.bias, backbone.blocks.10.mlp.experts.3.weight, backbone.blocks.10.mlp.experts.3.bias, backbone.blocks.10.mlp.experts.4.weight, backbone.blocks.10.mlp.experts.4.bias, backbone.blocks.10.mlp.experts.5.weight, backbone.blocks.10.mlp.experts.5.bias, backbone.blocks.11.mlp.experts.0.weight, backbone.blocks.11.mlp.experts.0.bias, backbone.blocks.11.mlp.experts.1.weight, backbone.blocks.11.mlp.experts.1.bias, backbone.blocks.11.mlp.experts.2.weight, backbone.blocks.11.mlp.experts.2.bias, backbone.blocks.11.mlp.experts.3.weight, backbone.blocks.11.mlp.experts.3.bias, backbone.blocks.11.mlp.experts.4.weight, backbone.blocks.11.mlp.experts.4.bias, backbone.blocks.11.mlp.experts.5.weight, backbone.blocks.11.mlp.experts.5.bias

[                                                  ] 0/104125, elapsed: 0s, ETA:Traceback (most recent call last):
  File "tools/test.py", line 184, in <module>
    main()
  File "tools/test.py", line 167, in main
    args.gpu_collect)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/apis/test.py", line 70, in multi_gpu_test
    result = model(return_loss=False, **data)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/mmcv/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 141, in forward
    img, img_metas, return_heatmap=return_heatmap, **kwargs)
  File "$HOME/projects/pose_estimation/ViTPose/ViTPose/mmpose/models/detectors/top_down.py", line 165, in forward_test
    assert img.size(0) == len(img_metas)
AssertionError
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 3740731) of binary: $HOME/miniforge3/envs/vitpose/bin/python
Traceback (most recent call last):
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py", line 195, in <module>
    main()
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py", line 191, in main
    launch(args)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launch.py", line 176, in launch
    run(args)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/run.py", line 756, in run
    )(*cmd_args)
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "$HOME/miniforge3/envs/vitpose/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 248, in launch_agent
    failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
tools/test.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-02-12_17:58:43
  host      : host
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 3740731)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Steps to reproduce

  1. Clone the repository with git clone https://github.com/ViTAE-Transformer/ViTPose.git --depth 1
  2. Follow the installation instructions in README.md
  3. Download the dataset from the official COCO dataset website; specifically, the 2017 Train/Val/Test images.
  4. Put the downloaded images into ./data/coco/ and unzip all of them.
  5. Download the annotation files from here and put them into ./data/coco/annotations/
  6. Download any of the wholebody pretrained models
  7. Start testing with this command: bash tools/dist_test.sh configs/wholebody/2d_kpt_sview_rgb_img/topdown_heatmap/coco-wholebody/ViTPose_base_wholebody_256x192.py pretrained/wholebody.pth 1

Expected behaviour

Expected the testing to run smoothly without errors.

I corrected this error by replacing every use of "img_metas" with "img_metas.data[0]" in mmpose/models/detectors/top_down.py, as sketched below.
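
For reference, here is a minimal sketch of that workaround, assuming the forward_test signature of the mmpose 0.24 TopDown detector shown in the traceback; it is a local patch by the reporter, not an upstream fix. The failing assertion suggests img_metas reaches the detector without the usual mmcv DataContainer unwrapping (possibly an incompatibility between the bundled mmcv 1.3.9 and the PyTorch 1.13 DDP forward path visible in the traceback), so its length no longer matches the image batch size; taking .data[0] recovers the per-sample list of meta dicts for the current GPU.

# mmpose/models/detectors/top_down.py (sketch of the local workaround;
# surrounding code is assumed to match mmpose 0.24 and is not shown)
from mmcv.parallel import DataContainer

def forward_test(self, img, img_metas, return_heatmap=False, **kwargs):
    """Defines the computation performed at every call when testing."""
    # If img_metas is still wrapped in a DataContainer, unwrap it:
    # .data is a list with one entry per GPU, and .data[0] is the list of
    # per-sample meta dicts that the assertion below expects.
    if isinstance(img_metas, DataContainer):
        img_metas = img_metas.data[0]
    assert img.size(0) == len(img_metas)
    # ... rest of the original forward_test body unchanged ...

The assertion itself is only a sanity check that the batch size matches the number of meta dicts, so the patch changes nothing about the inference results; it just restores the expected input shape at that point.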