Concat layer failed when running on GPU
guyzsarun opened this issue · 1 comment
guyzsarun commented
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04.4 LTS
- NDK version(e.g., 15c): 19c
- GCC version(if compiling for host, e.g., 5.4.0): 5.3.0
- MACE version (Use the command: git describe --long --tags): 1.0.0
- Bazel version (e.g., 0.13.0): 0.16.0
Model deploy file (*.yml)
library_name: repvgg
model_data_format: code
model_graph_format: code
target_abis: [arm64-v8a, armeabi-v7a]
models:
  repvgg:
    model_file_path: ../models/model/a0.onnx
    model_sha256_checksum: 4f760a4527e83eb4566ec41857e8d77b2fa4b4f09e71d76a23e473e4a534fcd1
    obfuscate: 0
    platform: onnx
    runtime: cpu+gpu
    subgraphs:
      - input_data_formats:
          - NCHW
        input_shapes:
          - 1,3,112,112
        input_tensors:
          - input.1
        output_data_formats:
          - NCHW
        output_shapes:
          - 1,1000
        output_tensors:
          - 100
    winograd: 2
Describe the problem
The model runs on CPU but fails on GPU with the error: Check failed: shape.size() == 4 GPU only support 2D/4D input
To Reproduce
Steps to reproduce the problem:
1. cd /path/to/mace
2. python3 tools/converter.py convert --config_file=/path/to/your/model_deployment_file
3. python3 tools/converter.py run --config_file=/path/to/your/model_deployment_file
Error information / logs
CMD> bazel version
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
Build label: 0.16.0
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Tue Jul 31 17:01:24 2018 (1533056484)
Build timestamp: 1533056484
Build timestamp as int: 1533056484
CMD> bazel build //mace/proto:mace_py
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
Loading:
Loading: 0 packages loaded
Analyzing: target //mace/proto:mace_py (5 packages loaded)
INFO: Analysed target //mace/proto:mace_py (17 packages loaded).
INFO: Found 1 target...
[0 / 7] [-----] BazelWorkspaceStatusAction stable-status.txt
Target //mace/proto:mace_py up-to-date:
bazel-genfiles/mace/proto/mace_pb2.py
INFO: Elapsed time: 2.227s, Critical Path: 0.03s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
CMD> cp -f bazel-genfiles/mace/proto/mace_pb2.py tools/python/py_proto
CMD> bazel build //mace/proto:micro_mem_py
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
Loading:
Loading: 0 packages loaded
Analyzing: target //mace/proto:micro_mem_py (6 packages loaded)
INFO: Analysed target //mace/proto:micro_mem_py (17 packages loaded).
INFO: Found 1 target...
[3 / 7] [-----] BazelWorkspaceStatusAction stable-status.txt
Target //mace/proto:micro_mem_py up-to-date:
bazel-genfiles/mace/proto/micro_mem_pb2.py
INFO: Elapsed time: 2.291s, Critical Path: 0.07s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
CMD> cp -f bazel-genfiles/mace/proto/micro_mem_pb2.py tools/python/py_proto
CMD> bazel build //third_party/caffe:caffe_py
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
Loading:
Loading: 0 packages loaded
Analyzing: target //third_party/caffe:caffe_py (5 packages loaded)
INFO: Analysed target //third_party/caffe:caffe_py (17 packages loaded).
INFO: Found 1 target...
[0 / 3] [-----] BazelWorkspaceStatusAction stable-status.txt
Target //third_party/caffe:caffe_py up-to-date:
bazel-genfiles/third_party/caffe/caffe_pb2.py
INFO: Elapsed time: 2.304s, Critical Path: 0.03s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
CMD> cp -f bazel-genfiles/third_party/caffe/caffe_pb2.py tools/python/py_proto
* Build //mace/tools:mace_run_static with ABI arm64-v8a
('build', '//mace/tools:mace_run_static', '--config', 'android', '--cpu=arm64-v8a', '--define', 'neon=true', '--define', 'opencl=true', '--define', 'quantize=false', '--define', 'bfloat16=false', '--define', 'fp16=false', '--define', 'rpcmem=false', '--define', 'hexagon=false', '--define', 'hta=false', '--define', 'apu=false', '--config', 'optimization', '--config', 'symbol_hidden', '--per_file_copt=mace/tools/mace_run.cc@-DMODEL_GRAPH_FORMAT_CODE')
Build done!
***********************************************
Run model repvgg on TPS980P
***********************************************
Generate input file: build/repvgg/_tmp/repvgg/53d33d936fda5df0ae8043eb18e41662/TPS980P_rk3399/arm64-v8a/model_input_input_1
Generate input file done.
* Run 'repvgg' with round=1, restart_round=1, tuning=False, out_of_range_check=False, num_threads=(-1,), cpu_affinity_policy=(1,), gpu_perf_hint=(3,), gpu_priority_hint=(3,)
Push build/repvgg/_tmp/repvgg/53d33d936fda5df0ae8043eb18e41662/TPS980P_rk3399/arm64-v8a/model_input_input_1 to /data/local/tmp/mace_run
Push third_party/nnlib/arm64-v8a/libhexagon_controller.so to /data/local/tmp/mace_run
Push build/repvgg/_tmp/arm64-v8a/mace_run_static to /data/local/tmp/mace_run
Push /tmp/cmd_file-repvgg-1615889357.2370994 to /data/local/tmp/mace_run/cmd_file-repvgg-1615889357.2370994
I mace/tools/mace_run.cc:544] model name: repvgg
I mace/tools/mace_run.cc:545] mace version: v1.0.0-0-g0945e49
I mace/tools/mace_run.cc:546] input node: input.1
I mace/tools/mace_run.cc:547] input shape: 1,3,112,112
I mace/tools/mace_run.cc:548] output node: 100
I mace/tools/mace_run.cc:549] output shape: 1,1000
I mace/tools/mace_run.cc:550] input_file: /data/local/tmp/mace_run/model_input
I mace/tools/mace_run.cc:551] output_file: /data/local/tmp/mace_run/model_out
I mace/tools/mace_run.cc:552] input dir:
I mace/tools/mace_run.cc:553] output dir:
I mace/tools/mace_run.cc:554] model_data_file:
I mace/tools/mace_run.cc:555] model_file:
I mace/tools/mace_run.cc:556] apu_cache_policy: 0
I mace/tools/mace_run.cc:557] apu_binary_file:
I mace/tools/mace_run.cc:558] apu_storage_file:
I mace/tools/mace_run.cc:559] device: CPU
I mace/tools/mace_run.cc:560] round: 1
I mace/tools/mace_run.cc:561] restart_round: 1
I mace/tools/mace_run.cc:562] gpu_perf_hint: 3
I mace/tools/mace_run.cc:563] gpu_priority_hint: 3
I mace/tools/mace_run.cc:564] num_threads: -1
I mace/tools/mace_run.cc:565] cpu_affinity_policy: 1
I mace/tools/mace_run.cc:568] limit_opencl_kernel_time: 0
I mace/tools/mace_run.cc:573] opencl_queue_window_size: 0
I mace/libmace/mace.cc:558] Creating MaceEngine, MACE version: v1.0.0-0-g0945e49
I mace/libmace/mace.cc:628] Initializing MaceEngine
I mace/libmace/mace.cc:790] Destroying MaceEngine
I mace/tools/mace_run.cc:616] restart round 0
I mace/libmace/mace.cc:558] Creating MaceEngine, MACE version: v1.0.0-0-g0945e49
I mace/libmace/mace.cc:628] Initializing MaceEngine
I mace/tools/mace_run.cc:286] Create Mace Engine latency: 90.133 ms
I mace/tools/mace_run.cc:293] Total init latency: 90.328 ms
I mace/tools/mace_run.cc:387] Warm up run
I mace/tools/mace_run.cc:423] 1st warm up run latency: 311.28 ms
I mace/tools/mace_run.cc:431] Run model
I mace/tools/mace_run.cc:493] Average latency: 122.53 ms
I mace/tools/mace_run.cc:508] Write output file /data/local/tmp/mace_run/model_out_100 with size 4000 done.
========================================================
capability(CPU) init warmup run_avg
========================================================
time 54.279 90.328 311.280 122.530
I mace/libmace/mace.cc:790] Destroying MaceEngine
Running finished!
Dana service is not available.
* Run 'repvgg' with round=1, restart_round=1, tuning=False, out_of_range_check=False, num_threads=(-1,), cpu_affinity_policy=(1,), gpu_perf_hint=(3,), gpu_priority_hint=(3,)
Push build/repvgg/_tmp/repvgg/53d33d936fda5df0ae8043eb18e41662/TPS980P_rk3399/arm64-v8a/model_input_input_1 to /data/local/tmp/mace_run
Push third_party/nnlib/arm64-v8a/libhexagon_controller.so to /data/local/tmp/mace_run
Push build/repvgg/_tmp/arm64-v8a/mace_run_static to /data/local/tmp/mace_run
Push /tmp/cmd_file-repvgg-1615889364.8023498 to /data/local/tmp/mace_run/cmd_file-repvgg-1615889364.8023498
I mace/tools/mace_run.cc:544] model name: repvgg
I mace/tools/mace_run.cc:545] mace version: v1.0.0-0-g0945e49
I mace/tools/mace_run.cc:546] input node: input.1
I mace/tools/mace_run.cc:547] input shape: 1,3,112,112
I mace/tools/mace_run.cc:548] output node: 100
I mace/tools/mace_run.cc:549] output shape: 1,1000
I mace/tools/mace_run.cc:550] input_file: /data/local/tmp/mace_run/model_input
I mace/tools/mace_run.cc:551] output_file: /data/local/tmp/mace_run/model_out
I mace/tools/mace_run.cc:552] input dir:
I mace/tools/mace_run.cc:553] output dir:
I mace/tools/mace_run.cc:554] model_data_file:
I mace/tools/mace_run.cc:555] model_file:
I mace/tools/mace_run.cc:556] apu_cache_policy: 0
I mace/tools/mace_run.cc:557] apu_binary_file:
I mace/tools/mace_run.cc:558] apu_storage_file:
I mace/tools/mace_run.cc:559] device: GPU
I mace/tools/mace_run.cc:560] round: 1
I mace/tools/mace_run.cc:561] restart_round: 1
I mace/tools/mace_run.cc:562] gpu_perf_hint: 3
I mace/tools/mace_run.cc:563] gpu_priority_hint: 3
I mace/tools/mace_run.cc:564] num_threads: -1
I mace/tools/mace_run.cc:565] cpu_affinity_policy: 1
I mace/tools/mace_run.cc:568] limit_opencl_kernel_time: 0
I mace/tools/mace_run.cc:573] opencl_queue_window_size: 0
I mace/libmace/mace.cc:558] Creating MaceEngine, MACE version: v1.0.0-0-g0945e49
I mace/libmace/mace.cc:628] Initializing MaceEngine
I mace/libmace/mace.cc:790] Destroying MaceEngine
I mace/tools/mace_run.cc:616] restart round 0
W ./mace/utils/tuner.h:201] Failed to read tuned param file: /data/local/tmp/mace_run/repvgg_tuned_opencl_parameter.TPS980P.rk3399.bin
I mace/libmace/mace.cc:558] Creating MaceEngine, MACE version: v1.0.0-0-g0945e49
W mace/core/kv_storage.cc:109] Failed to read kv store file: /data/local/tmp/mace_run/interior//mace_cl_compiled_program.bin
W mace/core/runtime/opencl/opencl_runtime.cc:442] Load OpenCL cached compiled kernel file failed. Please make sure the storage directory exist and you have Write&Read permission
I mace/libmace/mace.cc:628] Initializing MaceEngine
I mace/core/net_def_adapter.cc:385] Op ['92'](Shape) fall back to CPU
I mace/core/net_def_adapter.cc:385] Op ['94'](Gather) fall back to CPU
I mace/core/net_def_adapter.cc:385] Op ['96'](Unsqueeze) fall back to CPU
I mace/core/net_def_adapter.cc:385] Op ['98'](Concat) fall back to CPU
I mace/core/net_def_adapter.cc:385] Op ['100'](MatMul) fall back to CPU
F mace/core/memory_optimizer.cc:93] Check failed: shape.size() == 4 GPU only support 2D/4D input, op name: mace_node_98_mem_type_2, [2]
F mace/core/memory_optimizer.cc:93] backtrace:
F mace/core/memory_optimizer.cc:93] pc 0x63a5b66084
F mace/core/memory_optimizer.cc:93] pc 0x63a5b68160
F mace/core/memory_optimizer.cc:93] pc 0x63a5b68114
F mace/core/memory_optimizer.cc:93] pc 0x63a5b68310
F mace/core/memory_optimizer.cc:93] pc 0x63a5b6836c
F mace/core/memory_optimizer.cc:93] pc 0x63a5b3390c
F mace/core/memory_optimizer.cc:93] pc 0x63a5b33ba8
F mace/core/memory_optimizer.cc:93] pc 0x63a5b36d80
F mace/core/memory_optimizer.cc:93] pc 0x63a5a84088
F mace/core/memory_optimizer.cc:93] pc 0x63a5a64024
F mace/core/memory_optimizer.cc:93] pc 0x63a5a64910
F mace/core/memory_optimizer.cc:93] pc 0x63a5a67138
F mace/core/memory_optimizer.cc:93] pc 0x63a5a672d0
F mace/core/memory_optimizer.cc:93] pc 0x79cc7fc590 __libc_init
F mace/core/memory_optimizer.cc:93] pc 0x63a5a63e4c
Aborted
ERROR: [Mace Run] /mace-toolkit/mace/tools/device.py:391: Mace run failed.
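The log above is long, and the relevant lines are easy to miss. As a rough aid, the following Python sketch pulls out the ops that fell back to CPU and the first failed check. The regular expressions are assumptions based only on the line formats visible in this log and may not match other MACE versions; `summarize_mace_log` is a hypothetical helper, not part of the MACE tools.

```python
import re

# Patterns derived from the log lines in this issue (an assumption, not a MACE spec).
FALLBACK_RE = re.compile(r"Op \['(?P<name>[^']+)'\]\((?P<type>\w+)\) fall back to CPU")
CHECK_RE = re.compile(r"^F (?P<file>\S+):(?P<line>\d+)\] Check failed: (?P<msg>.*)$")


def summarize_mace_log(text):
    """Return (cpu_fallback_ops, first_failed_check) found in a mace_run log."""
    fallbacks = []
    fatal = None
    for raw in text.splitlines():
        line = raw.strip()
        m = FALLBACK_RE.search(line)
        if m:
            # Record the op name and type, e.g. ("98", "Concat").
            fallbacks.append((m.group("name"), m.group("type")))
            continue
        m = CHECK_RE.match(line)
        if m and fatal is None:
            # Keep only the first failed check; later F lines are the backtrace.
            fatal = (m.group("file"), int(m.group("line")), m.group("msg"))
    return fallbacks, fatal


# Sample lines copied from the log in this issue.
sample = """\
I mace/core/net_def_adapter.cc:385] Op ['98'](Concat) fall back to CPU
I mace/core/net_def_adapter.cc:385] Op ['100'](MatMul) fall back to CPU
F mace/core/memory_optimizer.cc:93] Check failed: shape.size() == 4 GPU only support 2D/4D input, op name: mace_node_98_mem_type_2, [2]
F mace/core/memory_optimizer.cc:93] backtrace:
"""

fallbacks, fatal = summarize_mace_log(sample)
print(fallbacks)
print(fatal)
```

On the sample lines, this reports Concat (op 98) and MatMul (op 100) as CPU fallbacks and points at the check in mace/core/memory_optimizer.cc line 93, whose message names mace_node_98_mem_type_2.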
lu229 commented
@guyzsarun The GPU runtime only supports 2D/4D inputs. You may need to modify the Concat
operator.
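One possible reading of the log, offered as an assumption rather than a confirmed diagnosis: the Shape, Gather, Unsqueeze and Concat ops that fall back to CPU resemble the subgraph that exporters often emit for a flatten written as x.view(x.size(0), -1). In that pattern, Concat produces a 1-D tensor that holds the target shape, which would match the [2] printed in the failed check. The sketch below emulates those ops with plain Python lists. The feature-map shape (1, 1280, 1, 1) is hypothetical, the helper names are illustrative (not MACE or ONNX APIs), and the rank rule is taken only from the wording of the error message.

```python
def shape_op(tensor_shape):
    # Emulates ONNX Shape: the input's dimensions as a 1-D tensor.
    return list(tensor_shape)


def gather_op(values, index):
    # Emulates ONNX Gather with a scalar index: picks a single element.
    return values[index]


def unsqueeze_op(scalar):
    # Emulates ONNX Unsqueeze on axis 0: scalar -> 1-D tensor of length 1.
    return [scalar]


def concat_op(*tensors):
    # Emulates ONNX Concat on axis 0 for 1-D tensors.
    out = []
    for t in tensors:
        out.extend(t)
    return out


def rank_ok_for_gpu(tensor_shape):
    # Assumption from the error text only: GPU accepts rank 2 or rank 4.
    return len(tensor_shape) in (2, 4)


feature_map_shape = (1, 1280, 1, 1)  # hypothetical shape, not taken from the model
batch = unsqueeze_op(gather_op(shape_op(feature_map_shape), 0))
target = concat_op(batch, [-1])      # the value handed to a reshape: [batch, -1]
target_shape = [len(target)]         # the Concat output is itself a 1-D tensor

print(target, target_shape, rank_ok_for_gpu(target_shape))
```

If this reading holds, the Concat output has rank 1, which the GPU check rejects. Exporting the flatten step with a static shape, or constant-folding the ONNX graph before conversion, might remove that tensor, though neither has been verified against this model.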