airockchip/rknn-toolkit2

On-board accuracy analysis fails with "exportDataSize large then model size"

Opened this issue · 2 comments

Conversion script:

#!/usr/bin/env python
# coding: utf-8

from rknn.api import RKNN
from sys import exit
rknn = RKNN(verbose=True)

ONNX_MODEL="RWKV-x060-World-1B6-v2.1-20240328-ctx4096.onnx"
RKNN_MODEL=ONNX_MODEL.replace(".onnx",".rknn")
DATASET="dataset.txt"
QUANTIZE=False

batch_size = 1

# pre-process config
print('--> Config model')
rknn.config(quantized_algorithm='normal', quantized_method='channel', target_platform='rk3588', optimization_level=3)
print('done')

# Load ONNX model
print('--> Loading model')
ret = rknn.load_onnx(model=ONNX_MODEL, inputs=
                     ['/emb/Gather_output_0',
                      'input_state',
                      'scale_ratio'],
                      input_size_list=[
                          [batch_size, 2048],
                          [batch_size, 1584, 2048],
                          [1]
                        ])
if ret != 0:
    print('Load model failed!')
    exit(ret)
print('done')

# Build model
print('--> Building model')
ret = rknn.build(do_quantization=QUANTIZE, dataset=DATASET, rknn_batch_size=None)
if ret != 0:
    print('Build model failed!')
    exit(ret)
print('done')

# Export RKNN model
print('--> Export RKNN model')
ret = rknn.export_rknn(RKNN_MODEL)
if ret != 0:
    print('Export RKNN model failed!')
    exit(ret)
print('done')


# Evaluate model
rknn.init_runtime(target='rk3588')
rknn.accuracy_analysis(inputs=['../embeddings.npy','../state.npy','../scale_ratio.npy'], target='rk3588')

PC-side log output:

D RKNN: [05:03:26.386] ----------------------------------------
D RKNN: [05:03:26.386] Total Internal Memory Size: 19090.1KB
D RKNN: [05:03:26.386] Total Weight Memory Size: 2.96318e+06KB
D RKNN: [05:03:26.386] ----------------------------------------
D RKNN: [05:03:26.410] <<<<<<<< end: rknn::RKNNMemStatisticsPass
I rknn buiding done.
done
--> Export RKNN model
D adb path: /usr/bin/adb
adb: unable to connect for root: closed
I target set by user is: rk3588
D adb path: /usr/bin/adb
I Get hardware info: target_platform = rk3588, os = Linux, aarch = aarch64
I Check RK3588 board npu runtime version
I Starting ntp or adb, target is RK3588
I Start adb...
I Connect to Device success!
I NPUTransfer(1144154): Starting NPU Transfer Client, Transfer version 2.2.2 (@2024-06-18T03:50:20)
D NPUTransfer(1144154): Transfer spec = local:transfer_proxy
D NPUTransfer(1144154): Transfer interface successfully opened, fd = 15
I NPUTransfer(1144154): TransferBuffer: min aligned size: 1024
D RKNNAPI: ==============================================
D RKNNAPI: RKNN VERSION:
D RKNNAPI:   API: 2.1.0 (6405676 build@2024-08-03T06:59:33)
D RKNNAPI:   DRV: rknn_server: 2.1.0 (6405676 build@2024-08-03T14:59:00)
D RKNNAPI:   DRV: rknnrt: 2.1.0 (967d001cc8@2024-08-07T19:28:19)
D RKNNAPI: ==============================================
E RKNNAPI: rknn_init,  msg_load_ack fail, ack = 1(ACK_FAIL), expect 0(ACK_SUCC)!
D NPUTransfer(1144154): Transfer client closed, fd = 15
E init_runtime: Traceback (most recent call last):
  File "rknn/api/rknn_log.py", line 309, in rknn.api.rknn_log.error_catch_decorator.error_catch_wrapper
  File "rknn/api/rknn_base.py", line 2422, in rknn.api.rknn_base.RKNNBase.init_runtime
  File "rknn/api/rknn_runtime.py", line 396, in rknn.api.rknn_runtime.RKNNRuntime.build_graph
Exception: RKNN init failed. error code: RKNN_ERR_MODEL_INVALID

W init_runtime: ===================== WARN(12) =====================
E rknn-toolkit2 version: 2.1.0+708089d1
Traceback (most recent call last):
  File "rknn/api/rknn_log.py", line 309, in rknn.api.rknn_log.error_catch_decorator.error_catch_wrapper
  File "rknn/api/rknn_base.py", line 2422, in rknn.api.rknn_base.RKNNBase.init_runtime
  File "rknn/api/rknn_runtime.py", line 396, in rknn.api.rknn_runtime.RKNNRuntime.build_graph
Exception: RKNN init failed. error code: RKNN_ERR_MODEL_INVALID

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./convert_rknn.py", line 56, in <module>
    rknn.init_runtime(target='rk3588')
  File "/home/zt/.conda/envs/rknn/lib/python3.8/site-packages/rknn/api/rknn.py", line 295, in init_runtime
    return self.rknn_base.init_runtime(target=target, device_id=device_id,
  File "rknn/api/rknn_log.py", line 314, in rknn.api.rknn_log.error_catch_decorator.error_catch_wrapper
  File "rknn/api/rknn_log.py", line 95, in rknn.api.rknn_log.RKNNLog.e
ValueError: Traceback (most recent call last):
  File "rknn/api/rknn_log.py", line 309, in rknn.api.rknn_log.error_catch_decorator.error_catch_wrapper
  File "rknn/api/rknn_base.py", line 2422, in rknn.api.rknn_base.RKNNBase.init_runtime
  File "rknn/api/rknn_runtime.py", line 396, in rknn.api.rknn_runtime.RKNNRuntime.build_graph
Exception: RKNN init failed. error code: RKNN_ERR_MODEL_INVALID

Board-side log output:

start rknn server, version:2.1.0 (6405676 build@2024-08-03T14:59:00)
I NPUTransfer(1746279): Starting NPU Transfer Server, Transfer version 2.2.2 (@2024-06-18T03:50:51)
E RKNN: [05:05:19.157] parseRKNN: exportDataSize large then model size: 3047582144 vs 1189679104!
E RKNN: [05:05:19.157] parseRKNN from buffer: Invalid RKNN format!
E RKNN: [05:05:19.157] rknn_init, load model failed!
1161502 SERVER init(190): rknn_init fail! ret=-6
1161502 SERVER process_msg_init(384): Client 0 init model fail!
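  • Tested with version 2.1.0. Version 2.2.0 of toolkit2 already fails during model conversion, so it could not be tested.
  • The exported RKNN model itself runs correctly on the board.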
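The two numbers in the parseRKNN error can be compared directly with the weight size reported by the PC-side build log. The sketch below only redoes that arithmetic. It assumes the first number is the data size declared inside the model, the second is the size of the buffer the runtime actually received, and that "KB" in the build log means 1024 bytes; none of this is confirmed by the logs themselves.

```python
# Numbers copied from the logs above.
declared_bytes = 3047582144  # first value in the parseRKNN error (exportDataSize)
received_bytes = 1189679104  # second value in the parseRKNN error (model size)
weight_kb = 2.96318e6        # "Total Weight Memory Size" from the PC-side build log

GIB = 1024 ** 3


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


# Assumption: 1 KB in the build log is 1024 bytes.
weight_bytes = weight_kb * 1024
received_fraction = received_bytes / declared_bytes

print(f"declared: {to_gib(declared_bytes):.2f} GiB")
print(f"received: {to_gib(received_bytes):.2f} GiB")
print(f"weights:  {to_gib(weight_bytes):.2f} GiB")
print(f"received / declared = {received_fraction:.1%}")
```

Under these assumptions the declared size (about 2.84 GiB) is close to the weight size from the build log (about 2.83 GiB), while the board received only about 1.11 GiB, roughly 39% of it. That pattern would be consistent with a model that was cut short somewhere between export and loading.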

Resolved. The cause was that /tmp was full, which left the model incomplete.

So how did you fix it? Did you increase the size of /tmp?
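The thread does not say how the space was freed, nor whether the full /tmp was on the PC or on the board. As a minimal sketch for the PC side only, a free-space check could run before `rknn.init_runtime(...)`. The helper name `has_room_for_model`, the `margin` parameter, and the idea that the model is staged in the directory returned by `tempfile.gettempdir()` are all assumptions made for illustration, not documented toolkit behavior.

```python
import os
import shutil
import tempfile


def has_room_for_model(model_path, staging_dir=None, margin=1.1):
    """Return True if staging_dir can hold one more copy of model_path.

    staging_dir defaults to the system temp directory (usually /tmp on Linux).
    That the model gets staged there is an assumption, not something the
    thread or the toolkit documentation confirms.
    """
    staging_dir = staging_dir or tempfile.gettempdir()
    # Require the model size plus a safety margin (10% by default).
    needed = int(os.path.getsize(model_path) * margin)
    free = shutil.disk_usage(staging_dir).free
    return free >= needed


# Hypothetical use in the conversion script, right before init_runtime:
#
#     if not has_room_for_model(RKNN_MODEL):
#         print('Not enough free space in the temp directory!')
#         exit(1)
#     rknn.init_runtime(target='rk3588')
```

A check like this only covers the host. If the directory that filled up was /tmp on the board, the equivalent check would have to run there instead.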