🔄 PaddleOCR Model Convert

Introduction

This repository converts PaddleOCR inference models to ONNX format.
Input: the URL or local tar path of an inference model
Output: the converted ONNX model
For a recognition model, you must also provide the raw txt path of the corresponding dictionary, which is written into the ONNX model. To get the raw URL, open the txt file on GitHub and click Raw in the upper right corner.
☆ The converted model is meant to be used with the inference code in RapidOCR.
If a model fails to convert, use the figure below to check each step in turn and find where it goes wrong.
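The raw dictionary URL mentioned above follows a predictable pattern on GitHub, so it can also be derived from the regular file page URL. The helper below is a minimal sketch, not part of paddleocr_convert: the function name `github_blob_to_raw` is hypothetical, and it assumes the usual `github.com/<owner>/<repo>/blob/<ref>/<path>` layout.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(url):
    """Turn a GitHub file page URL into its raw.githubusercontent.com URL.

    URLs that do not look like a GitHub "blob" page are returned unchanged,
    so already-raw URLs and local paths pass through untouched.
    """
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        return url

    # Expected path layout: <owner>/<repo>/blob/<ref...>/<file path...>
    segments = parts.path.strip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        return url

    owner, repo = segments[0], segments[1]
    rest = "/".join(segments[3:])  # branch or tag plus the file path
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{rest}"


page = "https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppocr/utils/ppocr_keys_v1.txt"
print(github_blob_to_raw(page))
```

The result can be passed as the dictionary path when converting a recognition model.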

Overall framework

flowchart TD

A([PaddleOCR inference model]) --paddle2onnx--> B([ONNX])
B --> C([Change Dynamic Input]) --> D([Rec: save the character dict to onnx])
D --> E([Save])
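When checking the steps one by one, it can help to reproduce the last two stages of the flowchart by hand with the `onnx` package. The following is only a sketch of the idea, not the code this package runs: the function names, the choice of dynamic axes (batch, height, width) and the metadata key `character` are all assumptions made for illustration.

```python
def patch_model(model, characters=None, dynamic_axes=(0, 2, 3)):
    """Make selected input axes dynamic and optionally embed a dictionary.

    `model` is expected to be an onnx.ModelProto. Which axes should be
    dynamic and which metadata key to use are assumptions of this sketch.
    """
    for graph_input in model.graph.input:
        dims = graph_input.type.tensor_type.shape.dim
        for axis in dynamic_axes:
            if axis < len(dims):
                # dim_param and dim_value share a protobuf oneof, so giving
                # the axis a symbolic name clears any fixed size it had.
                dims[axis].dim_param = f"dim_{axis}"

    if characters is not None:
        entry = model.metadata_props.add()
        entry.key = "character"  # assumed key name
        entry.value = "\n".join(characters)
    return model


def patch_file(src, dst, dict_path=None):
    """Load an ONNX file, patch it, and save it under a new path."""
    import onnx  # third-party; imported here so patch_model has no hard dependency

    characters = None
    if dict_path is not None:
        with open(dict_path, encoding="utf-8") as f:
            characters = [line.rstrip("\n") for line in f]
    onnx.save(patch_model(onnx.load(src), characters), dst)
```

If one of these steps fails on a model produced by paddle2onnx, that narrows down where the conversion goes wrong.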

Installation

pip install paddleocr_convert

Usage

Warning

Only the inference models listed at the download addresses in the link are supported. A training model must first be converted manually to inference format.

Slim quantized models from PaddleOCR cannot be converted.

Using the command line

Usage:

$ paddleocr_convert -h
usage: paddleocr_convert [-h] [-p MODEL_PATH] [-o SAVE_DIR]
                        [-txt_path TXT_PATH]

optional arguments:
-h, --help show this help message and exit
-p MODEL_PATH, --model_path MODEL_PATH
                        The inference model url or local path of paddleocr.
                        e.g. https://paddleocr.bj.bcebos.com/PP-
                        OCRv3/chinese/ch_PP-OCRv3_det_infer.tar or
                        models/ch_PP-OCRv3_det_infer.tar
-o SAVE_DIR, --save_dir SAVE_DIR
                        The directory of saving the model.
-txt_path TXT_PATH, --txt_path TXT_PATH
                        The raw txt url or local txt path, if the model is
                        recognition model.

Example:

# online
$ paddleocr_convert -p https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar \
                    -o models

$ paddleocr_convert -p https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar \
                    -o models \
                    -txt_path https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.6/ppocr/utils/ppocr_keys_v1.txt

# offline
$ paddleocr_convert -p models/ch_PP-OCRv3_det_infer.tar \
                    -o models

$ paddleocr_convert -p models/ch_PP-OCRv3_rec_infer.tar \
                    -o models \
                    -txt_path models/ppocr_keys_v1.txt
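To drive the command line from a Python script, for example inside a build step, the argument list can be assembled programmatically. This is a minimal sketch that uses only the `-p`, `-o` and `-txt_path` flags shown in the help output; the helper names `build_convert_command` and `run_convert` are hypothetical.

```python
import shlex
import subprocess


def build_convert_command(model_path, save_dir, txt_path=None):
    """Build the paddleocr_convert argument list for one model."""
    cmd = ["paddleocr_convert", "-p", model_path, "-o", save_dir]
    if txt_path is not None:
        # Only recognition models need the dictionary.
        cmd += ["-txt_path", txt_path]
    return cmd


def run_convert(model_path, save_dir, txt_path=None):
    """Run the CLI and raise CalledProcessError if it exits with an error."""
    cmd = build_convert_command(model_path, save_dir, txt_path)
    print("running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

Calling `run_convert("models/ch_PP-OCRv3_rec_infer.tar", "models", "models/ppocr_keys_v1.txt")` would be equivalent to the last offline example, provided `paddleocr_convert` is installed and on the PATH.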

Script use

online mode

from paddleocr_convert import PaddleOCRModelConvert

converter = PaddleOCRModelConvert()
save_dir = 'models'
url = 'https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar'
txt_url = 'https://raw.githubusercontent.com/PaddlePaddle/PaddleOCR/release/2.6/ppocr/utils/ppocr_keys_v1.txt'

converter(url, save_dir, txt_path=txt_url)

offline mode

from paddleocr_convert import PaddleOCRModelConvert

converter = PaddleOCRModelConvert()
save_dir = 'models'
model_path = 'models/ch_PP-OCRv3_rec_infer.tar'
txt_path = 'models/ppocr_keys_v1.txt'
converter(model_path, save_dir, txt_path=txt_path)
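When several models need converting, the same call can be wrapped in a loop. The sketch below rests on two assumptions: that recognition models can be spotted by `_rec_` in the file name, as in the examples here, and that the converter can be called without `txt_path` for other models, as the command line allows. The helper names are hypothetical, and the converter is passed in as a parameter so the loop can be tried without downloading anything.

```python
def is_recognition_model(model_path):
    """Heuristic: treat files named like '*_rec_*' as recognition models."""
    name = model_path.replace("\\", "/").rsplit("/", 1)[-1]
    return "_rec_" in name


def convert_many(converter, model_paths, save_dir, txt_path=None):
    """Convert each model, passing the dictionary only to recognition models.

    `converter` is any callable with the same call shape as a
    PaddleOCRModelConvert instance: converter(model_path, save_dir, txt_path=...).
    """
    converted = []
    for model_path in model_paths:
        if is_recognition_model(model_path):
            if txt_path is None:
                raise ValueError(f"{model_path} needs a dictionary txt path")
            converter(model_path, save_dir, txt_path=txt_path)
        else:
            converter(model_path, save_dir)
        converted.append(model_path)
    return converted
```

In real use, `converter` would be `PaddleOCRModelConvert()` and `model_paths` a list of URLs or local tar paths.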

Use the model

Suppose you need to recognize Japanese text and have already converted the model, saved at local/models/japan.onnx.

Install the rapidocr_onnxruntime library
```
pip install rapidocr_onnxruntime
```

Script use

from rapidocr_onnxruntime import RapidOCR

model_path = 'local/models/japan.onnx'
engine = RapidOCR(rec_model_path=model_path)

img = '1.jpg'
result, elapse = engine(img)
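To pull just the recognized strings out of `result`, a small helper is usually enough. This sketch assumes that `result` is either empty or `None` when nothing is found, or a list of `[box, text, score]` entries; that layout may differ between rapidocr_onnxruntime versions, so check it against the installed release. The name `extract_texts` and the 0.5 threshold are illustrative only.

```python
def extract_texts(result, min_score=0.5):
    """Return the recognized strings whose confidence is at least min_score.

    Assumes each entry looks like [box, text, score]; the score is passed
    through float() so both numeric and string scores are accepted.
    """
    if not result:
        return []
    return [text for _box, text, score in result if float(score) >= min_score]


# Hand-written sample in the assumed layout, not real OCR output.
sample = [
    [[[0, 0], [10, 0], [10, 5], [0, 5]], "こんにちは", 0.98],
    [[[0, 6], [10, 6], [10, 11], [0, 11]], "???", "0.21"],
]
print(extract_texts(sample))
```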

CLI use

rapidocr_onnxruntime -img 1.jpg --rec_model_path local/models/japan.onnx

Changelog


2023-09-22 v0.0.17 update:
- Improve logging when an error occurs.
2023-07-27 v0.0.16 update:
- Add an online conversion version on ModelScope.
- Change the supported Python versions to 3.6 ~ 3.11.
2023-04-13 update:
- Add a link to the online conversion program.
2023-03-05 v0.0.4~7 update:
- Support conversion of local models and dictionaries.
- Optimize internal logic and error feedback.
2023-02-28 v0.0.3 update:
- Automatically switch models without dynamic input to dynamic input.
2023-02-27 v0.0.2 update:
- Package the model conversion code so that users can convert models themselves.
2022-08-15 v0.0.1 update:
- Write the recognition model's dictionary into the metadata of the ONNX model for easier distribution.
