A high-performance OCR library built on PaddleOCR v4/v5 models. It supports text detection and recognition in multiple languages (Chinese, English, Japanese) and ships as a Rust crate, a C API dynamic library, and a CLI tool, so it is easy to integrate and works out of the box.

Rust PaddleOCR

A lightweight and efficient OCR (Optical Character Recognition) library implemented in Rust, based on the PaddleOCR models. This library leverages the MNN inference framework to provide high-performance text detection and recognition capabilities.

Features

  • Text Detection: Accurately locate text regions in images
  • Text Recognition: Recognize text content from the detected regions
  • Multi-Version Model Support: Support for both PP-OCRv4 and PP-OCRv5 models with flexible selection
  • Multi-Language Support: PP-OCRv5 supports Simplified Chinese, Traditional Chinese, English, Japanese, Chinese Pinyin and more
  • Complex Scene Recognition: Enhanced handwriting, vertical text, and rare character recognition capabilities
  • High Performance: Optimized with the MNN inference framework
  • Minimal Dependencies: Lightweight and easy to integrate
  • Customizable: Adjustable parameters for different use cases
  • Command-line Tool: Simple command-line interface for OCR recognition

Model Versions

This library supports three PaddleOCR model variants:

PP-OCRv4

  • Stable Version: Well-tested and highly compatible
  • Use Cases: Regular document recognition, scenarios requiring high accuracy
  • Model Files:
    • Detection model: ch_PP-OCRv4_det_infer.mnn
    • Recognition model: ch_PP-OCRv4_rec_infer.mnn
    • Character set: ppocr_keys_v4.txt

PP-OCRv5 ⭐️ Recommended

  • Latest Version: Next-generation text recognition solution
  • Multi-Script Support: Simplified Chinese, Chinese Pinyin, Traditional Chinese, English, Japanese
  • Enhanced Scene Recognition:
    • Significantly improved Chinese-English complex handwriting recognition
    • Optimized vertical text recognition
    • Enhanced rare character recognition capabilities
  • Accuracy Improvement: 13% end-to-end accuracy improvement over PP-OCRv4
  • Model Files:
    • Detection model: PP-OCRv5_mobile_det.mnn
    • Recognition model: PP-OCRv5_mobile_rec.mnn
    • Character set: ppocr_keys_v5.txt

PP-OCRv5 FP16 ⭐️ New

  • Efficient Version: Provides faster inference speed and lower memory usage without sacrificing accuracy
  • Use Cases: Scenarios requiring high performance and low memory usage
  • Performance Improvements:
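    • Inference speed increased by ~2% measured by average inference time (884.64 ms vs. 866.66 ms in the test data below); the rounded FPS figures (1.1 vs. 1.2) suggest ~9%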
    • Memory usage reduced by ~8%
    • Model size halved
  • Model Files:
    • Detection model: PP-OCRv5_mobile_det_fp16.mnn
    • Recognition model: PP-OCRv5_mobile_rec_fp16.mnn
    • Character set: ppocr_keys_v5.txt
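
The file names listed for the three variants can be kept in one place with a small lookup helper. The following is a minimal sketch: the `ModelVariant` enum and `model_files` function are hypothetical names introduced here for illustration and are not part of the `rust_paddle_ocr` API; only the file names come from the lists above.

```rust
/// Model variants described above. Illustrative only; this enum is an
/// assumption and not a type exported by rust_paddle_ocr.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ModelVariant {
    V4,
    V5,
    V5Fp16,
}

/// Returns (detection model, recognition model, character set) file names
/// for the given variant, as listed in the sections above.
fn model_files(variant: ModelVariant) -> (&'static str, &'static str, &'static str) {
    match variant {
        ModelVariant::V4 => (
            "ch_PP-OCRv4_det_infer.mnn",
            "ch_PP-OCRv4_rec_infer.mnn",
            "ppocr_keys_v4.txt",
        ),
        ModelVariant::V5 => (
            "PP-OCRv5_mobile_det.mnn",
            "PP-OCRv5_mobile_rec.mnn",
            "ppocr_keys_v5.txt",
        ),
        // The FP16 variant reuses the v5 character set.
        ModelVariant::V5Fp16 => (
            "PP-OCRv5_mobile_det_fp16.mnn",
            "PP-OCRv5_mobile_rec_fp16.mnn",
            "ppocr_keys_v5.txt",
        ),
    }
}

fn main() {
    for variant in [ModelVariant::V4, ModelVariant::V5, ModelVariant::V5Fp16] {
        let (det, rec, keys) = model_files(variant);
        println!("{:?}: det={}, rec={}, keys={}", variant, det, rec, keys);
    }
}
```

The returned names can be joined onto whatever directory holds the models (for example `./models/`) before being passed to `Det::from_file` and `Rec::from_file`.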

Model Performance Comparison

| Feature | PP-OCRv4 | PP-OCRv5 | PP-OCRv5 FP16 |
|---|---|---|---|
| Script Support | Chinese, English | Simplified Chinese, Traditional Chinese, English, Japanese, Chinese Pinyin | Simplified Chinese, Traditional Chinese, English, Japanese, Chinese Pinyin |
| Handwriting Support | Basic | Enhanced | Enhanced |
| Vertical Text | Basic | Optimized | Optimized |
| Rare Character | Limited | Enhanced | Enhanced |
| Inference Speed (FPS) | 1.1 | 1.2 | 1.2 |
| Memory Usage (Peak) | 422.22MB | 388.41MB | 388.41MB |
| Model Size | Standard | Standard | Halved |
| Recommended Use Case | Regular Documents | Complex Scenarios | High-Performance Scenarios |

PP-OCRv5 FP16 Test Data

Standard Model

============================================================
Test Report: Inference Speed Test
============================================================
Total Time: 44.23s
Average Inference Time: 884.64ms
Fastest Inference Time: 744.99ms
Slowest Inference Time: 954.03ms
Success Count: 50
Failure Count: 0
Inference Speed: 1.1 FPS
Memory Usage - Start: 14.94MB
Memory Usage - End: 422.22MB
Memory Usage - Peak: 422.22MB
Memory Change: +407.28MB

FP16 Model

============================================================
Test Report: Inference Speed Test
============================================================
Total Time: 43.33s
Average Inference Time: 866.66ms
Fastest Inference Time: 719.41ms
Slowest Inference Time: 974.93ms
Success Count: 50
Failure Count: 0
Inference Speed: 1.2 FPS
Memory Usage - Start: 15.70MB
Memory Usage - End: 388.41MB
Memory Usage - Peak: 388.41MB
Memory Change: +372.70MB
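
The relative differences between the two reports can be recomputed directly from the raw figures. The sketch below uses only numbers copied from the reports above; the `percent_change` helper is a hypothetical name introduced for illustration.

```rust
/// Relative change from `before` to `after`, in percent.
/// Negative values mean the quantity decreased.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Figures copied from the two reports above: standard model first, FP16 second.
    let latency = percent_change(884.64, 866.66);
    let peak_memory = percent_change(422.22, 388.41);

    // Unrounded throughput: 50 successful runs divided by total time in seconds.
    let fps_standard = 50.0 / 44.23;
    let fps_fp16 = 50.0 / 43.33;
    let throughput = percent_change(fps_standard, fps_fp16);

    println!("average inference time: {:+.1}%", latency);
    println!("peak memory:            {:+.1}%", peak_memory);
    println!(
        "throughput: {:.2} -> {:.2} FPS ({:+.1}%)",
        fps_standard, fps_fp16, throughput
    );
}
```

Computed this way, the FP16 run shows roughly 2% lower average inference time and roughly 8% lower peak memory. The 1.1 and 1.2 FPS values printed in the reports are rounded to one decimal place, so a ratio taken from them overstates the speed difference.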

Testing Method

The figures above were measured on a Mac Mini M4. Run the following command to execute the test and reproduce them on your own hardware:

python test_ffi.py test

Installation

Add this to your Cargo.toml:

[dependencies.rust-paddle-ocr]
git = "https://github.com/zibo-chen/rust-paddle-ocr.git"

You can also specify a particular branch or tag:

[dependencies.rust-paddle-ocr]
git = "https://github.com/zibo-chen/rust-paddle-ocr.git"
branch = "main"

Prerequisites

This library requires:

  • Pre-trained PaddleOCR models converted to MNN format
  • Character set file for text recognition

Usage

As a Rust Library

Choose between the PP-OCRv4 and PP-OCRv5 models by loading the corresponding model files:

use rust_paddle_ocr::{Det, Rec};
use image::open;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // === Using PP-OCRv5 models (Recommended) ===
    let mut det = Det::from_file("./models/PP-OCRv5_mobile_det.mnn")?;
    let mut rec = Rec::from_file(
        "./models/PP-OCRv5_mobile_rec.mnn", 
        "./models/ppocr_keys_v5.txt"
    )?;
    
    // === Or using PP-OCRv4 models ===
    // let mut det = Det::from_file("./models/ch_PP-OCRv4_det_infer.mnn")?;
    // let mut rec = Rec::from_file(
    //     "./models/ch_PP-OCRv4_rec_infer.mnn", 
    //     "./models/ppocr_keys_v4.txt"
    // )?;
    
    // Customize detection parameters (optional)
    let det = det
        .with_rect_border_size(12)  // Recommended for PP-OCRv5
        .with_merge_boxes(false)    // Recommended for PP-OCRv5
        .with_merge_threshold(1);   // Recommended for PP-OCRv5
    
    // Customize recognition parameters (optional)
    let rec = rec
        .with_min_score(0.6)
        .with_punct_min_score(0.1);
    
    // Open an image
    let img = open("path/to/image.jpg")?;
    
    // Detect text regions
    let text_images = det.find_text_img(&img)?;
    
    // Recognize text in each detected region
    for text_img in text_images {
        let text = rec.predict_str(&text_img)?;
        println!("Recognized text: {}", text);
    }
    
    Ok(())
}

Command-line Tool

This library provides a built-in command-line tool for direct OCR recognition:

# Basic usage
./ocr -p path/to/image.jpg

# Output in JSON format (with detailed information and positions)
./ocr -p path/to/image.jpg -m json

# Show verbose log information
./ocr -p path/to/image.jpg -v

# Show current model version information
./ocr --version-info

Building Different Versions

# Build with PP-OCRv4 models (default)
cargo build --release

# Build with PP-OCRv5 models (recommended)
cargo build --release --features v5

Command-line Options

Options:
  -p, --path <IMAGE_PATH>  Path to the image for recognition
  -m, --mode <MODE>        Output mode: json(detailed) or text(simple) [default: text]
  -v, --verbose            Show verbose log information
      --version-info       Show model version information
  -h, --help               Print help information
  -V, --version            Print version information
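
The CLI can also be driven from another Rust program through `std::process::Command`. The sketch below builds the argument list from the flags documented above (`-p`, `-m json`, `-v`); the `ocr_command` helper and the `./ocr` binary location are assumptions for illustration, and the structure of the tool's output is not assumed here.

```rust
use std::process::Command;

/// Build an invocation of the `ocr` binary using only the flags listed above.
/// `binary` is wherever the built tool lives (assumed here to be "./ocr").
fn ocr_command(binary: &str, image_path: &str, json: bool, verbose: bool) -> Command {
    let mut cmd = Command::new(binary);
    cmd.arg("-p").arg(image_path);
    if json {
        // Request the detailed JSON output mode instead of the default text mode.
        cmd.args(["-m", "json"]);
    }
    if verbose {
        cmd.arg("-v");
    }
    cmd
}

fn main() {
    let mut cmd = ocr_command("./ocr", "path/to/image.jpg", true, false);
    // Running the command needs the built binary; report a failure instead of panicking.
    match cmd.output() {
        Ok(out) => print!("{}", String::from_utf8_lossy(&out.stdout)),
        Err(err) => eprintln!("could not run ocr: {}", err),
    }
}
```

Keeping argument construction in its own function makes it possible to check the flags without executing the binary.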

Model Files

You can obtain pre-trained MNN models from the following sources:

  1. Official Models: Download from PaddleOCR official repository and convert to MNN format
  2. Project Provided: The models/ directory in this project contains pre-converted model files

PP-OCRv5 vs PP-OCRv4 Performance Comparison

| Feature | PP-OCRv4 | PP-OCRv5 |
|---|---|---|
| Script Support | Chinese, English | Simplified Chinese, Traditional Chinese, English, Japanese, Chinese Pinyin |
| Handwriting Recognition | Basic support | Significantly enhanced |
| Vertical Text | Basic support | Optimized improvement |
| Rare Character Recognition | Limited support | Enhanced recognition |
| End-to-End Accuracy | Baseline | 13% improvement |
| Recommended Scenarios | Regular documents | Complex and diverse scenarios |

API Reference

Text Detection (Det)

// Create a new detector
let mut det = Det::from_file("path/to/det_model.mnn")?;

// Find text regions and return rectangles
let rects = det.find_text_rect(&img)?;

// Find text regions and return cropped images
let text_images = det.find_text_img(&img)?;

// Customization options
let det = det
    .with_rect_border_size(12)
    .with_merge_boxes(false)
    .with_merge_threshold(1);
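
To give an intuition for `with_rect_border_size`, the sketch below pads a detected rectangle by a fixed number of pixels and clamps it to the image bounds. This is an assumption about what a border size typically means in detection post-processing, not a description of the library's internals; the `Rect` struct and `expand_rect` function are hypothetical.

```rust
/// Axis-aligned rectangle in pixel coordinates (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Rect {
    x: u32,
    y: u32,
    width: u32,
    height: u32,
}

/// Grow `r` by `border` pixels on every side, clamped to an image of
/// `img_w` x `img_h` pixels so the result never leaves the image.
fn expand_rect(r: Rect, border: u32, img_w: u32, img_h: u32) -> Rect {
    // saturating_sub keeps the top-left corner at 0 instead of underflowing.
    let x0 = r.x.saturating_sub(border);
    let y0 = r.y.saturating_sub(border);
    // Clamp the bottom-right corner to the image size.
    let x1 = r.x.saturating_add(r.width).saturating_add(border).min(img_w);
    let y1 = r.y.saturating_add(r.height).saturating_add(border).min(img_h);
    Rect {
        x: x0,
        y: y0,
        width: x1.saturating_sub(x0),
        height: y1.saturating_sub(y0),
    }
}

fn main() {
    let detected = Rect { x: 5, y: 100, width: 50, height: 20 };
    let padded = expand_rect(detected, 12, 60, 200);
    println!("{:?} -> {:?}", detected, padded);
}
```

A little padding around each region tends to help recognition when the detector crops tightly around glyphs, which may be why a nonzero border is suggested for PP-OCRv5.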

Text Recognition (Rec)

// Create a new recognizer
let mut rec = Rec::from_file("path/to/rec_model.mnn", "path/to/keys.txt")?;

// Recognize text and return a string
let text = rec.predict_str(&text_img)?;

// Recognize text and return characters with confidence scores
let char_scores = rec.predict_char_score(&text_img)?;

// Customization options
let rec = rec
    .with_min_score(0.6)
    .with_punct_min_score(0.1);
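
The two thresholds can be read as a per-character confidence filter with a more lenient cutoff for punctuation. The sketch below illustrates that reading on a list of (character, score) pairs; it is an assumption about the intent of `with_min_score` and `with_punct_min_score`, and `is_punct` and `filter_by_score` are hypothetical helpers, not library functions.

```rust
/// Rough punctuation check: ASCII punctuation plus a few common CJK marks.
/// A simplification for this sketch; a real implementation may use a wider set.
fn is_punct(c: char) -> bool {
    c.is_ascii_punctuation() || matches!(c, '，' | '。' | '、' | '；' | '：' | '？' | '！')
}

/// Keep characters whose confidence reaches the applicable threshold:
/// `punct_min_score` for punctuation, `min_score` for everything else.
fn filter_by_score(chars: &[(char, f32)], min_score: f32, punct_min_score: f32) -> String {
    chars
        .iter()
        .filter(|(c, score)| {
            let threshold = if is_punct(*c) { punct_min_score } else { min_score };
            *score >= threshold
        })
        .map(|(c, _)| *c)
        .collect()
}

fn main() {
    // Hypothetical recognizer output: each character with its confidence.
    let scored = [('H', 0.95), ('i', 0.40), (',', 0.30), ('!', 0.05)];
    let text = filter_by_score(&scored, 0.6, 0.1);
    println!("{}", text);
}
```

Punctuation marks are small and often receive low confidence even when correct, so a separate, lower threshold keeps them without admitting low-confidence letters.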

Performance Optimization

This library includes several optimizations:

  • Efficient tensor management
  • Smart box merging for text detection
  • Adaptive image preprocessing
  • Configurable confidence thresholds
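
As an illustration of what box merging in text detection can look like, the sketch below joins boxes that overlap vertically and whose horizontal gap is at most a threshold. It is a simplified, single-pass version written under that assumption; the `TextBox` type and `merge_boxes` function are hypothetical and do not reproduce the library's actual merging logic.

```rust
/// Text box given by its edges in pixel coordinates (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TextBox {
    left: i32,
    top: i32,
    right: i32,
    bottom: i32,
}

impl TextBox {
    /// True when the two boxes share some vertical extent (same text line).
    fn overlaps_vertically(&self, other: &TextBox) -> bool {
        self.top < other.bottom && other.top < self.bottom
    }

    /// Horizontal distance between the boxes; 0 when they touch or overlap.
    fn horizontal_gap(&self, other: &TextBox) -> i32 {
        (other.left - self.right).max(self.left - other.right).max(0)
    }

    /// Smallest box containing both inputs.
    fn union(&self, other: &TextBox) -> TextBox {
        TextBox {
            left: self.left.min(other.left),
            top: self.top.min(other.top),
            right: self.right.max(other.right),
            bottom: self.bottom.max(other.bottom),
        }
    }
}

/// Single greedy pass, left to right: each box joins the first already-merged
/// box on the same line within `threshold` pixels, otherwise starts a new one.
fn merge_boxes(mut boxes: Vec<TextBox>, threshold: i32) -> Vec<TextBox> {
    boxes.sort_by_key(|b| b.left);
    let mut merged: Vec<TextBox> = Vec::new();
    for b in boxes {
        let hit = merged
            .iter()
            .position(|m| m.overlaps_vertically(&b) && m.horizontal_gap(&b) <= threshold);
        match hit {
            Some(i) => merged[i] = merged[i].union(&b),
            None => merged.push(b),
        }
    }
    merged
}

fn main() {
    let boxes = vec![
        TextBox { left: 0, top: 0, right: 10, bottom: 10 },
        TextBox { left: 11, top: 0, right: 20, bottom: 10 },
        TextBox { left: 40, top: 0, right: 50, bottom: 10 },
    ];
    for b in merge_boxes(boxes, 1) {
        println!("{:?}", b);
    }
}
```

With a threshold of 1, the first two boxes (1 pixel apart) are joined and the third stays separate. Merging reduces the number of recognition calls but can glue unrelated fragments together.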

Example Results

Here are some example results showing the library in action:

Example 1

[Image: Original Image 1] [Image: OCR Result 1]

Example 2

[Image: Original Image 2] [Image: OCR Result 2]

Example 3

[Image: Original Image 3] [Image: OCR Result 3]

License

This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.

Acknowledgements

  • PaddleOCR - For the original OCR models and research
  • MNN - For the efficient neural network inference framework
  • mnn-rs - For providing Rust bindings to MNN