RapidOCR

An awesome OCR toolkit for multiple programming languages, based on ONNXRuntime, OpenVINO and PaddlePaddle.

Primary language: Python. License: Apache-2.0.

 
Open source OCR for the security of the digital world
 



Introduction

  • To our knowledge, the fastest and most widely supported multi-platform, multi-language OCR toolkit. It is completely free and open source, and supports rapid offline deployment.
  • Supported languages: Chinese and English by default. Recognizing other languages requires converting the corresponding models yourself. For details, see the reference here.
  • Motivation: PaddleOCR is not well engineered. To make OCR inference easier on a wide range of devices, we converted the PaddleOCR models to ONNX format and ported them to various platforms using Python/C++/Java/C#.
  • Origin of the name: light, fast, economical and smart. Built on deep learning, this OCR technology emphasizes the strengths of AI and small models, with speed as its mission and recognition quality as its priority.
  • Usage:
    • If the existing models in this repo meet your requirements → deploy them with RapidOCR.
    • If they do not → fine-tune PaddleOCR on your own data → deploy the result with RapidOCR.
  • If this repo is helpful to you, please give it a star ⭐.
Demo

Installation

pip install rapidocr_onnxruntime

Usage

rapidocr_onnxruntime -img 1.jpg
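
The same package can also be called from Python. The following is a minimal sketch, not an authoritative reference: it assumes that `rapidocr_onnxruntime` exposes a `RapidOCR` class whose instances are callable on an image path and return a `(result, elapse)` pair, and that each entry of `result` has the form `[box, text, score]`. These details may differ between versions, so check the documentation for the version you installed. The helper names `extract_text`, `run_ocr` and `main` are hypothetical and exist only for this illustration.

```python
from typing import Any, List, Optional, Sequence


def extract_text(result: Optional[Sequence[Sequence[Any]]],
                 min_score: float = 0.5) -> List[str]:
    """Keep the recognized strings whose confidence is at least min_score.

    Assumption: each entry looks like [box, text, score]. The score is
    passed through float() in case a version returns it as a string.
    """
    if not result:  # covers None (assumed when no text is found) and empty lists
        return []
    lines = []
    for _box, text, score in result:
        if float(score) >= min_score:
            lines.append(text)
    return lines


def run_ocr(img_path: str) -> Optional[Sequence[Sequence[Any]]]:
    """Run OCR on one image (assumed API, see the note above)."""
    # Imported lazily so that extract_text can be used without the package.
    from rapidocr_onnxruntime import RapidOCR

    engine = RapidOCR()
    result, _elapse = engine(img_path)
    return result


def main(img_path: str = "1.jpg") -> None:
    """Print every line of text recognized in img_path."""
    for line in extract_text(run_ocr(img_path)):
        print(line)
```

Calling `main()` in a directory that contains `1.jpg` prints the recognized lines, mirroring the command above. Filtering by score is an optional post-processing step added here for illustration; it is not something the command-line tool is claimed to do.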

Related projects overview

All projects in the table below are hosted on GitHub, except for the Evaluation Collection, which is hosted on the Hugging Face community. The details are as follows:

The first row describes what each project does.

The second row gives the corresponding repository name. You can search for the name directly on GitHub.

Documentation

The full documentation is available on the docs site (in Chinese).

Acknowledgements

  • Many thanks to DeliciaLaniD for fixing the misplaced start position of the scan animation in ocrweb.
  • Many thanks to zhsunlight for suggesting parameterized calls for GPU inference, and for the careful and thoughtful testing.
  • Many thanks to lzh111222334 for fixing several bugs in the recognition (rec) preprocessing of the Python version.
  • Many thanks to AutumnSun1996 for the suggestion in #42.
  • Many thanks to DeadWood8 for providing the guide to packaging rapidocr_web as an exe with Nuitka.
  • Many thanks to Loovelj for fixing the bug in sorting the text boxes. For details, see issue 75.

Code Contributors

Contributing

  • Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
  • Please make sure to update tests as appropriate.

Important

If you want to sponsor the project, click the Buy me a coffee image and include a note (e.g. your GitHub account name) so that you can be added to the sponsorship list.


Citation

If you find this project useful in your research, please consider citing it:

@misc{RapidOCR2021,
    title={{Rapid OCR}: OCR Toolbox},
    author={RapidAI Team},
    howpublished={\url{https://github.com/RapidAI/RapidOCR}},
    year={2021}
}

Stargazers over time


License

The copyright of the OCR model is held by Baidu, while the copyrights of all other engineering scripts are retained by the repository's owner.

This project is released under the Apache 2.0 license.