Installation

  1. Install paddlepaddle
  • Without GPU
python -m pip install paddlepaddle -i https://pypi.tuna.tsinghua.edu.cn/simple
  • With GPU
python -m pip install paddlepaddle-gpu -i https://pypi.tuna.tsinghua.edu.cn/simple

Source: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_en/quickstart_en.md

  2. Install paddleocr (PyMuPDF is pinned to 1.21.1 to avoid an installation error)
pip install "paddleocr>=2.0.1" --upgrade PyMuPDF==1.21.1

Thanks: https://stackoverflow.com/questions/76379293/how-can-i-fix-the-error-in-pymupdf-when-installing-paddleocr-with-pip

  3. Install scikit-learn
pip install scikit-learn
  4. Install libgomp1
apt-get install libgomp1

Thanks: https://stackoverflow.com/questions/43764624/importerror-libgomp-so-1-cannot-open-shared-object-file-no-such-file-or-direc

  5. Install the ffmpeg, libsm6, and libxext6 libraries
apt-get update && apt-get install ffmpeg libsm6 libxext6 -y

Thanks: https://stackoverflow.com/questions/55313610/importerror-libgl-so-1-cannot-open-shared-object-file-no-such-file-or-directo
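
After installation, a quick import check can confirm that the packages and their system libraries load. The following is a minimal sketch, not part of the original setup: the function name check_imports and the module list are illustrative, and the import names (paddle, paddleocr, fitz, sklearn, cv2) are assumptions about what each pip package exposes.

```python
import importlib

# Import names for the packages installed above. These are assumptions:
# paddlepaddle -> paddle, PyMuPDF -> fitz, scikit-learn -> sklearn,
# OpenCV (pulled in by paddleocr) -> cv2.
REQUIRED_MODULES = ["paddle", "paddleocr", "fitz", "sklearn", "cv2"]


def check_imports(module_names):
    """Try to import each module; map its name to None or an error message."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except (ImportError, OSError) as exc:
            # Missing shared libraries such as libgomp.so.1 or libGL.so.1
            # typically surface here as an ImportError.
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, error in check_imports(REQUIRED_MODULES).items():
        status = "OK" if error is None else "FAILED: " + error
        print(f"{name}: {status}")
```

If a module reports a missing libgomp.so.1 or libGL.so.1, the apt-get steps above are the likely fix.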

Experiment

Set up the tool (editable install)

pip install -e .

Blend text with random background

source D:/Master/OCR_Nom/deploy/azure/str_vietnam_temple/.venv/Scripts/activate
python my_postprocess/blend_text.py

Blend text with bbox

source D:/Master/OCR_Nom/deploy/azure/str_vietnam_temple/.venv/Scripts/activate
python my_postprocess/blend_text_with_bbox.py

Post-process with fill color

python my_postprocess/_postprocess.py

Post-process with text removal

Model: big-lama

source D:/Master/OCR_Nom/deploy/azure/str_vietnam_temple/.venv/Scripts/activate
python my_postprocess/_postprocess_remove_bg.py
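
The experiment commands activate the virtual environment before each run. As an alternative, a script can be launched with the environment's own interpreter directly. The sketch below is an assumption-laden illustration: venv_python and build_command are hypothetical helpers, and the layout (Scripts/python.exe on Windows, bin/python elsewhere) is the standard venv layout rather than something stated in this project.

```python
import os
from pathlib import Path


def venv_python(venv_dir, windows=None):
    """Return the path to the Python interpreter inside a virtual environment."""
    if windows is None:
        # Default to the platform this code is running on.
        windows = os.name == "nt"
    if windows:
        return Path(venv_dir) / "Scripts" / "python.exe"
    return Path(venv_dir) / "bin" / "python"


def build_command(script, venv_dir, windows=None):
    """Build an argument list that runs `script` with the venv interpreter."""
    return [str(venv_python(venv_dir, windows)), script]


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(build_command("my_postprocess/blend_text.py", ".venv"))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the project root, which avoids the separate activate step.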