STR_Vietnam_Temple

OCR code for Vietnamese temples, similar to Google Lens


Try the demo on our website

What's new

  • 11 November 2023 - Version 0.5

    • New method for determining the correct background and foreground colors of an image.
    • Added support for predicting all images in a folder.
  • 27 September 2023 - Version 0.4

    • Rotate the text according to the rotation angle of the bbox.
    • Change the method for determining background and foreground colors.
    • Measure and report inference time.
    • Add code to run on Google Colab
    • Report
    • Web
    • Output
  • 11 September 2023 - Version 0.3

    • Skip Latin characters.
    • Expand bounding boxes along polygons instead of rectangles.
  • 22 August 2023 - Version 0.2

    • Integrate OCR into the pipeline
    • Recognition of vertical text.
    • Postprocessing for vertical text.
  • 11 August 2023 - Version 0.1

    • Implement the full flow for STR on Chinese temples.
    • Implement the backend and frontend for the demo.
    • Implement postprocessing for the final output.
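
Version 0.4 rotates the rendered text by the rotation angle of the bounding box. The repository's own implementation is not shown here, so the following is only a minimal sketch of one way to derive such an angle. It assumes a four-point box ordered top-left, top-right, bottom-right, bottom-left in image coordinates (y grows downward); the function names are hypothetical.

```python
import math


def bbox_rotation_angle(quad):
    """Return the angle, in degrees, of the top edge of a 4-point box.

    Assumption: `quad` is [(x, y), ...] ordered top-left, top-right,
    bottom-right, bottom-left, with y growing downward.
    """
    (x0, y0), (x1, y1) = quad[0], quad[1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


def rotate_point(point, center, angle_deg):
    """Rotate `point` around `center` by `angle_deg` degrees."""
    theta = math.radians(angle_deg)
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (
        center[0] + dx * math.cos(theta) - dy * math.sin(theta),
        center[1] + dx * math.sin(theta) + dy * math.cos(theta),
    )
```

Because y grows downward in image coordinates, a positive angle here corresponds to a clockwise tilt on screen. Libraries that rotate counter-clockwise for positive angles (Pillow's Image.rotate, for example) would need the negated value.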

What's coming next

  • Write log
  • Add debug code
  • Remove text
  • Classify style -> define the Vietnamese font
  • Map colors.
  • What does the real image look like in Vietnamese? -> Increase the text height
  • Collect Vietnamese temple data.
  • Train more accurate text detection and recognition models
  • Collect and label datasets.
  • Host on a server with a GPU.
  • Add XAI support
  • Add illustration video like 3Blue1Brown
  • Write thesis
  • Write paper
  • Multimodel for STR

Todo

  • Debug the background-removal code
  • The text size increases when rotating
  • The box boundaries have a different color (cause unknown)

GUI

  • Format log file
  • Delete unnecessary information

Backend

  1. Host on server

Input and output

Input

(input image)

Output

(output image)

Installation

Install using pip

pip install -r requirements.txt

To use the PP-OCR method

  1. Download the PyMuPDF wheel from Google Drive
  2. Install PyMuPDF
pip install PyMuPDF-1.20.2-cp311-cp311-win_amd64.whl
  3. Install PaddleOCR
pip install paddleocr
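
Once PaddleOCR is installed, a quick check that it works might look like the sketch below. It assumes the PaddleOCR 2.x Python API (PaddleOCR(...).ocr(path, cls=True)) and its 2.x result layout, one list per image whose items are [box, (text, score)]. The lang="ch" default and both function names are assumptions for illustration, not taken from this repository.

```python
def flatten_ocr_result(result):
    """Turn PaddleOCR 2.x style output into a list of (text, score, box).

    Assumed layout: one list per image, each item is [box, (text, score)].
    An image with no detected text may appear as None, which is skipped.
    """
    lines = []
    for page in result or []:
        for box, (text, score) in page or []:
            lines.append((text, float(score), box))
    return lines


def run_ocr(image_path, lang="ch"):
    """Run detection and recognition on one image (PaddleOCR 2.x API assumed)."""
    # Imported lazily so flatten_ocr_result works without PaddleOCR installed.
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(use_angle_cls=True, lang=lang)
    return flatten_ocr_result(ocr.ocr(image_path, cls=True))
```

Calling run_ocr("some_image.jpg") would then return the recognized lines with their confidence scores and boxes; the file name is a placeholder.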

Install in Azure server

  1. Install paddlepaddle
  • Without GPU
python -m pip install paddlepaddle -i https://pypi.tuna.tsinghua.edu.cn/simple

  2. Install paddleocr
pip install "paddleocr>=2.0.1" --upgrade PyMuPDF==1.21.1
Thanks: https://stackoverflow.com/questions/76379293/how-can-i-fix-the-error-in-pymupdf-when-installing-paddleocr-with-pip

  3. Install scikit-learn
pip install scikit-learn

  4. Install libgomp1
apt-get install libgomp1
Thanks: https://stackoverflow.com/questions/43764624/importerror-libgomp-so-1-cannot-open-shared-object-file-no-such-file-or-direc

  5. Install the ffmpeg, libsm6 and libxext6 libraries
apt-get update && apt-get install ffmpeg libsm6 libxext6 -y
Thanks: https://stackoverflow.com/questions/55313610/importerror-libgl-so-1-cannot-open-shared-object-file-no-such-file-or-directo
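
The system packages above address import-time errors such as a missing libgomp.so.1 or libGL.so.1, so it can help to confirm that the Python packages actually import on the server. The helper below is a hypothetical sketch; the module list is an assumption (cv2 is included because the libGL error typically comes from OpenCV).

```python
import importlib

# Import names (not pip names) to check; this list is an assumption.
REQUIRED_MODULES = ["paddle", "paddleocr", "sklearn", "cv2"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that fail to import in this environment."""
    failed = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Also covers missing shared libraries reported as ImportError.
            failed.append(name)
    return failed
```

An empty list from missing_modules() suggests the installation steps succeeded; any name it returns points at the step to revisit.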

Run code

Set the input and output paths in run.sh, then run this command:

sh run.sh
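
Since version 0.5 can predict every image in a folder, the input path may point to a directory. How the repository walks that directory is not shown here; the sketch below is one plausible approach, with the extension set and both function names being assumptions.

```python
from pathlib import Path

# Extensions treated as images; an assumption, adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(input_path):
    """Return image files for a single file or a folder, sorted by path."""
    path = Path(input_path)
    if path.is_file():
        return [path] if path.suffix.lower() in IMAGE_SUFFIXES else []
    return sorted(
        p for p in path.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def output_path_for(image_path, output_dir):
    """Map an input image to a file of the same name inside `output_dir`."""
    return Path(output_dir) / Path(image_path).name
```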

Run demo

Run backend

flask run

Run frontend

Double-click frontend.html to open the frontend.

Log

Logs are written to the log folder. Each log file contains the messages for one day.
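
A per-day log file can be produced with Python's standard logging module. The sketch below is not the repository's logging code; the file name app.log, the logger name, and the 30-file retention are assumptions chosen for illustration.

```python
import logging
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path


def build_logger(log_dir="log", name="str_temple"):
    """Create a logger that writes into `log_dir` and rolls over at midnight."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = TimedRotatingFileHandler(
            Path(log_dir) / "app.log",
            when="midnight",   # start a new file each day
            backupCount=30,    # keep roughly one month of files (assumption)
            encoding="utf-8",
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

With this setup, build_logger().info("inference finished") appends to log/app.log, and at rollover the handler renames the previous day's file with a date suffix.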

Fullflow diagram

fullflow diagram

Postprocessing

Postprocess diagram

Preprocessing

OCR

Text Detection

Text Recognition

Dataset

Synthesis

Reality

Label

Language Model

Translate from accent Vietnamese to modern Vietnamese

Documentation

References