This is a web application that extracts text from images.
This product helps you get the text out of an image. All you have to do is upload your image, select the models, click the submit button, and see the magic! It also supports converting PDF files to images, which is useful if you have PDF file(s).
In fact, this web application builds on two open-source projects: PaddleOCR and mmocr.
In recent years, researchers have developed numerous OCR models. However, each of these models has its own strengths and weaknesses. As a result, different kinds of images (scene images and document images, for example) need different models to achieve the best result. The Image OCR Project was created based on that idea.
The Image OCR Project works as follows: when a user submits an image containing text, the application sends it to the selected OCR models (a text detection model and a text recognition model), which process it using PaddleOCR and/or mmocr. These engines analyze the image and extract its text content. Finally, the extracted text is sent back to the user in a textarea, where it can be edited and copied.
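The two-stage flow described above (detect text regions, then recognize the text in each region) can be sketched roughly as follows. This is only an illustration, not the code in app.py: `run_ocr`, `to_textarea`, `OcrLine`, and the two stand-in models are hypothetical names, and the top-to-bottom ordering of the lines is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A box is (left, top, right, bottom) in pixel coordinates (assumed layout).
Box = Tuple[int, int, int, int]


@dataclass
class OcrLine:
    box: Box
    text: str


def run_ocr(
    image: object,
    detect: Callable[[object], List[Box]],
    recognize: Callable[[object, Box], str],
) -> List[OcrLine]:
    """Run the detection model first, then recognition on every detected box."""
    lines = [OcrLine(box, recognize(image, box)) for box in detect(image)]
    # Assumption: order the lines top-to-bottom, then left-to-right.
    lines.sort(key=lambda line: (line.box[1], line.box[0]))
    return lines


def to_textarea(lines: List[OcrLine]) -> str:
    """Join the recognized lines into the text shown to the user."""
    return "\n".join(line.text for line in lines)


# Stand-in models so the sketch runs without PaddleOCR or mmocr installed.
def fake_detect(image: object) -> List[Box]:
    return [(0, 40, 100, 60), (0, 10, 100, 30)]


def fake_recognize(image: object, box: Box) -> str:
    return {10: "Hello", 40: "world"}[box[1]]


if __name__ == "__main__":
    print(to_textarea(run_ocr(None, fake_detect, fake_recognize)))
```

Because the detector and the recognizer are passed in as separate callables, any detection model can be paired with any recognition model, which is the idea behind letting the user pick both.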
Language support: English.
PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them into practice.
MMOCR is an open-source toolbox based on PyTorch and mmdetection for text detection, text recognition, and the corresponding downstream tasks including key information extraction.
- Detect text with the following models: PaddleOCR, MaskCRNN, DRBG, FCENet, PANet_CTW, dbnetpp. You can add more models by modifying the models_list.py file.
- Recognize text with the following models: PaddleOCR, ABINet_Vision, ASTER, CRNN, MASTER, svtr-small. You can add more models by modifying the models_list.py file.
- Convert PDF files to images. The images are grouped into a .zip file that the user can download.
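For the PDF feature, the "group the images into a .zip file" step could look something like the sketch below. It assumes the pages have already been rendered to image bytes by some PDF library (that part is not shown), and `zip_pages` with its file-naming scheme is a hypothetical helper, not the function used in this repo.

```python
import io
import zipfile
from typing import Iterable


def zip_pages(pages: Iterable[bytes], stem: str = "page", ext: str = "png") -> bytes:
    """Pack already-rendered page images into one in-memory .zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for number, data in enumerate(pages, start=1):
            # Zero-padded names keep the pages in order in most file browsers.
            archive.writestr(f"{stem}_{number:03d}.{ext}", data)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder bytes stand in for real rendered pages.
    archive_bytes = zip_pages([b"first page", b"second page"])
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        print(archive.namelist())
```

Building the archive in memory means the web app can send the bytes straight back as a download without writing a temporary file, though writing to disk would work just as well.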
There are also some other fun features, such as making contact or purchasing the VIP plan, but they are still in development. I will update them in the future.
The steps below will help you install and run this web application. Besides pip and Python, you should have Anaconda installed to keep environments separate and avoid future errors.
Notice: Steps may vary across operating systems and computers. You may run into minor problems along the way, but the general idea remains the same.
Step 1: Create a new conda env and activate it: conda create --name YOUR_ENV_NAME, then conda activate YOUR_ENV_NAME
Step 2: Clone this repo: git clone https://github.com/hoangbros03/ImageOCR.git
Step 3: Move to repo folder: cd ImageOCR
Step 4: Install all packages in the requirements.txt file: pip install -r requirements.txt
Step 5: Make a 'models' folder: mkdir models. It will be used to store the PaddleOCR and mmocr repos.
Step 6: Move to the models folder and clone these 2 repos:
cd models
git clone https://github.com/PaddlePaddle/PaddleOCR.git
git clone https://github.com/open-mmlab/mmocr.git
Step 7: Follow the instructions in each of the 2 repos to install the packages it needs. Please review the documentation of each repo for details.
Step 8: Go back to the main repo folder and run app.py: python app.py
Step 9: Go to localhost:8000. Enjoy <3
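If app.py fails to start, one thing worth checking is whether steps 5 and 6 produced the folder layout described above. Here is a small, optional sketch for that check. `missing_repos` is a hypothetical helper (it is not part of this repo), and it only verifies that the two folders exist, not that their packages are installed.

```python
from pathlib import Path
from typing import List

# Folder names created by the two git clone commands in step 6.
EXPECTED_REPOS = ("PaddleOCR", "mmocr")


def missing_repos(repo_root: Path) -> List[str]:
    """Return the expected repos that are not present under <repo_root>/models."""
    models_dir = repo_root / "models"
    return [name for name in EXPECTED_REPOS if not (models_dir / name).is_dir()]


if __name__ == "__main__":
    # Run this from the main repo folder (the one that contains app.py).
    missing = missing_repos(Path.cwd())
    if missing:
        print("Missing under models/:", ", ".join(missing))
    else:
        print("models/ contains both repos.")
```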
To extract text from an image:
Step 1: Upload the image you want to extract text from. Remember that only image files are accepted.
Step 2: Choose a text detection model (default is PaddleOCR)
Step 3: Choose a text recognition model (default is PaddleOCR)
Step 4: Click submit
To convert a PDF file to images:
Step 1: Upload the PDF file you want to extract images from. Remember that only .pdf files are accepted.
Step 2: Click submit
Step 3: Click download to download the .zip file
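Both upload steps above only accept one kind of file. A check like that could be sketched as below, purely as an illustration: the app may validate uploads differently, `is_accepted` is a hypothetical helper, and the set of image extensions is an assumption (the only extension this README confirms is .pdf).

```python
from pathlib import PurePath
from typing import Set

# Assumption: a typical set of image extensions; the app's real list may differ.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}
PDF_EXTENSIONS = {".pdf"}


def is_accepted(filename: str, allowed: Set[str]) -> bool:
    """Check the file extension, ignoring upper/lower case."""
    return PurePath(filename).suffix.lower() in allowed


if __name__ == "__main__":
    for name in ("receipt.JPG", "paper.pdf"):
        print(name, "image:", is_accepted(name, IMAGE_EXTENSIONS),
              "pdf:", is_accepted(name, PDF_EXTENSIONS))
```

An extension check is only a first filter, since a renamed file would still pass it.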
These resources helped me a lot in completing this project: