Welcome to the Document Scanner application! This tool extracts structured information from scanned documents such as business cards and letters. It combines two fields:
- Computer Vision
- Natural Language Processing
The application processes each document in the following sequence:
- Load image
- Preprocess with Computer Vision
- Extract text blocks with OCR
- Perform named entity recognition (NER) with NLP
The following Python libraries are used in the Computer Vision module:
- OpenCV
- NumPy
- Pytesseract
And in the Natural Language Processing module:
- spaCy
- pandas
- re (regular expressions)
- string
The project involves six stages:
Stage 1: Setup
- Install Python
- Install Dependencies
- Create a Virtual Environment (venv)
Stage 2: Data Preparation
- Gather Images
- Overview of Pytesseract
- Extract Text from all Images
- Clean and Prepare text
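
The clean-and-prepare step can be sketched with the standard `re` and `string` modules. The function name `clean_text` and the set of characters to keep are assumptions made for illustration; the project's actual cleaning rules may differ.

```python
import re
import string

# Assumption: keep characters that often carry meaning on business cards
# (emails, URLs, phone numbers) and drop all other punctuation.
KEEP = "@.-+/:&"
DROP = "".join(ch for ch in string.punctuation if ch not in KEEP)
_DROP_TABLE = str.maketrans("", "", DROP)
_WHITESPACE = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Remove unwanted punctuation and collapse runs of whitespace."""
    without_punct = raw.translate(_DROP_TABLE)
    return _WHITESPACE.sub(" ", without_punct).strip()


if __name__ == "__main__":
    # Made-up OCR output, used only to show the effect of the cleaning step.
    print(clean_text("  John  Doe |\n(CEO)   john.doe@example.com "))
```

Keeping `@`, `.`, and `-` is a deliberate choice in this sketch: stripping them would break email addresses and phone numbers before the NER step sees them.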
Stage 3: Labeling NER Data
- Manually Label with the BIO Technique
  - B - Beginning of an entity
  - I - Inside an entity
  - O - Outside any entity
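
To make the BIO scheme concrete, the sketch below groups BIO-tagged tokens back into entities. The labels `NAME` and `ORG`, the sample tokens, and the helper `bio_to_entities` are hypothetical and are not taken from the project's label set.

```python
from typing import List, Tuple


def bio_to_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) pairs."""
    entities: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the entity being built, if any, and reset the state.
        nonlocal current_label, current_tokens
        if current_label is not None:
            entities.append((current_label, " ".join(current_tokens)))
        current_label, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an "I-" tag that does not continue the open entity.
            flush()
    flush()
    return entities


if __name__ == "__main__":
    # Hypothetical labeled line from a business card.
    tokens = ["Jane", "Smith", "works", "at", "Acme", "Corp"]
    tags = ["B-NAME", "I-NAME", "O", "O", "B-ORG", "I-ORG"]
    print(bio_to_entities(tokens, tags))
```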
Stage 4: Splitting into Train/Test Sets and Converting to spaCy Format
- Prepare Training Data for spaCy
- Convert data into spaCy format
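
One possible way to carry out this stage is sketched below: BIO-tagged tokens are turned into character-offset annotations of the form `(text, {"entities": [(start, end, label)]})`, and the examples are then shuffled and split. The helper names, the 80/20 split, and the fixed seed are assumptions. In spaCy v3 such tuples are typically converted to `Doc` objects and saved in a `DocBin`; that step is omitted here because it requires spaCy to be installed.

```python
import random
from typing import Dict, List, Sequence, Tuple

Span = Tuple[int, int, str]
Example = Tuple[str, Dict[str, List[Span]]]


def to_offset_format(tokens: List[str], tags: List[str]) -> Example:
    """Join tokens with single spaces and turn BIO tags into character spans."""
    spans: List[Span] = []
    position = 0
    open_span = None  # [start, end, label] of the entity being built

    for token, tag in zip(tokens, tags):
        start, end = position, position + len(token)
        if tag.startswith("B-"):
            if open_span:
                spans.append((open_span[0], open_span[1], open_span[2]))
            open_span = [start, end, tag[2:]]
        elif tag.startswith("I-") and open_span and open_span[2] == tag[2:]:
            open_span[1] = end  # extend the open entity over this token
        else:
            if open_span:
                spans.append((open_span[0], open_span[1], open_span[2]))
            open_span = None
        position = end + 1  # +1 accounts for the joining space

    if open_span:
        spans.append((open_span[0], open_span[1], open_span[2]))
    return " ".join(tokens), {"entities": spans}


def train_test_split(examples: Sequence[Example], test_fraction: float = 0.2,
                     seed: int = 42) -> Tuple[List[Example], List[Example]]:
    """Shuffle deterministically and return (train, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

Offsets are computed against the space-joined text, so the same joined string must be used when the annotations are later attached to spaCy documents.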
Stage 5: Train NER Model
- Configure NER Model
- Train the model
Stage 6: Predict Entities with Image Preprocessing, Text Extraction, and Data Parsing
- Load Model
- Render and Serve with displaCy
- Draw Bounding Box on Image
- Parse Entities from Text
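
As an illustration of the bounding-box step, the sketch below merges the word-level boxes of one entity into a single enclosing box. It assumes each box is a `(left, top, width, height)` tuple, matching the column names that Pytesseract's `image_to_data` reports; the helper `merge_boxes` and the sample coordinates are hypothetical.

```python
from typing import Iterable, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)


def merge_boxes(boxes: Iterable[Box]) -> Box:
    """Return the smallest box that encloses every box in `boxes`."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one box is required")
    x1 = min(left for left, _, _, _ in boxes)
    y1 = min(top for _, top, _, _ in boxes)
    x2 = max(left + width for left, _, width, _ in boxes)
    y2 = max(top + height for _, top, _, height in boxes)
    return x1, y1, x2 - x1, y2 - y1


if __name__ == "__main__":
    # Made-up boxes for the two words of one entity, e.g. "Jane" and "Smith".
    print(merge_boxes([(10, 20, 30, 10), (45, 22, 40, 12)]))
```

With OpenCV, the merged box could then be drawn by passing its top-left and bottom-right corners to `cv2.rectangle`.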
Finally, all of these stages are combined into a single document scanner application.