Choose your preferred language:

🤖 Telegram OCRBot for Bank Receipts

This repository contains the source code for the Telegram OCRBot, a powerful bot that can extract text from bank receipt images shared on Telegram. The bot is designed to process receipts from multiple banks, primarily in the Thai language, and respond to the customers on the Telegram app.

The main goal of this project is to provide a fast, accurate, and cost-effective OCR solution that can handle inconsistencies in image sizes and varying types of receipts.

How to

nbs = notebooks
src = source codes
app = app for Dockerize

👓 Overview

The project has undergone several iterations to achieve the best possible results. The current approach is to use Google Vision for OCR, which supports the Thai language, and Regular Expression for extracting relevant information from the OCR results. This method is cost-effective, fast, and offers reasonable accuracy.

🛣️ Features

Supports multiple image formats (JPEG, PNG, etc.)
Processes receipts from multiple banks (kbank, scb, ktb)
Handles various text orientations and sizes
Provides accurate OCR results using Google Vision
Uses Regular Expression for extracting relevant information
Fast and cost-effective solution

📕 Languages and Tools

Python
OpenCV
Tesseract OCR
Pytesseract
HuggingFace
Google Vision API
Regular Expressions
Docker

🔄 Approaches

Approach 1: OpenCV + Pytesseract

In this approach, OpenCV was used to find the coordinates of the text regions within the image. Once the coordinates were determined, the image was cropped, and Pytesseract was used to perform OCR on the cropped sections. By specifying the language and the areas to crop, the OCR process yielded good accuracy.

Approach 2: Donut by HuggingFace

Donut is a deep learning model developed by HuggingFace. This approach aimed to leverage the power of deep learning to improve OCR accuracy. The Donut model was tested on the SROIE dataset, and it produced excellent results. However, implementing the model required knowledge of PyTorch and HuggingFace, and the model did not support the Thai language.

Approach 3: LayoutLM by HuggingFace

LayoutLM is another deep learning model developed by HuggingFace. This approach aimed to achieve high accuracy by training the model on a large dataset of labeled images. However, the model was difficult to implement and required a considerable amount of time to train. Additionally, cost optimization and deployment presented challenges.

Approach 4: Google Vision + Regular Expression (Current)

The current approach involves using the Google Vision API to perform OCR on the entire receipt image. The API supports the Thai language and provides a good level of accuracy. After OCR, Regular Expression is used to extract relevant information from the OCR results. This approach is simple, fast, and cost-effective.

Approaches Comparison

Approach	Description	Pros	Cons
1	OpenCV + Pytesseract	- Good accuracy when specifying language and area to crop and OCR	- Inconsistencies in image size and receipt types require multiple parameters
2	Donut by HuggingFace	- Excellent results on SROIE dataset	- Requires knowledge of PyTorch and HuggingFace - Does not support Thai language
3	LayoutLM by HuggingFace	- High accuracy with sufficient labeled data	- Complex implementation - Time-consuming to train - Cost optimization and deployment challenges
4 (Current)	Google Vision + Regular Expression	- Fast and cost-effective - Supports Thai language - Simple but effective	- Accuracy may suffer in certain cases (e.g., multi-line text)

Changelog

1.5 (2023-23-22)

Add transaction id extractor
Refractor for more OOP
Add comment and documentation

1.4 (2023-05-05)

Change new credential
Add more security to prevent data leak

1.3 (2023-05-04)

Fix and improve CloudWatch log
Add new function to extract name of the bank
Add support for additional banks (gsb, ktb)
Optimize performance and reduce latency
Update documentations
Refractor code

1.1 (2023-04-21)

Add new functions to extract account number
Add more features based on users feedback
Change chat dialogue to make it more user-friendly
Fixing minor bugs

1.0 (2023-04-20)

Initial release
Support 2 banks (Kbank, scb)

northpr/telegram_ocr_bot