A Flask web app that integrates Tesseract OCR to extract text from image files.
- About
- Contributors
- Repository File Structure
- Problem Statement
- Proposed Approach
- Tools Used
- Project Proposal Demo
- HTML Web App Demo
- OCR In Action
- How to run the Application
- Tests
- Deployment
- License
- TODO
├───static
│   ├───css
│   │   └───images
│   ├───font-awesome
│   │   └───fonts
│   ├───fonts
│   │   ├───font-awesome-4.7.0
│   │   │   ├───css
│   │   │   ├───fonts
│   │   │   ├───less
│   │   │   └───scss
│   │   ├───Linearicons-Free-v1.0.0
│   │   │   └───WebFont
│   │   ├───montserrat
│   │   └───poppins
│   ├───forms
│   ├───img
│   ├───js
│   ├───test-img
│   ├───vendor
│   │   ├───aos
│   │   ├───bootstrap
│   │   │   ├───css
│   │   │   └───js
│   │   ├───bootstrap-icons
│   │   │   └───fonts
│   │   ├───boxicons
│   │   │   ├───css
│   │   │   └───fonts
│   │   ├───glightbox
│   │   │   ├───css
│   │   │   └───js
│   │   ├───isotope-layout
│   │   ├───php-email-form
│   │   ├───purecounter
│   │   └───swiper
│   └───vendorlog
│       ├───animate
│       ├───animsition
│       │   ├───css
│       │   └───js
│       ├───bootstrap
│       │   ├───css
│       │   └───js
│       ├───countdowntime
│       ├───css-hamburgers
│       ├───daterangepicker
│       ├───jquery
│       ├───perfect-scrollbar
│       └───select2
├───templates
├───venv
│   ├───Include
│   ├───Lib
│   │   └───site-packages
│   │       └─── #Files and Folders
│   └───Scripts
└───__pycache__
We are surrounded by data locked in images, PDFs, and scanned documents. Manually transcribing text from these sources is tedious, slow, and error-prone, so the text needs to be converted into a digital form that can be processed and analyzed. Optical Character Recognition (OCR) addresses this problem. The goal of this project is to create a Flask web app that integrates Tesseract OCR to extract text from image files accurately.
The proposed approach is a Flask web app that accepts image files, extracts text with Tesseract OCR, and displays the result in a readable format. The user uploads an image file, the app processes it with Tesseract OCR, and the extracted text is shown on the page with an option to copy it. To check accuracy, we will test the web app with various image file formats and compare the extracted text with the original text.
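The upload, OCR, and display flow described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `ALLOWED_EXTENSIONS`, `allowed_file`, `extract_text`, `create_app`, and `PAGE`, the form field name `image`, and the list of accepted extensions are all assumptions made for the example. It relies on Flask, Pillow, and pytesseract being installed, plus the Tesseract binary itself.

```python
import io
from pathlib import Path

# Assumed set of accepted upload types; the project does not specify one.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff"}

# Hypothetical inline template: an upload form plus the extracted text, if any.
PAGE = """
<form method="post" enctype="multipart/form-data">
  <input type="file" name="image" accept="image/*">
  <button type="submit">Extract text</button>
</form>
{% if text is not none %}<pre>{{ text }}</pre>{% endif %}
"""


def allowed_file(filename: str) -> bool:
    """Return True if the file name has an accepted image extension."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def extract_text(image_bytes: bytes) -> str:
    """Run Tesseract on raw image bytes and return the recognized text."""
    # Imported here so the module still loads when the OCR dependencies
    # (Pillow, pytesseract, the tesseract binary) are not installed.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(io.BytesIO(image_bytes)))


def create_app(ocr=extract_text):
    """Build the Flask app; `ocr` is injectable so tests can stub it out."""
    from flask import Flask, abort, render_template_string, request

    app = Flask(__name__)

    @app.route("/", methods=["GET", "POST"])
    def index():
        text = None
        if request.method == "POST":
            upload = request.files.get("image")
            if upload is None or not allowed_file(upload.filename or ""):
                abort(400, "Please upload a supported image file.")
            text = ocr(upload.read())
        return render_template_string(PAGE, text=text)

    return app
```

If this sketch were saved as `app.py`, `python -m flask run` should pick up the `create_app` factory automatically. Passing the OCR function as a parameter is a design choice for the example: it lets a test replace Tesseract with a stub.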
- Python
- Flask
- Flask-Scss
- HTML
- CSS
- JavaScript
- Tesseract-OCR
- Pytesseract
- Rails
- Docker
Running on Local Machine
To run the application on your local system, do the following:
- Clone the repository:
git clone https://github.com/JayRalph360/DAO-OCR.git
- Change the directory:
cd DAO-OCR
- Install the requirements:
pip install -r requirements.txt
- Run the application:
python -m flask run
You should be able to view the application by going to http://127.0.0.1:5000/
Running on Local Machine with Docker Compose
You can also run the application in a Docker container using Docker Compose (if you have it installed):
- Clone the repository:
git clone https://github.com/JayRalph360/DAO-OCR.git
- Change the directory:
cd DAO-OCR
- Run the Docker Compose command:
docker compose up -d --build
You should be able to view the application by going to http://localhost:5000/
Running in a Gitpod Cloud Environment
Click the button below to start a new development environment:
Test Flask Web App Functions
To test the Flask web app, do the following:
- Clone the repository:
git clone https://github.com/JayRalph360/DAO-OCR.git
- Change to the src directory inside the cloned repository, then install the requirements and pytest:
cd DAO-OCR/src && pip install -r requirements.txt && pip install pytest
- Move to the tests folder and run the tests:
cd ../tests && pytest
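The proposed approach mentions comparing extracted text with the original text. One way such a check might be written as a pytest test is sketched below, using only the standard library. The helper names `normalize` and `similarity`, the 0.9 threshold, and the sample strings are assumptions for illustration; the repository's real tests may look different.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so layout differences are ignored."""
    return " ".join(text.split()).lower()


def similarity(extracted: str, expected: str) -> float:
    """Return a 0.0 to 1.0 similarity ratio between OCR output and reference text."""
    return SequenceMatcher(None, normalize(extracted), normalize(expected)).ratio()


def test_ocr_output_is_close_to_reference():
    expected = "Hello, World!"
    # Stand-in for real OCR output; a real test would run OCR on a sample image.
    extracted = "Hello,  World!\n"
    # Assumed threshold: OCR is rarely exact, so require "close enough".
    assert similarity(extracted, expected) >= 0.9
```

A ratio-based comparison is used here instead of strict equality because OCR output often differs from the source in spacing or a few characters.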
Deploying the Application to Heroku
Assuming you have Git and the Heroku CLI installed, carry out the following steps:
- Clone the repository:
git clone https://github.com/JayRalph360/DAO-OCR.git
- Change the directory:
cd DAO-OCR
- Log in to Heroku and its Container Registry:
heroku login
heroku container:login
- Create your application:
heroku create your-app-name
Replace your-app-name with a name of your choosing.
- Build the image and push to Container Registry:
heroku container:push web
- Then release the image to your app:
heroku container:release web
GNU General Public License v3.0
- Research
- Development
- Deployment
- Testing
- Article writing
- Presentation