Optical Character Recognition (OCR) in Python

Overview

This repository contains Python code for Optical Character Recognition (OCR). It extracts text content from images and other graphical representations.

Features

- Image preprocessing to enhance OCR accuracy.
- Text extraction from various image formats (e.g., PNG, JPEG, TIFF).
- Support for multiple languages.
- Detailed error handling and logging for improved stability.
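
As a rough illustration of the extraction, language-support, and logging features listed above, the sketch below calls the Tesseract command-line tool through `subprocess`. It is not this repository's actual implementation: the names `build_tesseract_command` and `extract_text`, the logger name, and the default language are hypothetical choices made for illustration, and the code assumes a `tesseract` executable is available on the PATH.

```python
import logging
import subprocess

logger = logging.getLogger("ocr_sketch")  # hypothetical logger name


def build_tesseract_command(image_path, languages=("eng",)):
    """Build the argument list for one Tesseract CLI call.

    Passing `stdout` as the output base makes Tesseract print the
    recognized text instead of writing a file. Several languages are
    joined with '+', e.g. "eng+deu".
    """
    return ["tesseract", str(image_path), "stdout", "-l", "+".join(languages)]


def extract_text(image_path, languages=("eng",)):
    """Return the recognized text, or None if Tesseract is missing or fails."""
    command = build_tesseract_command(image_path, languages)
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode output as text (works on Python 3.6)
            check=True,               # raise CalledProcessError on a non-zero exit
        )
    except FileNotFoundError:
        logger.error("tesseract executable not found on PATH")
        return None
    except subprocess.CalledProcessError as exc:
        logger.error("tesseract failed on %s: %s", image_path, exc.stderr.strip())
        return None
    return result.stdout.strip()
```

A call such as `extract_text("scan.png", languages=("eng", "deu"))` would only work if the trained data for both languages is installed alongside Tesseract; the file name here is a placeholder.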

Table of Contents

- Installation
- Usage
- Configuration
- Contributing
- License

Installation

Tesseract Installation

Mac

Install Tesseract using Homebrew:
```
brew install tesseract
```

Windows

Download the Tesseract installer for Windows from the official GitHub repository.
Run the installer and follow the installation instructions.
Add Tesseract to the system PATH:
- Open the Start menu and search for "Environment Variables".
- Click on "Edit the system environment variables".
- Click the "Environment Variables" button.
- Under "System variables", scroll down and select "Path", then click "Edit".
- Click "New" and add the Tesseract installation path (e.g., C:\Program Files\Tesseract-OCR).
Verify the installation by running `tesseract --version` in Command Prompt.
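
The installation check can also be scripted from Python, on either platform. The following is a minimal sketch using only the standard library; the helper name `tesseract_version` is hypothetical and not part of this repository.

```python
import shutil
import subprocess


def tesseract_version(executable="tesseract"):
    """Return the first line printed by `<executable> --version`.

    Returns None when the executable cannot be found on PATH.
    """
    path = shutil.which(executable)
    if path is None:
        return None  # not installed, or its folder is missing from PATH
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture the banner whichever stream it is printed on
        universal_newlines=True,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None
```

If `tesseract_version()` returns `None`, the most likely cause is that the installation folder has not been added to PATH, or that the terminal was opened before the PATH change was made.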

Prerequisites

Python 3.6+
Install dependencies using pip:
```
pip install -r requirements.txt
```
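
To confirm the prerequisites before running any OCR code, a small check along these lines may help. It is a sketch under assumptions: the helper names are hypothetical, and because the contents of `requirements.txt` are not shown here, the module names passed to `missing_modules` are up to the reader.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in the prerequisites


def python_ok(minimum=MIN_PYTHON):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For example, `missing_modules(["pytesseract", "PIL"])` would report which of those two modules are absent, assuming (without confirmation from this README) that the project depends on them.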

Contributing

If you would like to contribute to this project, please follow these steps:

- Fork the repository.
- Create a new branch for your feature or improvement.
- Make your changes and submit a pull request.

License

This project is licensed under the MIT License.

pushkarsaxena96/OCR-in-Python