- Uses Azure Document Intelligence for Optical Character Recognition (OCR).
- Extracts the diagnosis text from handwritten medical forms (88% accuracy).
- Saves the extracted data to an Excel file (output_diagnoses).
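The extraction and export steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the "Diagnosis" label pattern, the helper names (`extract_diagnosis`, `build_rows`, `save_rows`), and the `.xlsx` extension on the output file are all assumptions.

```python
import re

# Hypothetical label pattern: assumes the form prints "Diagnosis"
# followed by ":" or "-" and then the handwritten text on the same line.
DIAGNOSIS_LABEL = re.compile(r"^\s*diagnosis\s*[:\-]\s*(.*)$", re.IGNORECASE)


def extract_diagnosis(ocr_lines):
    """Return the text after the first 'Diagnosis' label, or None if absent."""
    for line in ocr_lines:
        match = DIAGNOSIS_LABEL.match(line)
        if match:
            return match.group(1).strip()
    return None


def build_rows(forms):
    """Turn {file name: list of OCR lines} into one spreadsheet row per form."""
    return [
        {"file": name, "diagnosis": extract_diagnosis(lines) or ""}
        for name, lines in forms.items()
    ]


def save_rows(rows, path="output_diagnoses.xlsx"):
    """Write the rows to Excel; needs pandas plus an Excel engine such as openpyxl."""
    import pandas as pd  # imported lazily so the parsing helpers work without it

    pd.DataFrame(rows).to_excel(path, index=False)
```

Keeping the parsing separate from the OCR call and from the Excel export makes it possible to check the diagnosis logic against plain lists of strings, without Azure credentials.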
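Create and activate a new virtual environment (recommended). On Windows, run the following:
python3 -m venv myvenv
myvenv\Scripts\activate
On macOS or Linux, activate the environment with this command instead:
source myvenv/bin/activate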
Install the dependencies:
pip install -r requirements.txt
pip install azure-cognitiveservices-vision-computervision
pip install msrest
pip install pandas
pip install pytesseract
pip install opencv-python
pip install Pillow
pip install azure-core
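After installing, a short check like the one below can confirm that the packages are importable from the active environment. It is only a sketch: the helper name `missing_modules` is hypothetical, and the list holds the import names that correspond to the pip packages above (for example, `cv2` for opencv-python and `PIL` for Pillow).

```python
import importlib.util

# Import names for the pip packages listed above.
REQUIRED_MODULES = [
    "azure.cognitiveservices.vision.computervision",
    "msrest",
    "pandas",
    "pytesseract",
    "cv2",         # installed by opencv-python
    "PIL",         # installed by Pillow
    "azure.core",
]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```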
Run the script:
python main.py <folder_path>
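For orientation, here is one way a script could handle the `<folder_path>` argument. This is an assumption about the general shape and not the contents of the real `main.py`: the `find_forms` helper and the set of accepted file extensions are hypothetical.

```python
import sys
from pathlib import Path

# Assumed set of scan formats; the real script may accept others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".pdf"}


def find_forms(folder):
    """Return the scanned form files directly inside the folder, sorted by path."""
    folder = Path(folder)
    if not folder.is_dir():
        raise NotADirectoryError(f"Not a folder: {folder}")
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def main(argv):
    """Expect exactly one argument after the script name: the folder path."""
    if len(argv) != 2:
        print("Usage: python main.py <folder_path>", file=sys.stderr)
        return 1
    for form in find_forms(argv[1]):
        print(form.name)  # the real script would run OCR on each file here
    return 0
```

A script built this way would typically end with `sys.exit(main(sys.argv))` under an `if __name__ == "__main__":` guard, so that a missing argument produces a non-zero exit code.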