This Flask application utilizes Google Cloud Document AI to process document images (specifically transcripts) and extract relevant information like name, IPK (GPA), university details, etc. It offers a secure API endpoint for uploading images and receiving the extracted data in JSON format.
- Python 3.x
- Flask framework
- Google Cloud Document AI API (
google-cloud-documentai
) - Service account credentials for Google Cloud Platform (GCP)
dotenv
for managing environment variableswerkzeug
for secure file handling- (Optional) Additional libraries based on image processing needs (e.g.,
Pillow
)
-
Create a virtual environment (recommended) and activate it.
-
Install the required dependencies:
pip install flask google-cloud-documentai Flask-Cors dotenv werkzeug # (Optional) Pillow pip install Pillow
-
Create a file named
.env
in your project directory and add the following environment variables, replacing the placeholders with your actual values:GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/credentials.json PROJECT_ID=your-project-id LOCATION_ID=your-location-id PROCESSOR_ID=your-processor-id MODEL_VERSION=your-model-version API_KEY=your-api-key # (Optional, for API access control)
- Update the environment variables in
.env
with your GCP project details and API key (if used). - You may modify the list of extracted fields ("nim", "nama", "ipk", "univ", etc.) in the
process_document
function to suit your specific needs. - Consider incorporating additional validations or data cleaning steps as required.
-
Start the application:
python app.py
The application runs on
http://0.0.0.0:5000
(localhost) by default in debug mode. -
Make a POST request to the
/
endpoint with an image file in theimage
field of your multipart form data. The API key needs to be included in the request header (if configured).
curl -X POST http://localhost:5000/ \
-H "X-API-KEY: your_api_key" \
-F "image=@transcript.jpg"
{
"error": false,
"message": "Proses OCR Berhasil",
"result": {
"nim": "1234567890",
"nama": "John Doe",
"ipk": "3.8",
"univ": "University of Example",
"fakultas": "Faculty of Science",
"program_studi": "Computer Science",
"pendidikan": "Bachelor",
"pddikti": "https://pddikti.kemdikbud.go.id/api/pencarian/mhs/1234567890",
"time_elapsed": "0.123"
}
}
Implement unit tests for your functions using a framework like pytest. Manually test the API endpoint using tools like Postman or cURL as demonstrated in the “Usage” section.
run docker-compose up
version: "3"
services:
ocr:
build:
context: .
dockerfile: Dockerfile
image: ocr
container_name: ocr
environment:
GOOGLE_APPLICATION_CREDENTIALS: "your_value"
PROJECT_ID: "your_value"
LOCATION_ID: "your_value"
PROCESSOR_ID: "your_value"
MODEL_VERSION: "your_value"
API_KEY: "your_value"
restart: unless-stopped
ports:
- "8000:8000"
networks:
- ocr-network
command: gunicorn app:app -w 4 -t 90 --log-level=debug -b 0.0.0.0:8000 --reload --threads 2 --worker-class gevent --keep-alive 5 --timeout 60 --worker-connections 1000
networks:
ocr-network:
driver: bridge