This repository contains sample code for Google Cloud Document AI. The samples are mainly Python scripts meant to be copied and reused, rather than full .ipynb notebooks.
Setup and authentication instructions for the Vertex SDK are available here. Please complete them before trying any of the labs below.
This lab contains a script to make predictions with the Form parser. It uses a public PDF sample located at gs://cloud-samples-data/documentai/form.pdf.
One of the scripts returns a pandas DataFrame with the detected fields and their bounding boxes.
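As a rough illustration of how detected fields and bounding boxes can be flattened into rows, here is a minimal sketch that works on the JSON form of a Document AI response. It is not the script from this lab: the helper names `anchor_text` and `form_fields_to_rows` are hypothetical, and the sample document is hand-written rather than real parser output. It assumes the usual JSON layout, where each form field points into the document text through `textAnchor.textSegments`.

```python
from typing import Any, Dict, List


def anchor_text(full_text: str, layout: Dict[str, Any]) -> str:
    """Return the text a layout points to via its textAnchor segments."""
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    parts = []
    for seg in segments:
        # JSON encodes the int64 offsets as strings; startIndex is omitted when 0.
        start = int(seg.get("startIndex", 0))
        end = int(seg.get("endIndex", 0))
        parts.append(full_text[start:end])
    return "".join(parts).strip()


def form_fields_to_rows(document: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten every detected form field into one dict per field."""
    text = document.get("text", "")
    rows = []
    for page in document.get("pages", []):
        for field in page.get("formFields", []):
            name = field.get("fieldName", {})
            value = field.get("fieldValue", {})
            vertices = value.get("boundingPoly", {}).get("normalizedVertices", [])
            rows.append({
                "page": page.get("pageNumber"),
                "name": anchor_text(text, name),
                "value": anchor_text(text, value),
                "confidence": name.get("confidence"),
                # Normalized vertices are fractions of the page width and height.
                "bbox": [(v.get("x", 0.0), v.get("y", 0.0)) for v in vertices],
            })
    return rows


# Tiny hand-written document, for illustration only (not real parser output).
sample = {
    "text": "Name: Ada\n",
    "pages": [{
        "pageNumber": 1,
        "formFields": [{
            "fieldName": {
                "textAnchor": {"textSegments": [{"endIndex": "5"}]},
                "confidence": 0.9,
            },
            "fieldValue": {
                "textAnchor": {"textSegments": [{"startIndex": "6", "endIndex": "9"}]},
                "boundingPoly": {"normalizedVertices": [
                    {"x": 0.1, "y": 0.1}, {"x": 0.3, "y": 0.1},
                    {"x": 0.3, "y": 0.2}, {"x": 0.1, "y": 0.2},
                ]},
            },
        }],
    }],
}

rows = form_fields_to_rows(sample)
```

If pandas is installed, `pandas.DataFrame(rows)` turns these rows into a DataFrame with one line per field.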
This lab contains some scripts to make predictions with the invoice parser. It uses a public PDF sample located at gs://cloud-samples-data/documentai/invoice.pdf.
The invoice parser, like other specialized processors, supports Human-in-the-Loop (HITL) review. There are two ways to trigger a HITL operation: the REST API or the Python SDK.
- With the REST API, you need to invoke the projects.locations.processors.process method. Note that the document file must be encoded inline in base64.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  https://us-documentai.googleapis.com/v1/projects/655797269815/locations/us/processors/bad52526b46aa2b6:process
- With the Python SDK, you need to create a DocumentProcessorServiceClient and call its process_document method:
from google.cloud import documentai
client = documentai.DocumentProcessorServiceClient()
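The two options above can be sketched in Python as follows. This is an illustrative sketch, not the script shipped in this lab: `build_request_body` and `process_with_sdk` are hypothetical helper names, and the placeholder bytes stand in for a real PDF. The first helper builds the kind of JSON body that could be saved as request.json for the REST call, with the file base64-encoded inline. The second assumes the google-cloud-documentai package is installed and that human review is enabled on the processor.

```python
import base64
import json
from typing import Any, Dict


def build_request_body(pdf_bytes: bytes, skip_human_review: bool = False) -> Dict[str, Any]:
    """Build a JSON body for processors.process with an inline document."""
    return {
        # Leaving this False lets a configured human review be triggered.
        "skipHumanReview": skip_human_review,
        "rawDocument": {
            "mimeType": "application/pdf",
            # The REST API expects the file content as base64 text.
            "content": base64.b64encode(pdf_bytes).decode("ascii"),
        },
    }


def process_with_sdk(project_id: str, location: str, processor_id: str, pdf_bytes: bytes):
    """Send the same request through the Python SDK (needs google-cloud-documentai)."""
    # Imported lazily so the helper above works without the SDK installed.
    from google.api_core.client_options import ClientOptions
    from google.cloud import documentai

    # The regional endpoint has to match the processor location.
    client = documentai.DocumentProcessorServiceClient(
        client_options=ClientOptions(api_endpoint=f"{location}-documentai.googleapis.com")
    )
    request = documentai.ProcessRequest(
        name=client.processor_path(project_id, location, processor_id),
        raw_document=documentai.RawDocument(content=pdf_bytes, mime_type="application/pdf"),
        skip_human_review=False,
    )
    response = client.process_document(request=request)
    return response.document, response.human_review_status


# Placeholder bytes for illustration; a real call would read the PDF from disk.
body = build_request_body(b"%PDF-1.4 placeholder bytes")
request_json = json.dumps(body, indent=2)  # content that could be written to request.json
```

With the SDK there is no manual base64 step: the client takes the raw bytes and handles the encoding itself.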
Additionally, the invoice parser supports Enterprise Knowledge Graph (EKG) enrichment. Normalized or enriched fields include:
- Supplier Name (supplier_name)
- Supplier Address (supplier_address)
- Date
- Number
- Price
- Phone Number (supplier_phone)
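To show where normalized values typically appear, here is a minimal sketch that reads them from the JSON form of a response. It assumes each entity carries `type`, `mentionText` and, when the parser normalized it, `normalizedValue.text`. The function name `normalized_entities` is hypothetical and the sample entities are hand-written for illustration, not real parser output.

```python
from typing import Any, Dict, List


def normalized_entities(document: Dict[str, Any]) -> List[Dict[str, Any]]:
    """List each entity with its raw text and its normalized text, if any."""
    rows = []
    for entity in document.get("entities", []):
        rows.append({
            "type": entity.get("type"),
            "raw": entity.get("mentionText", ""),
            # None when the entity has no normalized value.
            "normalized": entity.get("normalizedValue", {}).get("text"),
        })
    return rows


# Hand-written sample for illustration only.
sample = {"entities": [
    {"type": "supplier_name", "mentionText": "ACME Corp",
     "normalizedValue": {"text": "Acme Corporation"}},
    {"type": "invoice_date", "mentionText": "Jan 5, 2021",
     "normalizedValue": {"text": "2021-01-05"}},
    {"type": "invoice_id", "mentionText": "INV-001"},
]}

rows = normalized_entities(sample)
for row in rows:
    print(f'{row["type"]}: {row["raw"]!r} -> {row["normalized"]!r}')
```

Keeping the raw and the normalized text side by side makes it easy to see which fields were actually enriched.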
This lab contains some scripts to make predictions with the W-9 parser. It can be used for both W-8 (FATCA) and W-9 documents. The difference between the two lies in who files them: the W-9 is used by US persons and US entities, while the W-8 series is used by foreign individuals and entities.
Pretty-printed table result from the Python script:
This lab extracts tables using the Form parser (documentation here). It focuses only on the JSON output containing the table information. There are two scripts:
- tables.py: extract tables from a PDF
- file2csv.py: extract tables from a PDF file, and convert the JSON output to CSV
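The JSON-to-CSV step can be sketched as follows. This is a minimal illustration, not the lab's own script: the helper names `layout_text` and `table_to_csv` are hypothetical and the sample table is hand-written. It assumes the usual JSON layout, where each table has `headerRows` and `bodyRows`, and each cell points into the document text through `layout.textAnchor.textSegments`.

```python
import csv
import io
from typing import Any, Dict


def layout_text(full_text: str, layout: Dict[str, Any]) -> str:
    """Return the text a cell layout points to via its textAnchor segments."""
    parts = []
    for seg in layout.get("textAnchor", {}).get("textSegments", []):
        # JSON encodes the int64 offsets as strings; startIndex is omitted when 0.
        start = int(seg.get("startIndex", 0))
        end = int(seg.get("endIndex", 0))
        parts.append(full_text[start:end])
    return "".join(parts).strip()


def table_to_csv(full_text: str, table: Dict[str, Any]) -> str:
    """Write header rows, then body rows, as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in table.get("headerRows", []) + table.get("bodyRows", []):
        writer.writerow(
            [layout_text(full_text, cell.get("layout", {})) for cell in row.get("cells", [])]
        )
    return buffer.getvalue()


def cell(start: int, end: int) -> Dict[str, Any]:
    """Build a hand-written cell for the sample below (illustration only)."""
    return {"layout": {"textAnchor": {"textSegments": [
        {"startIndex": str(start), "endIndex": str(end)}
    ]}}}


# Hand-written sample: the text "Item Qty\nPen 2\n" and one 2x2 table over it.
text = "Item Qty\nPen 2\n"
table = {
    "headerRows": [{"cells": [cell(0, 4), cell(5, 8)]}],
    "bodyRows": [{"cells": [cell(9, 12), cell(13, 14)]}],
}

csv_text = table_to_csv(text, table)
print(csv_text)
```

Using the csv module rather than joining strings by hand keeps cells that contain commas or quotes correctly escaped.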
Another code sample that extracts tables can be found here.
An example that uses pandas to convert the table to CSV is available here.
[1] Codelab: Use Procurement Document AI to Parse your Invoices using AI Platform Notebooks
[2] Codelab: Intro to Document AI and OCR
[3] Codelab: Specialized processors with Document AI
[4] Codelab: Human in the Loop
[5] Codelab: Form parsing
[6] Repository: Google Cloud Document AI GitHub repository