
Primary language: Python. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).

DocumentCloud Tabula Add-On

This is an add-on for DocumentCloud which wraps the tabula-py library.

It allows you to extract and export tables from a set of PDFs into CSVs and download the resulting CSVs as a zip file.
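The packaging step can be sketched with the Python standard library alone. This is a minimal illustration, not the add-on's actual code: the function name `zip_csvs` and the idea of passing a filename-to-text mapping are assumptions made for the example.

```python
import io
import zipfile


def zip_csvs(csv_texts):
    """Pack a mapping of {filename: CSV text} into an in-memory zip archive.

    Hypothetical helper for illustration; returns the raw bytes of the zip.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, text in csv_texts.items():
            # Each extracted table becomes one CSV entry in the archive.
            archive.writestr(name, text)
    return buffer.getvalue()
```

The returned bytes could then be written to disk or handed to whatever upload mechanism delivers the download.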

You can provide a public Google Drive or Dropbox link to a template generated from the Tabula Desktop Application to run against the documents.

If no template is provided, the add-on tries to guess the boundaries of the tables in each file, and the quality of the results varies.
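The two modes map onto two tabula-py calls, `read_pdf_with_template` and `read_pdf` with `guess=True`. The sketch below shows how the choice might look; it is an assumption about the general approach, not the add-on's source. The wrapper name `extract_tables` is hypothetical, the template is assumed to be a local file already downloaded from the shared link, and tabula-py itself needs a Java runtime installed.

```python
def extract_tables(pdf_path, template_path=None):
    """Return a list of tables (pandas DataFrames) found in one PDF.

    Hypothetical wrapper for illustration only.
    """
    # Imported lazily so the module loads even where tabula-py is absent.
    import tabula

    if template_path:
        # Use the selection areas saved in a Tabula template file.
        return tabula.read_pdf_with_template(pdf_path, template_path)

    # No template: let Tabula guess where the tables are on every page.
    return tabula.read_pdf(pdf_path, pages="all", guess=True)
```

Each returned DataFrame can be serialized with `DataFrame.to_csv(index=False)` before being added to the zip file.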