/pypdf2xml

Convert text from PDF to XML.

Primary LanguagePython

pypdf2xml

This project started as an alternative to poppler's pdftoxml, which didn't properly decode CID Type2 fonts in PDFs. This script requires pdfminer.

License

Public domain.