PdfParser

PdfParser is a standalone PHP library that provides tools for extracting data from PDF files. This package is derived from smalot\pdfparser; the only functionality added here is the ability to extract the coordinates of text.

Website: http://www.pdfparser.org

Test the API on our demo page.

This project is supported by Actualys.

Features

Features included:

Load/parse objects and headers
Extract metadata (author, description, ...)
Extract text from ordered pages
Support for compressed PDF files
Support for the Mac OS Roman character encoding
Handling of hexadecimal and octal encoding in text sections
PSR-0 compliant (autoloader)
PSR-1 compliant (code styling)
Extraction of coordinates of specific text on a page
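
To illustrate what "hexadecimal and octal encoding in text sections" refers to, here is a minimal, self-contained sketch of how such strings can be decoded. It is not the library's implementation: the function names `decodeHexString` and `decodeOctalEscapes` are hypothetical and exist only for this example, and the octal helper deliberately ignores the case of an escaped backslash followed by digits.

```php
<?php
// Illustrative sketch only. These helpers are hypothetical and are NOT part
// of the PdfParser API; they show the two string encodings named above.

/**
 * Decode the body of a PDF hexadecimal string, i.e. the part between < and >.
 */
function decodeHexString(string $hex): string
{
    // Whitespace inside a hexadecimal string is ignored.
    $hex = (string) preg_replace('/\s+/', '', $hex);
    if ($hex === '') {
        return '';
    }
    // An odd number of digits is treated as if a trailing 0 were present.
    if (strlen($hex) % 2 !== 0) {
        $hex .= '0';
    }
    $out = '';
    foreach (str_split($hex, 2) as $pair) {
        $out .= chr((int) hexdec($pair));
    }
    return $out;
}

/**
 * Replace \ddd octal escapes (one to three octal digits) in a PDF literal string.
 * Simplification: an escaped backslash followed by digits is not handled.
 */
function decodeOctalEscapes(string $literal): string
{
    $decoded = preg_replace_callback(
        '/\\\\([0-7]{1,3})/',
        static function (array $m): string {
            // Keep only the low 8 bits so the result is a single byte.
            return chr(((int) octdec($m[1])) & 0xFF);
        },
        $literal
    );
    return $decoded ?? $literal;
}

// Both calls decode the same five bytes.
echo decodeHexString('48656C6C6F'), "\n";              // prints "Hello"
echo decodeOctalEscapes('\110\145\154\154\157'), "\n"; // prints "Hello"
```

The PHP literals above are single-quoted so that the backslashes reach the decoder unchanged; in a double-quoted PHP string, `\110` would already be interpreted by PHP itself.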

Currently, secured documents are not supported.

This library is still under active development. As a result, users should expect backward-compatibility (BC) breaks when using the master version.

Documentation

Read the documentation on the website.

The original PDF reference files can be downloaded from this URL: http://www.adobe.com/devnet/pdf/pdf_reference_archive.html

License

This library is under the LGPLv3 license.

KenorFR/pdfparser
