PHP-XPDF

PHP-XPDF is an object-oriented wrapper for XPDF. For the moment, only the PdfToText wrapper is available.

Installation

It is recommended to install PHP-XPDF through Composer:

{
    "require": {
        "php-xpdf/php-xpdf": "~0.2.0"
    }
}

Dependencies:

To use PHP-XPDF, you need to install XPDF. Depending on your configuration, please follow the instructions on the XPDF website.
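
Before configuring the driver, it can help to confirm that the pdftotext binary is reachable. The following is a minimal sketch in plain PHP and is not part of PHP-XPDF: the function name findExecutable is hypothetical, and it simply scans the directories listed in the PATH environment variable.

```php
<?php
// Hypothetical helper (not part of PHP-XPDF): look for an executable
// in the directories listed in PATH, or in an explicitly given path list.
function findExecutable(string $name, ?string $path = null): ?string
{
    // getenv() returns false when PATH is unset; the cast turns that into ''.
    $path = $path ?? (string) getenv('PATH');

    foreach (explode(PATH_SEPARATOR, $path) as $dir) {
        if ($dir === '') {
            continue; // skip empty PATH entries
        }
        $candidate = rtrim($dir, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR . $name;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }

    return null; // not found in any listed directory
}

$binary = findExecutable('pdftotext');
echo $binary === null
    ? "pdftotext not found; install XPDF first\n"
    : "pdftotext found at $binary\n";
```

If the binary lives outside PATH, its location can be given explicitly through the 'pdftotext.binaries' configuration key described under Driver Initialization.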

Documentation

Driver Initialization

The easiest way to instantiate the driver is to call the create method.

$pdfToText = XPDF\PdfToText::create();

You can optionally pass a configuration array and a logger (any Psr\Log\LoggerInterface).

$pdfToText = XPDF\PdfToText::create(array(
    'pdftotext.binaries' => '/opt/local/xpdf/bin/pdftotext',
    'pdftotext.timeout' => 30, // timeout for the underlying process
), $logger);

Extract text

To extract text from a PDF, use the getText method.

$text = $pdfToText->getText('document.pdf');

You can optionally extract a range of pages by passing the first and last page.

$text = $pdfToText->getText('document.pdf', $from = 1, $to = 4);

You can also predefine how many pages are extracted on each call.

$pdfToText->setPageQuantity(2);
$pdfToText->getText('document.pdf'); // extracts pages 1 and 2
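
To make the page arguments concrete, here is a sketch of how a page range or a page quantity could translate into the pdftotext command-line flags -f (first page) and -l (last page). It illustrates the idea only: it is an assumption about how such a wrapper might behave, not PHP-XPDF's actual implementation, and the function name pageFlags is hypothetical.

```php
<?php
// Hypothetical helper: map page arguments onto pdftotext's -f / -l flags.
// $quantity mimics a predefined page quantity; in this sketch an explicit
// $from/$to takes priority over it (an assumption, not confirmed behavior).
function pageFlags(?int $from = null, ?int $to = null, ?int $quantity = null): array
{
    if ($from === null && $to === null && $quantity !== null) {
        // No explicit range: extract the first $quantity pages.
        $from = 1;
        $to = $quantity;
    }

    if ($from !== null && $to !== null && $to < $from) {
        throw new InvalidArgumentException('Last page must not precede first page.');
    }

    $flags = [];
    if ($from !== null) {
        array_push($flags, '-f', (string) $from);
    }
    if ($to !== null) {
        array_push($flags, '-l', (string) $to);
    }

    return $flags;
}

echo implode(' ', pageFlags(1, 4)), "\n";          // prints "-f 1 -l 4"
echo implode(' ', pageFlags(null, null, 2)), "\n"; // prints "-f 1 -l 2"
```

With no arguments the helper returns an empty array, which corresponds to extracting the whole document.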

Use with Silex

A Silex service provider is available:

$app = new Silex\Application();
$app->register(new XPDF\XPDFServiceProvider());

$app['xpdf.pdftotext']->getText('document.pdf');

Options can be passed to customize the provider.

$app->register(new XPDF\XPDFServiceProvider(), array(
    'xpdf.configuration' => array(
        'pdftotext.timeout'  => 30,
        'pdftotext.binaries' => '/opt/local/xpdf/bin/pdftotext',
    ),
    'xpdf.logger' => $logger,
));

License

This project is licensed under the MIT license.