PHP-XPDF is an object-oriented wrapper for XPDF. For the moment, only the PdfToText wrapper is available.
This fork works with PHP 8 and Laravel 9. It requires my fork of BinaryDriver: https://github.com/will2therich/BinaryDriver
It is recommended to install PHP-XPDF through Composer:
{
    "require": {
        "php-xpdf/php-xpdf": "~0.2.0"
    }
}
To use PHP-XPDF, you need to install XPDF. Depending on your configuration, please follow the instructions on the XPDF website.
The easiest way to instantiate the driver is to call the `create` method.
$pdfToText = XPDF\PdfToText::create();
You can optionally pass a configuration and a logger (any `Psr\Log\LoggerInterface`).
$pdfToText = XPDF\PdfToText::create(array(
    'pdftotext.binaries' => '/opt/local/xpdf/bin/pdftotext',
    'pdftotext.timeout'  => 30, // timeout for the underlying process
), $logger);
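The binary path differs between systems, so it can help to build the configuration array from whichever candidate path actually exists. The following is a minimal sketch under stated assumptions: `findExecutable` is a hypothetical helper that is not part of PHP-XPDF, and the two `/usr/...` paths are only guesses at common install locations.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of PHP-XPDF): returns the first candidate
 * path that points to an executable file, or null if none does.
 *
 * @param string[] $candidates
 */
function findExecutable(array $candidates): ?string
{
    foreach ($candidates as $path) {
        if (is_file($path) && is_executable($path)) {
            return $path;
        }
    }

    return null;
}

// Candidate locations; only the first one comes from this README,
// the others are assumptions about typical installs.
$binary = findExecutable(array(
    '/opt/local/xpdf/bin/pdftotext',
    '/usr/local/bin/pdftotext',
    '/usr/bin/pdftotext',
));

// Set 'pdftotext.binaries' only when a binary was actually found.
$configuration = array('pdftotext.timeout' => 30);
if ($binary !== null) {
    $configuration['pdftotext.binaries'] = $binary;
}
```

The resulting `$configuration` array can then be passed to `XPDF\PdfToText::create()` as shown above. When no candidate matches, the key is left unset so that the wrapper applies whatever default lookup it has.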
To extract text from a PDF, use the `getText` method.
$text = $pdfToText->getText('document.pdf');
You can optionally restrict extraction to a range of pages.
$text = $pdfToText->getText('document.pdf', $from = 1, $to = 4);
You can also predefine how many pages are extracted on each call.
$pdfToText->setPageQuantity(2);
$pdfToText->getText('document.pdf'); // extracts pages 1 and 2
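For orientation, the `pdftotext` command-line tool selects pages with its `-f` (first page) and `-l` (last page) options. The sketch below shows how a page range could be turned into such an argument list. It is an illustration only: `buildPdfToTextArgs` is a hypothetical function, not part of PHP-XPDF, and the wrapper's actual internals may differ (for example, it may write to a file rather than to stdout).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of PHP-XPDF): builds the argument list for
 * the pdftotext command-line tool from an optional page range.
 *
 * @return string[] arguments, without the binary name itself
 */
function buildPdfToTextArgs(string $pdfPath, ?int $from = null, ?int $to = null): array
{
    if (($from !== null && $from < 1) || ($to !== null && $to < 1)) {
        throw new InvalidArgumentException('Page numbers start at 1.');
    }
    if ($from !== null && $to !== null && $to < $from) {
        throw new InvalidArgumentException('The last page must not precede the first page.');
    }

    $args = array();
    if ($from !== null) {
        $args[] = '-f';            // pdftotext: first page to convert
        $args[] = (string) $from;
    }
    if ($to !== null) {
        $args[] = '-l';            // pdftotext: last page to convert
        $args[] = (string) $to;
    }
    $args[] = $pdfPath;
    $args[] = '-';                 // "-" as output file sends the text to stdout

    return $args;
}

// Pages 1 to 4 of document.pdf
echo implode(' ', buildPdfToTextArgs('document.pdf', 1, 4)), PHP_EOL;
// prints "-f 1 -l 4 document.pdf -"
```

Each argument is kept as a separate array element rather than concatenated into one string, which avoids shell-quoting problems when the list is handed to a process runner.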
A Silex service provider is available:
$app = new Silex\Application();
$app->register(new XPDF\XPDFServiceProvider());
$app['xpdf.pdftotext']->getText('document.pdf');
Options can be passed to customize the provider.
$app->register(new XPDF\XPDFServiceProvider(), array(
    'xpdf.configuration' => array(
        'pdftotext.timeout'  => 30,
        'pdftotext.binaries' => '/opt/local/xpdf/bin/pdftotext',
    ),
    'xpdf.logger' => $logger,
));
This project is licensed under the MIT license.