PDF Text Plugin
This Flutter plugin lets you read the text content of PDF documents and convert it into strings. It works on iOS and Android. On iOS it uses Apple's PDFKit. On Android it uses the Android port of Apache PDFBox.
Getting Started
Add this to your package's pubspec.yaml file:
dependencies:
  pdf_text: ^0.3.1
Usage
Import the package with:
import 'package:pdf_text/pdf_text.dart';
Create a PDF document instance using a File object:
PDFDoc doc = await PDFDoc.fromFile(file);
or using a path string:
PDFDoc doc = await PDFDoc.fromPath(path);
or using a URL string:
PDFDoc doc = await PDFDoc.fromURL(url);
Pass a password for encrypted PDF documents:
PDFDoc doc = await PDFDoc.fromFile(file, password: password);
Use faster initialization on Android:
PDFDoc doc = await PDFDoc.fromFile(file, fastInit: true);
Read the text of the entire document:
String docText = await doc.text;
Retrieve the number of pages of the document:
int numPages = doc.length;
Access a page of the document:
PDFPage page = doc.pageAt(pageNumber);
Read the text of a page of the document:
String pageText = await page.text;
Read the information of the document:
PDFDocInfo info = doc.info;
Optionally, you can delete the file of a document when you no longer need it. This is useful when you import a PDF document from outside the local file system (e.g. using a URL), because the file is automatically stored in the app's temporary directory.
Delete the file of a single document:
doc.deleteFile();
or delete all the files of all the documents imported from outside the local file system:
await PDFDoc.deleteAllExternalFiles();
How It Works
This plugin loads page text lazily and caches it page by page. The first time you request the text of a page, it is parsed and stored in memory, so later accesses are faster. The text of pages you never request is never loaded, so no time is spent parsing text you do not use. When you request the text of the entire document, only the pages that have not yet been loaded are parsed.
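The caching behavior described above can be sketched in plain Dart. This is only an illustration under stated assumptions, not the plugin's actual source: `LazyPageTextCache`, `PageParser`, and the fake parser in `main` are hypothetical names, and page numbers are assumed to start at 1.

```dart
// Hypothetical sketch of per-page lazy loading with caching.
// None of these names come from the plugin itself.

/// Stand-in for the expensive native call that extracts one page's text.
typedef PageParser = Future<String> Function(int pageNumber);

class LazyPageTextCache {
  LazyPageTextCache(this.pageCount, this._parse);

  final int pageCount;
  final PageParser _parse;

  // Parsed text keyed by page number (assumed 1-based).
  // A missing key means the page has not been loaded yet.
  final Map<int, String> _cache = {};

  /// Returns the text of one page, parsing it only on first access.
  Future<String> pageText(int pageNumber) async {
    final cached = _cache[pageNumber];
    if (cached != null) return cached;
    final text = await _parse(pageNumber);
    _cache[pageNumber] = text;
    return text;
  }

  /// Returns the text of the whole document.
  /// Pages already in the cache are reused; only the rest are parsed.
  Future<String> documentText() async {
    final buffer = StringBuffer();
    for (var n = 1; n <= pageCount; n++) {
      buffer.write(await pageText(n));
    }
    return buffer.toString();
  }
}

Future<void> main() async {
  var parseCalls = 0;
  final cache = LazyPageTextCache(3, (n) async {
    parseCalls++; // Counts how often the "expensive" step actually runs.
    return 'page $n\n';
  });

  await cache.pageText(2); // Parses page 2.
  await cache.pageText(2); // Served from memory, no parsing.
  await cache.documentText(); // Parses pages 1 and 3 only.
  print(parseCalls); // 3
}
```

In this sketch the parser runs three times for a three-page document even though page 2 is requested three times in total, which is the effect the paragraph above describes.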
Public Methods
PDFDoc
| Return | Method | Description |
|---|---|---|
| PDFPage | pageAt(int pageNumber) | Gets the page of the document at the given page number. |
| static Future<PDFDoc> | fromFile(File file, {String password = "", bool fastInit = false}) | Creates a PDFDoc object from a File instance. Optionally takes a password for encrypted PDF documents. If fastInit is true, the initialization of the document will be faster on Android. In that case, the text stripper engine is not initialized by this call, but later, when some text is first read. This means that the first text read will take some time, but the document data can be accessed immediately. |
| static Future<PDFDoc> | fromPath(String path, {String password = "", bool fastInit = false}) | Creates a PDFDoc object from a file path. Optionally takes a password for encrypted PDF documents. If fastInit is true, the initialization of the document will be faster on Android. In that case, the text stripper engine is not initialized by this call, but later, when some text is first read. This means that the first text read will take some time, but the document data can be accessed immediately. |
| static Future<PDFDoc> | fromURL(String url, {String password = "", bool fastInit = false}) | Creates a PDFDoc object from a URL. Downloads the PDF file located at the given URL and saves it in the app's temporary directory. Optionally takes a password for encrypted PDF documents. If fastInit is true, the initialization of the document will be faster on Android. In that case, the text stripper engine is not initialized by this call, but later, when some text is first read. This means that the first text read will take some time, but the document data can be accessed immediately. |
| void | deleteFile() | Deletes the file related to this PDFDoc. Throws an exception if the FileSystemEntity cannot be deleted. |
| static Future | deleteAllExternalFiles() | Deletes all the files of the documents that have been imported from outside the local file system (e.g. using fromURL). |
Objects
class PDFDoc {
  int length; // Number of pages of the document
  List<PDFPage> pages; // Pages of the document
  PDFDocInfo info; // Information of the document
  Future<String> text; // Text of the document
}
class PDFPage {
  int number; // Number of the page in the document
  Future<String> text; // Text of the page
}
class PDFDocInfo {
  String author; // Author string of the document
  List<String> authors; // Authors of the document
  DateTime creationDate; // Creation date of the document
  DateTime modificationDate; // Modification date of the document
  String creator; // Creator of the document
  String producer; // Producer of the document
  List<String> keywords; // Keywords of the document
  String title; // Title of the document
  String subject; // Subject of the document
}
Contribute
If you have any suggestions, improvements or issues, feel free to contribute to this project. You can either submit a new issue or propose a pull request.