
Extract all text from a PDF. Works in browser extensions; pure JavaScript.


pdf2text4extensions

A simple text extractor for PDFs.

Workflow:

  1. openPDF() opens the PDF at a given URL with an XMLHttpRequest.

  2. getContentPDF() extracts all the text from each page. If you already have the PDF as a Blob, you can skip the first step and pass the Blob in directly.
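The two steps above can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: `fetchPdfBytes` and `extractAllText` are hypothetical names standing in for openPDF() and getContentPDF(), and the sketch assumes PDF.js is already loaded and exposed as `pdfjsLib`.

```javascript
// Step 1 (hypothetical stand-in for openPDF()): download the PDF as raw bytes.
function fetchPdfBytes(url) {
  return new Promise((resolve, reject) => {
    const request = new XMLHttpRequest();
    request.open('GET', url);
    request.responseType = 'arraybuffer';
    request.onload = () => {
      if (request.status >= 200 && request.status < 300) {
        resolve(request.response);
      } else {
        reject(new Error('Request failed with status ' + request.status));
      }
    };
    request.onerror = () => reject(new Error('Network error'));
    request.send();
  });
}

// Step 2 (hypothetical stand-in for getContentPDF()): walk every page of a
// PDF.js document and join the text items it reports.
async function extractAllText(pdf) {
  const pages = [];
  // PDF.js page numbers start at 1.
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const content = await page.getTextContent();
    pages.push(content.items.map((item) => item.str).join(' '));
  }
  return pages.join('\n');
}

// Usage, assuming `pdfjsLib` is available in the page or extension:
//   const bytes = await fetchPdfBytes('https://example.com/file.pdf');
//   const pdf = await pdfjsLib.getDocument({ data: bytes }).promise;
//   const text = await extractAllText(pdf);
//
// Starting from a Blob instead of a URL:
//   const bytes = await blob.arrayBuffer();
```

Joining items with a space and pages with a newline is a simplification; the real layout of a PDF may call for smarter spacing.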

Background

This library is part of the WorldBrain Project. We build a search engine that lets you run full-text searches across your browsing history, bookmarks, Evernote, Pocket, Google Drive, etc.

Acknowledgements:

This library is made possible with the help of the PDF.js library.