pdf_to_html

Converts all PDFs in a folder to text files.

In 2016, Masha Gorkovenko created a tutorial for Stanford on converting PDFs to text.

Using Python 2.7 and PDFMiner, I have added code to create better text filenames.
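
The idea of deriving a cleaner text filename for each PDF in a folder can be sketched roughly as follows. This is not the notebook's actual code: the names `text_filename`, `extract_with_pdfminer`, and `convert_folder` are hypothetical, the naming rule (replace anything other than letters, digits, `-`, and `_` with an underscore) is an assumption, and the extraction step assumes Python 3 with the `pdfminer.six` fork rather than the Python 2.7 setup described above.

```python
import os
import re


def text_filename(pdf_name):
    """Derive a tidy .txt filename from a PDF filename (assumed naming rule)."""
    stem, _ext = os.path.splitext(os.path.basename(pdf_name))
    # Collapse each run of characters other than letters, digits, '-' and '_'
    # into a single underscore, then trim underscores from both ends.
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_")
    return (stem or "untitled") + ".txt"


def extract_with_pdfminer(pdf_path):
    """Return the text of one PDF. Assumes the pdfminer.six package is installed."""
    # Imported lazily so the filename helpers work without pdfminer present.
    from pdfminer.high_level import extract_text
    return extract_text(pdf_path)


def convert_folder(src_dir, dst_dir, extract=extract_with_pdfminer):
    """Write one text file per PDF found in src_dir; return the output paths."""
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    written = []
    for name in sorted(os.listdir(src_dir)):
        if not name.lower().endswith(".pdf"):
            continue  # skip anything that is not a PDF
        text = extract(os.path.join(src_dir, name))
        out_path = os.path.join(dst_dir, text_filename(name))
        with open(out_path, "w", encoding="utf-8") as handle:
            handle.write(text)
        written.append(out_path)
    return written
```

A call such as `convert_folder("pdfs", "texts")` would then write one `.txt` file per PDF. Note that two PDFs whose names differ only in punctuation would map to the same output name under this rule, so a real script may want to detect such collisions.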