PDFparser_TextpropertyExtractor

A basic E-PDF parser that extracts the text properties of a document: the text itself and its font, style, size, and color. The parser also pre-processes the extracted text by removing stopwords and punctuation.
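The description above can be illustrated with a minimal sketch. This is not the project's actual code: the README does not say which PDF library is used, so the sketch assumes a page dictionary shaped like the one PyMuPDF returns from `page.get_text("dict")` (blocks, lines, spans, with `flags` as a bit field and `color` as an sRGB integer). The names `extract_properties`, `decode_style`, `color_to_hex`, `preprocess`, and the tiny `STOPWORDS` set are all hypothetical.

```python
import string

# Hypothetical, deliberately tiny stopword list for illustration only;
# the list the project actually uses is not given in this README.
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def decode_style(flags):
    """Turn a PyMuPDF-style span flag bit field into a style label."""
    parts = []
    if flags & 16:  # bit 4: bold
        parts.append("bold")
    if flags & 2:   # bit 1: italic
        parts.append("italic")
    return " ".join(parts) or "regular"


def color_to_hex(color):
    """Format an sRGB integer such as 0xff0000 as '#ff0000'."""
    return f"#{color:06x}"


def preprocess(text):
    """Strip ASCII punctuation, then drop stopwords (case-insensitive)."""
    table = str.maketrans("", "", string.punctuation)
    words = text.translate(table).split()
    return [w for w in words if w.lower() not in STOPWORDS]


def extract_properties(page_dict):
    """Flatten a page dictionary into one record per text span."""
    rows = []
    for block in page_dict.get("blocks", []):
        # Image blocks carry no "lines" key, so they are skipped here.
        for line in block.get("lines", []):
            for span in line.get("spans", []):
                rows.append({
                    "text": span["text"],
                    "font": span["font"],
                    "style": decode_style(span["flags"]),
                    "size": span["size"],
                    "color": color_to_hex(span["color"]),
                    "tokens": preprocess(span["text"]),
                })
    return rows
```

If PyMuPDF is the library in use, a page dictionary of this shape could be obtained with `fitz.open(path)` followed by `page.get_text("dict")` for each page; with another library, only the input adapter would need to change.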

Primary Language: Python

How to Run the Code

Download the script and place the PDF document to be parsed in the same folder as the script. Then pass the document to the script.
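As a rough sketch of the "same folder" requirement, the helper below looks up a PDF by file name next to the script and fails early with a clear error if it is missing. The function names `resolve_pdf` and `main`, and the single positional `pdf` argument, are assumptions made for illustration; the project's real entry point and arguments are not documented here.

```python
import argparse
from pathlib import Path


def resolve_pdf(name, base_dir):
    """Return the path to `name` inside `base_dir`, validating it first."""
    path = Path(base_dir) / name
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got: {name}")
    if not path.is_file():
        raise FileNotFoundError(f"{name} not found in {base_dir}")
    return path


def main(argv=None):
    """Hypothetical command-line entry point: takes one PDF file name."""
    parser = argparse.ArgumentParser(
        description="Extract text properties from a PDF in the script's folder."
    )
    parser.add_argument("pdf", help="file name of the PDF to parse")
    args = parser.parse_args(argv)

    # The document is expected to sit in the same folder as this script.
    script_dir = Path(__file__).resolve().parent
    pdf_path = resolve_pdf(args.pdf, script_dir)
    print(f"Parsing {pdf_path}")
    # The parser itself would be called here with pdf_path.
```

Saved in a script with the usual `if __name__ == "__main__": main()` guard added, this would accept the PDF's file name as its only command-line argument.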