Docscan is a lightweight document scanner. It opens supported document types and returns the text inside as strings, which can then be searched with regular expressions.
Requirements:
- zipfile
- io
- re
- xml
Usage:
Note: fileName must include the directory the file is in (the full path).
Example: Docscan(r"C:\Users\You\Desktop\folder1\test.pdf")
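- Instantiate
variable = Docscan('fileName')
- Return the file text
print(variable.returnFileText())
- Search the file text with a regex
print(variable.executeRegex('regex here'))
- Search the header XML with a regex
print(variable.executeHeaderRegex('regex here'))
- Search the footer XML with a regex
print(variable.executeFooterRegex('regex here'))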
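To give a feel for the kind of list a regex search returns, here is a minimal sketch using only the standard `re` module. It is not Docscan's own code: the helper name `find_all_matches` and the sample text are made up for illustration, and it assumes the regex methods behave roughly like `re.findall`.

```python
import re


def find_all_matches(text, pattern):
    """Return every non-overlapping match of pattern in text as a list."""
    # Hypothetical helper for illustration; not part of Docscan.
    return re.findall(pattern, text)


# Sample text standing in for what a scanned document might contain.
sample_text = "Invoice 123 was paid on 2021-05-04, invoice 456 is open."

invoice_numbers = find_all_matches(sample_text, r"[Ii]nvoice (\d+)")
dates = find_all_matches(sample_text, r"\d{4}-\d{2}-\d{2}")

print(invoice_numbers)
print(dates)
```

Note that when a pattern contains one capturing group, `re.findall` returns the group contents rather than the whole match, which is why the first call yields only the numbers.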
Methods:
returnFileText()
- Returns the text of a file.
executeRegex(regexExpression)
- Creates a list of all matches of regexExpression.
executeHeaderRegex(regularExpression)
- Creates a list of all matches of regularExpression in the header XML.
executeFooterRegex(regularExpression)
- Creates a list of all matches of regularExpression in the footer XML.
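The header and footer methods can be pictured with the sketch below. It assumes the document is a zip-based format such as .docx, where headers and footers conventionally live in parts named like word/header1.xml and word/footer1.xml. That assumption, the helper names `build_sample_archive` and `regex_in_parts`, and the sample XML are all illustrative and may not match how Docscan does it.

```python
import io
import re
import zipfile


def build_sample_archive():
    """Build a tiny in-memory zip that mimics the layout of a .docx file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/header1.xml", "<w:hdr><w:t>Report 2021</w:t></w:hdr>")
        archive.writestr("word/footer1.xml", "<w:ftr><w:t>Page 7</w:t></w:ftr>")
        archive.writestr("word/document.xml", "<w:document><w:t>Body 99</w:t></w:document>")
    return buffer.getvalue()


def regex_in_parts(archive_bytes, part_prefix, pattern):
    """Collect matches of pattern from XML members whose name starts with part_prefix."""
    matches = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if name.startswith(part_prefix) and name.endswith(".xml"):
                xml_text = archive.read(name).decode("utf-8")
                matches.extend(re.findall(pattern, xml_text))
    return matches


sample = build_sample_archive()
header_matches = regex_in_parts(sample, "word/header", r"\d+")
footer_matches = regex_in_parts(sample, "word/footer", r"\d+")

print(header_matches)
print(footer_matches)
```

Because the regex runs over raw XML, a pattern can also match tag names or attributes, so patterns may need to be written with the markup in mind.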