paramsiddharth/pdf2text

Implement editable text extraction

Closed · 6 comments

Editable Text Extraction

The extract function within editable.py has yet to be implemented.

def extract(filename):
	... # TODO

Details

  • Since file validation checks have already been implemented in main.py, filename may be assumed to be a valid file. The directories ./out/ and ./out/imgs/ may also be assumed to be present.
  • Any required import statements must be added.
  • The function must use PyPDF2 to parse and extract text from the PDF file and store it in out/output.txt.
  • It must log all of the steps to the user in the console window.
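
A minimal sketch of what the requirements above could look like is shown below. It is only an illustration, not the implementation that was eventually merged. It assumes a PyPDF2 release that provides `PdfReader` and `extract_text()` (older releases named these `PdfFileReader` and `extractText()`), and the helper `write_pages` and the constant `OUTPUT_PATH` are hypothetical names introduced here.

```python
import os

# Output path named in the issue; ./out/ is assumed to exist already.
OUTPUT_PATH = os.path.join("out", "output.txt")


def write_pages(page_texts, out_path=OUTPUT_PATH):
    """Hypothetical helper: write each page's text to out_path, one per line block."""
    with open(out_path, "w", encoding="utf-8") as out_file:
        for text in page_texts:
            out_file.write(text)
            out_file.write("\n")


def extract(filename):
    # Imported inside the function so this sketch loads without PyPDF2 installed;
    # in editable.py the import would normally sit at the top of the module.
    # Assumption: PdfReader / extract_text are the names in the installed release.
    from PyPDF2 import PdfReader

    print(f"Opening {filename}...")
    reader = PdfReader(filename)
    total = len(reader.pages)
    print(f"Found {total} page(s).")

    page_texts = []
    for number, page in enumerate(reader.pages, start=1):
        print(f"Extracting text from page {number} of {total}...")
        # Fall back to an empty string if a page yields no text.
        page_texts.append(page.extract_text() or "")

    print(f"Writing extracted text to {OUTPUT_PATH}...")
    write_pages(page_texts)
    print("Done.")
```

Keeping the file writing in a separate helper makes that step easy to check without a PDF at hand; `extract("sample.pdf")` (a hypothetical file name) would then print one log line per step and per page.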

Reference

Refer to PDF File Reader (A simple Python application that reads out textual PDF files) for the implementation details. All the available modules are mentioned in requirements.txt.

I can implement it.

Excellent! I am assigning this issue to you now. 🥇

Please assign this work to me.

@vanamayaswanth Assigning you this issue.

tmedha commented

I would like to work on this. Please assign this work to me.

@tmedha Assigning you. Good luck!