PDF Project

Some useful details:

  • The input data is unstructured PDF.
  • images.py extracts images from the PDF.
  • notes.py extracts notes from the PDF.
  • data.json is the final output, where the extracted data is stored.
  • section.py extracts sections, figures, and tables.
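
To make the flow from extracted text to data.json more concrete, here is a minimal sketch. It is not the code from section.py: the regular expressions, the function names `classify_lines` and `write_output`, and the JSON layout (`sections`, `figures`, `tables`) are all assumptions made for illustration. It only shows one plausible way to sort text lines into sections, figures, and tables and save the result as JSON.

```python
import json
import re
from pathlib import Path

# Hypothetical patterns; the real section.py may detect these differently.
FIGURE_RE = re.compile(r"^(Figure|Fig\.)\s+\d+", re.IGNORECASE)
TABLE_RE = re.compile(r"^Table\s+\d+", re.IGNORECASE)
SECTION_RE = re.compile(r"^\d+(\.\d+)*\s+\S")


def classify_lines(lines):
    """Group text lines into sections, figures, and tables by their prefix."""
    result = {"sections": [], "figures": [], "tables": []}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if FIGURE_RE.match(line):
            result["figures"].append(line)
        elif TABLE_RE.match(line):
            result["tables"].append(line)
        elif SECTION_RE.match(line):
            result["sections"].append(line)
        # Any other line is treated as body text and ignored here.
    return result


def write_output(result, path="data.json"):
    """Write the collected items to a JSON file and return its path."""
    out = Path(path)
    out.write_text(json.dumps(result, indent=2), encoding="utf-8")
    return out
```

In this sketch, `classify_lines` would receive the text lines already pulled from a PDF page, and `write_output(classify_lines(lines))` would produce a data.json file with one list per category.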

How to run?

To run the app, download this repository, install the required libraries, and then run python file.py from the command line.


Document Structure

Personal Finance
|
|---- images
|     |---- allimages
|
|---- data.json
|---- images.py
|---- notes.py
|---- pyapi.py
|---- section.py
|---- markdown.py
|---- Procfile 
|---- README.md
|---- pdfs
|---- setup.sh


Technologies used:

  • Python libraries - PyPDF2, Azure
  • Version control - Git
  • Cloud technologies - Azure Form Recognizer

Tools and Services:

  • IDE - VS Code
  • Code Repository - GitHub


If you liked this project, consider connecting with me: