Hey there! This is a web app that converts PDF files to DOCX files. It's built with Python and Flask, and it uses pdfplumber to read PDFs and python-docx to write the DOCX output.
- Extracts text and tables from PDFs
- Handles OCR for scanned PDFs using Tesseract
- Keeps the formatting and structure of the original PDF
- Easy-to-use web interface
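
To give a feel for how the features above could fit together, here is a minimal sketch of a conversion loop. It is not the code in this repo: the function names `normalize_table`, `needs_ocr`, and `convert` are made up for illustration, and the OCR branch assumes `pytesseract` as the Tesseract binding, which this README does not confirm.

```python
from typing import List, Optional, Sequence


def normalize_table(rows: Sequence[Sequence[Optional[str]]]) -> List[List[str]]:
    """Replace empty (None) cells with "" and pad ragged rows to equal width."""
    width = max((len(row) for row in rows), default=0)
    return [
        [(cell or "").strip() for cell in row] + [""] * (width - len(row))
        for row in rows
    ]


def needs_ocr(text: Optional[str]) -> bool:
    """A page with no extractable text layer is probably a scanned image."""
    return not (text or "").strip()


def convert(pdf_path: str, docx_path: str) -> None:
    """Copy text and tables from a PDF into a new DOCX file (hypothetical helper)."""
    # Third-party imports live here so the helpers above work without them.
    import pdfplumber
    from docx import Document

    doc = Document()
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            text = page.extract_text()
            if needs_ocr(text):
                # Assumption: pytesseract is the Tesseract binding in use.
                import pytesseract
                image = page.to_image(resolution=300).original  # PIL image
                text = pytesseract.image_to_string(image)
            # One DOCX paragraph per non-empty line of page text.
            for line in (text or "").splitlines():
                if line.strip():
                    doc.add_paragraph(line)
            # One DOCX table per table that pdfplumber detects on the page.
            for raw in page.extract_tables():
                rows = normalize_table(raw)
                if not rows or not rows[0]:
                    continue
                table = doc.add_table(rows=len(rows), cols=len(rows[0]))
                for r, row in enumerate(rows):
                    for c, value in enumerate(row):
                        table.cell(r, c).text = value
    doc.save(docx_path)
```

Calling `convert("input.pdf", "output.docx")` would write the result next to the input. One known limitation of this sketch: `extract_text()` also returns the text inside tables, so table contents show up twice unless the table regions are filtered out first.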
- Clone this repo:
git clone https://github.com/Awis13/pdf2docx.git
- Install the required packages:
pip install -r requirements.txt
- Run the Flask app:
python app.py
- Open your browser and go to http://localhost:5000/.
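
If the page doesn't load, a quick check from Python can tell you whether anything is listening on that address at all. This is only a sketch using the standard library. `server_is_up` is a hypothetical helper name, and the default URL is the one from the steps above.

```python
import urllib.error
import urllib.request


def server_is_up(url: str = "http://localhost:5000/", timeout: float = 3.0) -> bool:
    """Return True if anything answers HTTP at url, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # A 4xx/5xx response still means a server is listening.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False
```

Running `server_is_up()` in a second terminal while `python app.py` is running should return `True`. A `False` usually means the app did not start or is bound to a different port.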
- Create a new Heroku app.
- Connect your GitHub repo to the Heroku app.
- Deploy the `main` branch to Heroku.
- Access your app using the provided Heroku URL.
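
One thing to keep in mind when deploying: Heroku tells a web process which port to listen on through the `PORT` environment variable, so the app can't rely on port 5000 there. Whether `app.py` in this repo already handles that isn't stated here, so treat the following as a sketch only. `bind_address` is a hypothetical helper, and the Flask object is assumed to be named `app`.

```python
import os
from typing import Mapping, Tuple


def bind_address(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Pick the host and port to listen on.

    Heroku passes the port in the PORT environment variable. Locally the
    variable is usually unset, so fall back to port 5000 on localhost.
    """
    port = env.get("PORT")
    if port is None:
        return "127.0.0.1", 5000
    # On a hosting platform, listen on all interfaces at the assigned port.
    return "0.0.0.0", int(port)


# In app.py this might be used as (assuming the Flask object is named `app`):
#     host, port = bind_address()
#     app.run(host=host, port=port)
```

Passing the environment in as a parameter keeps the function easy to check without touching real environment variables.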