Professional Document Converter

[Figure: UI screenshot]

A professional document conversion tool that transforms PDFs into multiple formats including Markdown, JSON, and DOCX. Built with Docling and Gradio, this tool provides an intuitive interface for both file uploads and URL-based conversions.

Features

  • 📄 Convert PDF documents to multiple formats
  • 🔗 Support for both file uploads and URL inputs
  • 💾 Multiple output formats (Markdown, JSON, DOCX)
  • 🎯 GPU acceleration support
  • 📊 Advanced table extraction options
  • 🎨 Clean and intuitive user interface
  • ⚡ Real-time preview
  • 💻 Easy deployment options

Demo

Try out the live demo on Hugging Face Spaces: Professional Document Converter

Requirements

docling
gradio>=4.0.0
--extra-index-url https://download.pytorch.org/whl/cu118
torch
pytesseract
python-docx
markdown
requests
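
To see which of the packages listed above are already present in the active environment, something like the following sketch may help. It uses only the standard library; the `REQUIRED` list and the `installed_versions` helper are illustrative names, not part of this project.

```python
from importlib import metadata

# Distribution names taken from the requirements list above.
REQUIRED = [
    "docling",
    "gradio",
    "torch",
    "pytesseract",
    "python-docx",
    "markdown",
    "requests",
]


def installed_versions(packages=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Note that this only reports what is installed; it does not check the `gradio>=4.0.0` version constraint.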

Installation

  1. Clone the repository:
git clone https://github.com/arad1367/professional_document_converter.git
cd professional_document_converter
  2. Create and activate a virtual environment (recommended):
python -m venv venv
source venv/bin/activate  # For Linux/Mac
# or
venv\Scripts\activate     # For Windows
  3. Install dependencies:
pip install -r requirements.txt
  4. Install Tesseract OCR:
    • Windows: Download and install from UB-Mannheim/tesseract
    • Linux: sudo apt-get install tesseract-ocr
    • macOS: brew install tesseract
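
After installing Tesseract, it can be useful to confirm that the binary is actually reachable from Python. The sketch below is one way to do that with the standard library; `tesseract_version` is a hypothetical helper, and it assumes the executable is named `tesseract` and is on the `PATH`.

```python
import shutil
import subprocess


def tesseract_version(executable="tesseract"):
    """Return the first line of `<executable> --version`, or None if not found."""
    path = shutil.which(executable)
    if path is None:
        return None
    completed = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Depending on the Tesseract release, the banner may go to stdout or stderr.
    output = (completed.stdout or completed.stderr).strip()
    return output.splitlines()[0] if output else path


if __name__ == "__main__":
    version = tesseract_version()
    print(version or "Tesseract was not found on PATH")
```

If the function returns `None` on Windows, the install directory probably still needs to be added to the `PATH` environment variable.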

Usage

Local Deployment

Run the local version with:

python local_app.py

Access the application at http://127.0.0.1:7860
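
If the page does not load, a quick check of whether anything is listening on that address can narrow things down. This is a minimal sketch using a plain TCP connection; `is_listening` is an illustrative name, and the default port 7860 is taken from the URL above.

```python
import socket


def is_listening(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if is_listening() else "not reachable"
    print(f"http://127.0.0.1:7860 is {state}")
```

A `True` result only shows that some process accepts connections on the port, not that it is this application.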

Hugging Face Deployment

The application can be deployed to Hugging Face Spaces using app.py. See Hugging Face deployment documentation for details.

Project Structure

professional_document_converter/
├── app.py              # Hugging Face Spaces deployment version
├── local_app.py        # Local deployment version
├── requirements.txt    # Project dependencies
├── UI.png              # User interface screenshot
├── README.md           # Project documentation
└── .gitignore          # Git ignore configuration

Technical Details

Supported Input Formats

  • PDF files (upload)
  • PDF URLs (direct links)
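
The two input kinds above can be told apart before any conversion starts. The following is a minimal sketch of such a check, not the project's actual validation logic; `classify_source` is a hypothetical helper, and treating only `http`/`https` links as URLs is an assumption.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Return "url" for http(s) links and "file" for local .pdf paths."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if Path(source).suffix.lower() == ".pdf":
        return "file"
    raise ValueError(f"Not a PDF file path or an http(s) URL: {source!r}")


if __name__ == "__main__":
    for candidate in ("https://example.com/paper.pdf", "report.pdf"):
        print(candidate, "->", classify_source(candidate))
```

The check looks only at the string itself. It does not confirm that a local file exists or that a URL really serves a PDF.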

Output Formats

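| Format   | Description                 | Use Case                        |
|----------|-----------------------------|---------------------------------|
| Markdown | Clean, readable text format | Documentation, note-taking      |
| JSON     | Structured data format      | API integration, data processing |
| DOCX     | Microsoft Word format       | Professional document editing   |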
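
As a rough illustration of how the Markdown and JSON outputs might be produced, here is a sketch built on Docling's `DocumentConverter`. It assumes a recent Docling release and is not taken from `app.py` or `local_app.py`; `EXTENSIONS`, `output_path`, `convert_pdf`, and the `output` directory are hypothetical. DOCX is left out because the source does not describe how the project builds it.

```python
import json
from pathlib import Path

# Assumed mapping from the formats listed above to file extensions.
EXTENSIONS = {"markdown": ".md", "json": ".json", "docx": ".docx"}


def output_path(source_name: str, fmt: str, out_dir: str = "output") -> Path:
    """Derive an output file path such as output/report.md (hypothetical helper)."""
    key = fmt.strip().lower()
    if key not in EXTENSIONS:
        raise ValueError(f"Unsupported output format: {fmt!r}")
    stem = Path(source_name).stem or "document"
    return Path(out_dir) / (stem + EXTENSIONS[key])


def convert_pdf(source: str, out_dir: str = "output") -> dict:
    """Convert a PDF path or URL and write Markdown and JSON files."""
    # Imported lazily so output_path() stays usable without Docling installed.
    from docling.document_converter import DocumentConverter

    document = DocumentConverter().convert(source).document
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    md_path = output_path(source, "markdown", out_dir)
    md_path.write_text(document.export_to_markdown(), encoding="utf-8")

    json_path = output_path(source, "json", out_dir)
    json_path.write_text(
        json.dumps(document.export_to_dict(), indent=2), encoding="utf-8"
    )
    return {"markdown": md_path, "json": json_path}
```

Calling `convert_pdf("report.pdf")` would, under these assumptions, write `output/report.md` and `output/report.json`.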

Processing Options

  • GPU Acceleration: Enable/disable GPU processing
  • Table Mode:
    • Fast: Quick table extraction
    • Accurate: Detailed table structure preservation (slower)
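
The options above could be modelled as a small settings object. This is a minimal sketch under stated assumptions: `ProcessingOptions` and `resolve_device` are illustrative names, and falling back to the CPU when PyTorch or CUDA is unavailable is an assumed behavior, not something the source confirms.

```python
from dataclasses import dataclass

TABLE_MODES = ("fast", "accurate")


@dataclass(frozen=True)
class ProcessingOptions:
    """Hypothetical container for the two processing options."""

    use_gpu: bool = False
    table_mode: str = "fast"

    def __post_init__(self):
        if self.table_mode not in TABLE_MODES:
            raise ValueError(f"table_mode must be one of {TABLE_MODES}")


def resolve_device(options: ProcessingOptions) -> str:
    """Return "cuda" only when the GPU is requested and actually available."""
    if not options.use_gpu:
        return "cpu"
    try:
        import torch  # optional here; imported lazily
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    options = ProcessingOptions(use_gpu=True, table_mode="accurate")
    print(resolve_device(options), options.table_mode)
```

Validating the table mode up front means an unsupported value fails immediately rather than partway through a long conversion.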

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Acknowledgments

  • Special thanks to Docling for their amazing document processing tool
  • Built with Gradio for the user interface
  • Powered by PyTorch and Tesseract OCR

License

This project is licensed under the MIT License - see the LICENSE file for details.