This Python project is a PDF translation tool built on pre-trained machine translation models. It extracts text from a PDF file, translates it into a target language, and generates a new PDF document containing the translated text.
Key features:
- Extracts text from PDF files using PyPDF2.
- Translates extracted text to a target language using pre-trained Transformers models.
- Creates a new PDF document with the translated text using FPDF.

To use the tool:
- Install the required dependencies: PyPDF2, Transformers, FPDF, and PyTorch (for GPU support).
- Point the tool at your input file by replacing 'input.pdf' in the code with the path to your PDF.
- Set the 'target_language' variable to the language you want to translate into.
- Run the code to translate the PDF.

The project depends on:
- PyPDF2
- Transformers
- FPDF
- PyTorch (for GPU support)
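
The extract, translate, and write steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the chunk size, and the choice of the Helsinki-NLP MarianMT model family are all assumptions, since this README does not say which model is used. It also assumes PyPDF2 2.x or later (for `PdfReader`) and English as the default source language.

```python
from typing import Callable, List


def model_name_for(source_language: str, target_language: str) -> str:
    """Build a Hugging Face model id (assumed: Helsinki-NLP opus-mt naming)."""
    return f"Helsinki-NLP/opus-mt-{source_language}-{target_language}"


def extract_text(pdf_path: str) -> str:
    """Return the text of every page, joined by newlines."""
    from PyPDF2 import PdfReader  # lazy import: only needed for real PDFs

    reader = PdfReader(pdf_path)
    # extract_text() may return nothing for image-only pages, hence the fallback.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def chunk_text(text: str, max_chars: int = 400) -> List[str]:
    """Split text on whitespace into pieces of at most max_chars characters.

    Translation models accept a limited input length, so long documents are
    translated piece by piece. A single word longer than max_chars is kept whole.
    """
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def make_translator(model_name: str) -> Callable[[str], str]:
    """Load a MarianMT model and return a function that translates one string."""
    from transformers import MarianMTModel, MarianTokenizer  # lazy import

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    def translate(text: str) -> str:
        batch = tokenizer([text], return_tensors="pt", truncation=True)
        output = model.generate(**batch)
        return tokenizer.decode(output[0], skip_special_tokens=True)

    return translate


def translate_chunks(chunks: List[str], translate_fn: Callable[[str], str]) -> str:
    """Translate each chunk and join the results with spaces."""
    return " ".join(translate_fn(chunk) for chunk in chunks)


def write_pdf(text: str, output_path: str) -> None:
    """Write text to a one-column PDF using FPDF."""
    from fpdf import FPDF  # lazy import

    pdf = FPDF()
    pdf.add_page()
    # Core fonts cover Latin-1 only; other scripts need a Unicode TTF font.
    pdf.set_font("Helvetica", size=12)
    pdf.multi_cell(0, 10, text)
    pdf.output(output_path)


def translate_pdf(input_path: str, output_path: str,
                  target_language: str, source_language: str = "en") -> None:
    """Run the full pipeline: extract, chunk, translate, write."""
    text = extract_text(input_path)
    translate_fn = make_translator(model_name_for(source_language, target_language))
    write_pdf(translate_chunks(chunk_text(text), translate_fn), output_path)
```

With the dependencies installed, a call such as `translate_pdf("input.pdf", "translated.pdf", target_language="fr")` would run the whole pipeline; the output filename here is only a placeholder. Chunking on whitespace is a simple heuristic, and splitting on sentence boundaries would likely give better translations.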
You can install the required dependencies using pip:
pip install -r requirements.txt
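
After installing, a quick check like the one below can confirm that the four listed dependencies are importable and whether PyTorch can see a GPU. This is a sketch using only the standard library; the import names (`PyPDF2`, `transformers`, `fpdf`, `torch`) are the usual ones for these packages but are assumed here, and the helper names are hypothetical.

```python
import importlib.util
from typing import Dict, List

# Display name -> import name (assumed; adjust if your environment differs).
REQUIRED_MODULES: Dict[str, str] = {
    "PyPDF2": "PyPDF2",
    "Transformers": "transformers",
    "FPDF": "fpdf",
    "PyTorch": "torch",
}


def missing_dependencies(modules: Dict[str, str] = REQUIRED_MODULES) -> List[str]:
    """Return the display names of modules that cannot be found."""
    return [
        label
        for label, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


def cuda_available() -> bool:
    """Report whether PyTorch is installed and detects a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()
```

Calling `missing_dependencies()` returns an empty list when everything is installed, and `cuda_available()` returns `False` on CPU-only machines, where translation still works but runs more slowly.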