/translate-pdf

Translates text in a PDF file from one language to another

Primary language: Python. License: MIT.

Translate text in PDF file

Python script that uses Google Translate to translate text in a PDF file

Prerequisites:

Note: Only used and tested with Python 3

Google Cloud SDK

Google API Client Library for Python

pdftotext as a Python module
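
A quick way to confirm that the two Python prerequisites are importable is sketched below. It is not part of this repository; it assumes the packages are installed under their usual import names (googleapiclient for the Google API Client Library, pdftotext for the pdftotext module). It does not check the Google Cloud SDK, which is installed separately.

```python
import importlib.util

# Assumption: the prerequisites are installed under their usual import names.
REQUIRED_MODULES = ("googleapiclient", "pdftotext")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All Python prerequisites found.")
```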

How To Use:

  1. Complete the Prerequisites above

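  2. Generate Google API credentials and make them available to the script (for example, Google's client libraries can read a service-account key file from the path in the GOOGLE_APPLICATION_CREDENTIALS environment variable)

  3. Run the main.py script as follows: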

    python3 main.py <PDF_input_file> <target_language> <name_of_output_file>
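
The three positional arguments in the usage line above could be parsed as in the following sketch. This is an illustration only and not necessarily how main.py handles its arguments; the function name parse_args, the attribute names, and the assumption that the target language is given as a language code such as "en" are all hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the three positional arguments shown in the usage line."""
    parser = argparse.ArgumentParser(
        description="Translate the text of a PDF file with Google Translate."
    )
    parser.add_argument("pdf_input_file", help="path to the PDF file to translate")
    # Assumption: the target language is passed as a language code such as 'en'.
    parser.add_argument("target_language", help="language to translate into")
    parser.add_argument("output_file", help="name of the text file to write")
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Example values only; replace them with real paths and a real language.
    args = parse_args(["input.pdf", "en", "output.txt"])
    print(args.pdf_input_file, args.target_language, args.output_file)
```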

Expected output:

  • The translated text of each page is printed to the console (stdout)
  • The named output file contains all of the translated text, with a page-number heading above each page's text that matches its page in the PDF file
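
The output behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the function name write_translated_pages, the injected translate callable (standing in for the Google Translate call), and the exact "Page N" heading format are assumptions.

```python
def write_translated_pages(pages, translate, output_path):
    """Translate each page, print it, and write it to the output file under a page heading."""
    with open(output_path, "w", encoding="utf-8") as out:
        # Pages are numbered from 1 so headings line up with the PDF's pages.
        for page_number, text in enumerate(pages, start=1):
            translated = translate(text)
            print(translated)  # console (stdout) output, one page at a time
            # Assumption: a plain "Page N" line is used as the heading.
            out.write(f"Page {page_number}\n{translated}\n\n")
```

Passing the translator in as a callable keeps the file-writing logic independent of the translation service, so it can be exercised with a stand-in function such as str.upper before any credentials are configured.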

Sample data:

The /sample directory contains a PDF file that can be used for testing

The file contains the French text of Edgar Allan Poe's poem "The Raven"