/SubtitleTranslatorAI

Automatically translate your subtitle file(s) in bulk using AI! 🇺🇳

Primary LanguagePythonGNU Affero General Public License v3.0AGPL-3.0

Subtitle Translator AI

A CLI to translate one or more subtitle file(s) using openAI API. Provide your own API key. Provide your own subtitle files.

Supported file formats:

  • .srt

Setup and Install

Python Environment

This code is designed to run on Python 3.9.18 and tested running on Linux and Windows. Find the list of required libraries in requirement.txt or /environment.yml. The latter file can be used in conjunction with Anaconda to set up a compatible environment.

API Key

Generate and provide your own openAI API key. Visit https://platform.openai.com/ .

File Structure

Place all files into one directory. They will be translated one at a time. Optionally, you can designate a different directory to place translated files. Translated files will not override input files.

Generate and place your API key in a single plain text file.

Usage

Follow the instructions outlined above and execute the following command:

$ python translator.py -key /path/to/api_key.txt -l de

This minimum working example will translate all valid subtitles files found in the same directory it is run in, into German. Provide the path to the api key with the -key argument. The language to translate into is derived from the -l argument (in the example shown it's German).

Required Parameters

The following parameters are required to use (You can always run -h for more information about all command line arguments) :

--api_key

Shortened to -key. Data type: str. Filepath for a text file that contains your open AI api key.

--language

Shortened to -l. Data type: str. The language to translate into. Recommended to use two character country code (ISO 3166-1: alpha-2). Example: "de" for "Germany", "es" for Spain, etc. For more info see: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2

Optional Parameters

Beyond the presented parameters, customize the usage with the following command line arguments:

--tokens_per_minute

Shortened to -tpm. Data type: int. Maximum tokens per minute to use. Typical rate limit reached for gpt-3.5-turbo is 160000. For more info see: https://platform.openai.com/account/rate-limits If you use more tokens per minute than specified, the program waits for a minute. If unspecified, no limit is used.

--input

Shortened to -i. Data type: str. Single file or directory to translate. If a single file is specified, only that file is translated (if a valid subtitle file). If a directory is specified, all valid subtitle files in that directory are translated one at a time. If unspecified, the current directory (where the code is run in) is used.

--output

Shortened to -o. Data type: str. Directory to save the translated files in. If unspecified, the input directory (see argument --input) is used.

--delay

Shortened to -d. Data type: float. One subtitle line is translated at a time. Delay (in seconds) between every API call. Use to handle rate limit. If unspecified, no delay is used.

--keep_history

Shortened to -k. Data type: bool. For every file translated, a session history is kept. Therefore, more cohesive translations can be achieved (as the model remembers previous lines spoken). This option raises tokens count significantly. Keep cost and rate limits in mind! If unspecified, no history is used.

Customize prompts

The initial prompt for translating subtitle lines is specified in /initial_prompts.py. Feel free to edit this prompt to suit your desires.