A CLI to translate one or more subtitle file(s) using openAI API. Provide your own API key. Provide your own subtitle files.
Supported file formats:
- .srt
This code is designed to run on Python 3.9.18 and tested running on Linux and Windows.
Find the list of required libraries in requirement.txt
or /environment.yml
.
The latter file can be used in conjunction with Anaconda to set up a compatible environment.
Generate and provide your own openAI API key. Visit https://platform.openai.com/ .
Place all files into one directory. They will be translated one at a time. Optionally, you can designate a different directory to place translated files. Translated files will not override input files.
Generate and place your API key in a single plain text file.
Follow the instructions outlined above and execute the following command:
$ python translator.py -key /path/to/api_key.txt -l de
This minimum working example will translate all valid subtitles files found in the same directory it is run in, into German.
Provide the path to the api key with the -key
argument.
The language to translate into is derived from the -l
argument (in the example shown it's German).
The following parameters are required to use
(You can always run -h
for more information about all command line arguments)
:
Shortened to -key
.
Data type: str
.
Filepath for a text file that contains your open AI api key.
Shortened to -l
.
Data type: str
.
The language to translate into.
Recommended to use two character country code (ISO 3166-1: alpha-2).
Example: "de" for "Germany", "es" for Spain, etc.
For more info see: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2
Beyond the presented parameters, customize the usage with the following command line arguments:
Shortened to -tpm
.
Data type: int
.
Maximum tokens per minute to use.
Typical rate limit reached for gpt-3.5-turbo is 160000.
For more info see: https://platform.openai.com/account/rate-limits
If you use more tokens per minute than specified, the program waits for a minute.
If unspecified, no limit is used.
Shortened to -i
.
Data type: str
.
Single file or directory to translate.
If a single file is specified, only that file is translated (if a valid subtitle file).
If a directory is specified, all valid subtitle files in that directory are translated one at a time.
If unspecified, the current directory (where the code is run in) is used.
Shortened to -o
.
Data type: str
.
Directory to save the translated files in.
If unspecified, the input directory (see argument --input
) is used.
Shortened to -d
.
Data type: float
.
One subtitle line is translated at a time.
Delay (in seconds) between every API call.
Use to handle rate limit.
If unspecified, no delay is used.
Shortened to -k
.
Data type: bool
.
For every file translated, a session history is kept.
Therefore, more cohesive translations can be achieved (as the model remembers previous lines spoken).
This option raises tokens count significantly.
Keep cost and rate limits in mind!
If unspecified, no history is used.
The initial prompt for translating subtitle lines is specified in /initial_prompts.py
.
Feel free to edit this prompt to suit your desires.