Automated translating of Jekyll pages via ChatGPT: all you need is just an OpenAI API key

Translator of Jekyll Pages via ChatGPT

If you have a Jekyll static site, this plugin may help you automatically translate its pages into another language through ChatGPT. See how it works for my blog: for example, this page is translated to English.

Install it first (you need Ruby 3+ and Jekyll 3+):

gem install jekyll-chatgpt-translate

Then, add this to _config.yml:

plugins:
  - ... your other plugins here ...
  - jekyll-chatgpt-translate
chatgpt-translate:
  model: gpt-3.5-turbo
  source: en
  layout: translated
  targets: 
    - 
      language: zh
      permalink: :year-:month-:day-:slug-chinese.html
      layout: chinese-translated
    - 
      only: ru-post
      language: fr
      permalink: :year-:month-:day-:title-french.html

Here, the source language is English (en) and the targets are Chinese (zh) and French (fr). The layout for Chinese is _layouts/chinese-translated.html and the layout for French is _layouts/translated.html (you must have both files).
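The layout fallback described above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: the helper `layout_for` is hypothetical and is not part of the plugin's API, and the configuration is written as a plain Ruby hash instead of being read from _config.yml.

```ruby
# Hypothetical helper, not the plugin's own code: picks the layout for one
# target, falling back to the top-level layout and then to "translated".
def layout_for(config, target)
  target['layout'] || config['layout'] || 'translated'
end

# The same settings as in the _config.yml fragment, as a Ruby hash
# (permalinks are left out because they do not affect the layout choice).
config = {
  'source' => 'en',
  'layout' => 'translated',
  'targets' => [
    { 'language' => 'zh', 'layout' => 'chinese-translated' },
    { 'only' => 'ru-post', 'language' => 'fr' }
  ]
}

config['targets'].each do |target|
  puts "#{target['language']}: _layouts/#{layout_for(config, target)}.html"
end
```

The Chinese target has its own layout, while the French target has none and therefore inherits the top-level one.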

The OpenAI API key must be set in the OPENAI_API_KEY environment variable; otherwise the plugin will not do any translation and will not generate translated pages. You can get your key here.

The OpenAI API base URL can be customized through the OPENAI_API_BASE environment variable. If this variable is not set, the default value is https://api.openai.com/.
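As a rough sketch of how these two variables could be read, the Ruby snippet below resolves the key and the base URL from the environment. The helper `openai_settings` is hypothetical and the plugin's own code may differ; treating an empty value the same as an unset one is also an assumption made here. The api_key and api_key_file options are not covered.

```ruby
# Default base URL, as stated in the text above.
DEFAULT_API_BASE = 'https://api.openai.com/'

# Hypothetical helper for illustration only. Accepts any hash-like object,
# so it can be called with ENV or with a plain Hash.
def openai_settings(env = ENV)
  key = env['OPENAI_API_KEY']
  base = env['OPENAI_API_BASE']
  {
    key: key,
    # Assumption: an empty variable is treated like a missing one.
    base: (base.nil? || base.empty?) ? DEFAULT_API_BASE : base,
    enabled: !(key.nil? || key.empty?)
  }
end

settings = openai_settings
if settings[:enabled]
  puts "Translating through #{settings[:base]}"
else
  puts 'OPENAI_API_KEY is not set, so no pages would be translated'
end
```

Before a build, the variables would typically be exported in the shell that runs Jekyll.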

Inside the original page you can use {{ page.chatgpt-translate.urls[XX] }} in order to render the URL of the translated page, where XX is the ISO-639-1 code of the target language. Inside the translated page you can use {{ page.chatgpt-translate.original-url }} in order to get the URL of the page that was translated.

You can also use {{ page.chatgpt-translate.model }} inside both the original page and the translated one, to refer to the model of ChatGPT. The presence of {{ page.chatgpt-translate }} means that the page was translated or the translated HTML was downloaded and placed into the _site directory.

Options

Full list of options available to specify in _config.yml:

  • api_key_file (optional) — the file with the OpenAI API key. If this option is not specified, the key is expected in the OPENAI_API_KEY environment variable.

  • api_key (optional) — the OpenAI API key itself. It is a very bad idea to specify it right in the _config.yml file, but it is still possible.

  • model (optional) — the ChatGPT model to use; examples are here.

  • source (optional) — is the ISO-639-1 code of the source language.

  • no_download (optional) — if this attribute is present, the plugin won't try to find HTML versions of translated pages on the Internet and won't download them into the _site directory. Thus, your entire site will have to be re-translated on every build (which may be very inefficient if the site is big!).

  • min_chars (optional) — minimum number of characters a paragraph must contain in order to be sent to ChatGPT. The robot doesn't translate short paragraphs well enough, so it's better to keep this number large to avoid silly translations. The default is 128.

  • window_length (optional) — maximum number of words to be sent to OpenAI API in one request. The default is 2048.

  • layout (optional) — the name of the file in the _layouts directory, without the extension. This layout will be specified for the pages generated by this plugin. The default value is translated (you are expected to have the _layouts/translated.html file).

  • targets (mandatory) — an array of target languages, each of which has the following attributes:

    • only (optional) — if this is present, only the posts with the provided layout will be translated to this target

    • language (mandatory) — ISO-639-1 code of the target language

    • source (optional) — ISO-639-1 code of the source language (overrides the value of the source defined above)

    • permalink (mandatory) — template to use for newly generated pages

    • layout (optional) — the name of the file in the _layouts directory

  • threshold (optional) — maximum number of pages to generate in one build cycle. The default value is 1024. It is recommended to use a smaller number in order to avoid overly long builds. You can run the build again and the missing pages will be generated, so the entire site will be translated over a few builds.

  • version (optional) — the version that will be attached to each generated page, in order to avoid repetitive translations on the one hand and to enable re-translation when the version changes on the other. By default, the version of this plugin is used, unless you set your own value.

  • tmpdir (optional) — the name of the directory where temporary files are kept; _chatgpt-translate is the default value.
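To make the min_chars and window_length options more concrete, here is a minimal Ruby sketch of how paragraphs could be filtered by length and then grouped into requests by word count. It is not the plugin's implementation: the helpers `translatable?` and `windows`, the "at least min_chars" comparison, and the greedy grouping rule are all assumptions made for illustration.

```ruby
MIN_CHARS = 128       # default of min_chars, from the list above
WINDOW_LENGTH = 2048  # default of window_length, counted in words

# Hypothetical check: a paragraph qualifies for translation only if it
# has at least min_chars characters.
def translatable?(paragraph, min_chars = MIN_CHARS)
  paragraph.length >= min_chars
end

# Hypothetical greedy grouping: each group holds at most window_length words.
# A single paragraph longer than the window still gets a group of its own.
def windows(paragraphs, window_length = WINDOW_LENGTH)
  batches = [[]]
  words = 0
  paragraphs.each do |par|
    count = par.split.size
    if words + count > window_length && !batches.last.empty?
      batches << []
      words = 0
    end
    batches.last << par
    words += count
  end
  batches.reject(&:empty?)
end

paragraphs = ['Short note.', 'word ' * 1500, 'word ' * 1000]
long_enough = paragraphs.select { |par| translatable?(par) }
puts "#{long_enough.size} of #{paragraphs.size} paragraphs qualify"
puts "#{windows(long_enough).size} request(s) would be needed"
```

In this example the short paragraph is left untranslated, and the two long ones do not fit into a single 2048-word window, so they would go out in separate requests.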

How to Contribute

Make a fork and then test it locally like this:

bundle update
bundle exec rake

If it works, make changes, test again, and then submit a pull request.

In order to run a single test, do this:

bundle exec ruby test/test_generator.rb