Multilingual Experiment in Jekyll
The basic GitHub Pages site hosted in this repository and this README illustrate my approach to create a multilingual site in Jekyll.
Table of Contents
- Preface
- Foreword
- Directory Structure
- Front Matter
- Data Files
- Includes
- Multilingual Sitemaps
- RSS Feed
- 404 Page Not Found
- Resources
- Afterword
Preface
When I found myself coding a multilingual site in Jekyll, I stumbled on a lot of useful resources while surfing the Web, but I struggled not a little while trying to digest and replicate their approaches because of the lack of a concrete, working example to look at.
At first, I tried to replicate their approaches directly in the site I was working on, but this quickly backfired because it proved to be too big of a bite to chew for a designer who codes.
Not giving up, I then opted for creating a basic site from scratch, so that I could just focus on experimenting with multiple languages in Jekyll without any extra complexity in the picture.
That very same basic site is hosted in this repository, which I gladly share with the world as an example project, hoping to be of a help for anybody who is into coding a multilingual site using Jekyll.
Foreword
A few words before starting. Sites built with the approach illustrated in this README:
- can support as many languages as needed
- can serve pages or posts that do not necessarily need to be translated in all the supported languages
- have a language switch that can either direct web surfers to view the current page or post in the selected language, if available, or can direct them to an alternative fallback page
- do not need you to install custom plugins
- leverage the basics of Jekyll and thus should be relatively future-proof (last famous words)
- can be published as GitHub Pages sites
Specifically, the basic site hosted in this repository and used as an example:
- is visually quite crude, since the focus is on illustrating a structural (not visual) approach to building multilingual sites
- supports English and Italian as example languages
Directory Structure
The directory structure of this basic site looks like this:
.
├── _data
│ └── snippets.yml
├── _includes
│ ├── head.html
│ ├── header.html
│ ├── localizations.html
│ └── references.html
├── _layouts
│ ├── base.html
│ ├── index.html
│ ├── page.html
│ └── post.html
├── _posts
│ ├── en
│ │ ├── YYYY-MM-DD-title.markdown
│ │ ├── …
│ │ └── YYYY-MM-DD-title.markdown
│ └── it
│ ├── YYYY-MM-DD-titolo.markdown
│ ├── …
│ └── YYYY-MM-DD-titolo.markdown
├── 404.html
├── config.yml
├── en
│ ├── drafts.html
│ ├── feed.xml
│ ├── postface.html
│ ├── preface.html
│ ├── sitemap.xml
│ └── stories.html
├── index.html
├── it
│ ├── bozze.html
│ ├── feed.xml
│ ├── prefazione.html
│ ├── sitemap.xml
│ └── storie.html
└── sitemap.xml
Pages
We organize the pages into as many subdirectory as the languages that we plan to support, and name them using ISO language codes. This basic site has two subdirectories, one named en
for grouping the English pages, and one named it
for grouping the Italian pages.
├── …
├── en
│ ├── drafts.html
│ ├── feed.xml
│ ├── postface.html
│ ├── preface.html
│ ├── sitemap.xml
│ └── stories.html
├── …
├── it
│ ├── bozze.html
│ ├── feed.xml
│ ├── prefazione.html
│ ├── sitemap.xml
│ └── storie.html
├── …
After Jekyll has built the site, we can reach, for example, the English page stories.html
and the Italian page storie.html
at the URLs www.site.ext/en/stories.html
and www.site.ext/it/storie.html
, respectively.
Exceptions
But, of course, there are exceptions. We place the pages 404.html
, index.html
, and sitemap.html
in the root directory of the site. Why?
404.html
and index.html
are unique pages because Jekyll builds and serves automatically one and only one of them at a time.
sitemap.xml
instead is none other than a Sitemap index which points to the other localized sitemaps in the respective language subfolders (read the section Multilingual Sitemap for more details).
Posts
We organize the posts following a similar logic. This basic site has two subdirectories in the folder named _posts
, one named en
for grouping the English posts, and one named it
for grouping the Italian posts.
├── …
├── _posts
│ ├── en
│ │ ├── YYYY-MM-DD-title.markdown
│ │ ├── …
│ │ └── YYYY-MM-DD-title.markdown
│ └── it
│ ├── YYYY-MM-DD-titolo.markdown
│ ├── …
│ └── YYYY-MM-DD-titolo.markdown
├── …
Configuration
We then add the following configuration options in the _config.yml
file placed in the site’s root directory:
defaults:
-
scope:
path: '_posts/en'
type: 'posts'
values:
permalink: 'en/story/:title'
language: en
-
scope:
path: '_posts/it'
type: 'posts'
values:
permalink: 'it/storia/:title'
language: it
By setting global permalinks for posts, we can reach, for example, the English post named 2021-01-01-hello-world.markdown
and the Italian post named 2021-01-01-ciao-mondo.markdown
at the URLs www.site.ext/en/hello-world.html
and www.site.ext/it/ciao-mondo.html
, respectively.
Front Matter
Pages
Here is how the front matter of a page looks like:
---
layout: page
title: Stories
description: Stories.
language: en
language_reference: stories
published: true
---
But for the usual variables, we set two new ones, language
to define the language of the page, and language_reference
to relate different translations of the same page. The logic is based on the principle articulated in Sylvain Durand’s Making Jekyll Multilingual.
For example, here is the front matter of the English page Stories:
---
layout: page
title: Stories
description: Stories.
language: en
language_reference: stories
published: true
---
and here is the front matter of its Italian counterpart:
---
layout: page
title: Storie
description: Storie.
language: it
language_reference: stories
published: true
---
Both pages have the variable language_reference
set to stories
so that they can be easily related.
We can use language
to retrieve only the pages that have the same language, and language_reference
to retrieve only the pages that return the same content translated in different languages.
Posts
Here is how the front matter of a post looks like:
---
layout: post
title: Hello World
description: Hello world.
date: 2021-01-01 00:00:00
language: en
language_reference: world
published: true
---
Again, but for the usual variables, we set two new ones, language
to define the language of the post, and language_reference
to relate different translations of the same post.
For example, here is the front matter of the English post Hello World:
---
layout: post
title: Hello World
description: Hello world.
date: 2021-01-01 00:00:00
language: en
language_reference: world
published: true
---
and here is the front matter of its Italian counterpart:
---
layout: post
title: Ciao Mondo
description: Ciao Mondo.
date: 2021-01-01 00:00:00
language: it
language_reference: world
published: true
---
Both posts have the variable language_reference
set to world
so that they can be easily related.
Again, we can use language
to retrieve only the posts that have the same language, and language_reference
to retrieve only the posts that return the same content translated in different languages.
Data Files
Snippets
We create a YAML Data File named snippets.yml
to store the different translations of the user interface copy as additional data in the _data
subdirectory.
We then create a new variable named snippets
in the base.html
layout to shorten the code that we need to write to access the data contained in the snippets.yml
file:
{%- assign snippets = site.data.snippets %}
Since the base.html
layout works as the base for all the other layouts, if we place the variable snippets
there, we can then call it from any page.
Through this variable, we can write just snippets.name_of_the_data_item
when accessing a data item rather than the full, longer site.data.snippets.name_of_the_data_item
.
For example, the piece of code that generates Back to the Top link at the bottom of the page:
<a href="#{{ snippets.top[page.language] | slugify: 'latin' }}">{{ snippets.back[page.language] }}</a>
uses the following variable:
{{ snippets.back[page.language] }}
to retrieve the name of the link in the current selected language from the following lines in the snippets.yml
data file:
back:
en: Back to the Top
it: Torna in Cima
top:
en: Top
it: Cima
Includes
The purpose of most of the includes in this basic site is building the navigation.
header.html
The include header.html
generates the header in the HTML page. It, in turn, has three more includes:
title.html
navigation.html
language-switch.html
<header>
{% include site-title.html %}
<nav>
{% include navigation.html %}
{% include language-switch.html %}
</nav>
</header>
navigation.html
The include navigation.html
generates an unordered list containing all the published pages having the same language
variable as the current page.
<ul>
{%- assign navigation_pages = site.pages
| where: 'layout', 'page'
| where: 'language', page.language
| where: 'published', true
| sort: 'order' %}
{%- for navigation_page in navigation_pages %}
<li{%- if navigation_page.title == page.title %} class="current"{%- endif %}>
<a href="{{ site.baseurl }}{{ navigation_page.url }}">{{ navigation_page.title }}</a>
</li>
{%- endfor %}
</ul>
In the code above, we create a new variable named navigation_pages
which returns a list of the pages that, in their front matter, have:
- the
layout
variable set topage
- the
language
variable set to the language of the current page (page.language
) - the
published
variable set totrue
and we order the list according to the order
variable. We then loop trough the array of pages and generate the list items of the unordered list.
Whenever the title of the current page in the array (navigation_page.title
) matches the title of the current page (page.title
), we add a class named current
to the corresponding <li/>
tag.
language-switch.html
The include language-switch.html
generates an unordered list containing all the languages supported in the site. You can use the list to switch to one of the other language translations of the current page/post, if available.
<ul>
{%- for language in snippets.languages %}
{%- if page.layout == 'page' %}
{%- assign navigation_pages = site.pages
| where: 'language_reference', page.language_reference
| where: 'language', language[1].slug %}
{%- if navigation_pages.size == 1 %}
{%- for navigation_page in navigation_pages %}
{%- assign url = site.baseurl | append: navigation_page.url %}
{%- endfor %}
{%- else %}
{%- assign navigation_pages = site.pages
| where: 'language_reference', site.fallback_page
| where: 'language', language[1].slug %}
{%- for navigation_page in navigation_pages %}
{%- assign url = site.baseurl | append: navigation_page.url %}
{%- endfor %}
{%- endif %}
{%- elsif page.layout == 'post' %}
{%- assign navigation_posts = site.posts
| where: 'language_reference', page.language_reference
| where: 'language', language[1].slug %}
{%- if navigation_posts.size == 1 %}
{%- for navigation_post in navigation_posts %}
{%- assign url = site.baseurl | append: navigation_post.url %}
{%- endfor %}
{%- else %}
{%- assign navigation_pages = site.pages
| where: 'language_reference', site.fallback_page
| where: 'language', language[1].slug %}
{%- for navigation_page in navigation_pages %}
{%- assign url = site.baseurl | append: navigation_page.url %}
{%- endfor %}
{%- endif %}
{%- else %}
{%- assign navigation_pages = site.pages
| where: 'language_reference', site.fallback_page
| where: 'language', language[1].slug %}
{%- for navigation_page in navigation_pages %}
{%- assign url = site.baseurl | append: navigation_page.url %}
{%- endfor %}
{%- endif %}
<li{%- if language[1].slug == page.language %} class="current"{%- endif %}>
<a href="{{ url }}">{{ language[1].value }}</a>
</li>
{%- endfor %}
</ul>
In the code above, we loop through the languages defined in the snippets.html
file (read the section Snippets for more details).
languages:
en:
value: English
slug: en
it:
value: Italian
slug: it
The for loop contains three different code blocks that are run only if specific conditions are met. If we were to look only at its high-level structure:
<ul>
{%- for language in snippets.languages %}
{%- if page.layout == 'page' %}
<!-- first code block -->
{%- elsif page.layout == 'post' %}
<!-- second code block -->
{%- else %}
<!-- third code block -->
{%- endif %}
<li {%- if language[1].slug == page.language %} class="current"{%- endif %}>
<a href="{{ url }}">{{ language[1].value }}</a>
</li>
{%- endfor %}
</ul>
We run the first block of code only if the layout
variable of the current page is set to page
, else, if it is set to post
, we run the second block of code, else, if it is set to anything else (or to nothing at all), we run the third block of code.
After at least one of the code blocks has been run, we generate the list items of the unordered list.
Whenever the slug of the current language item of the array snippets.languages
(language[1].slug
) matches the language of the current page (page.language
), we add a class named current
to the corresponding <li/>
tag.
if page.layout == 'page'
{%- if page.layout == 'page' %}
{%- assign navigation_pages = site.pages
| where: 'language_reference', page.language_reference
| where: 'language', language[1].slug %}
{%- if navigation_pages.size == 1 %}
{%- for navigation_page in navigation_pages %}
{%- assign url = site.baseurl | append: navigation_page.url %}
{%- endfor %}
{%- else %}
{%- assign navigation_pages = site.pages
| where: 'language_reference', site.fallback_page
| where: 'language', language[1].slug %}
{%- for navigation_page in navigation_pages %}
{%- assign url = site.baseurl | append: navigation_page.url %}
{%- endfor %}
{%- endif %}
What does the first block of code do?
{%- assign navigation_pages = site.pages
| where: 'language_reference', page.language_reference
| where: 'language', language[1].slug %}
We create a new variable named navigation_pages
which returns a list of the pages that, in their front matter, have:
- the
language_reference
variable equal to the current page’slanguage_reference
variable (page.language_reference
) - the
language
variable equal to the slug of the current language item (language[1].slug
) in the arraysnippets.languages
If we set the front matter of the pages correctly, the size of the array navigation_pages
should be:
- either equal to one if the current page has a corresponding page translated in the current language item of the array
snippets.languages
- or equal to zero if the current page does not have a corresponding page translated in the current language item of the array
snippets.languages
{%- if navigation_pages.size == 1 %}
{%- for navigation_page in navigation_pages %}
{%- assign url = site.baseurl | append: navigation_page.url %}
{%- endfor %}
If the size of the array navigation_pages
is equal to one, we loop through the array navigation_pages
and create a new variable named url
by combining the site.baseurl
(defined in the _config.yml
file) and the url of the one page (navigation_page.url
) contained in the array navigation_pages
.
{%- else %}
{%- assign navigation_pages = site.pages
| where: 'language_reference', site.fallback_page
| where: 'language', language[1].slug %}
{%- for navigation_page in navigation_pages %}
{%- assign url = site.baseurl | append: navigation_page.url %}
{%- endfor %}
{%- endif %}
If instead, the size of the array navigation_pages
is equal to zero (or more than one, which is trouble), we do not have a corresponding page in the current language item of the array snippets.languages
to switch to.
Thus, we provide a fallback page (site.fallback_page
) so that web surfers who interact with the language switch and press on a language that does not support the current page are at least redirected to a meaningful page in the language they selected.
We set the fallback_page
in the _config.yml
file placed in the site’s root directory:
fallback_page: 'stories'
The fallback pages of this basic site are those whose language_reference
variable is set to stories
.
Why stories
? Because the pages whose language_reference
variable is set to stories
work as home pages, since they:
- return a list of all the published posts (they have exactly the same structure as the
index.html
page) - have a translated counterpart in all the languages supported on the site
elsif page.layout == 'post'
{%- elsif page.layout == 'post' %}
{%- assign navigation_posts = site.posts
| where: 'language_reference', page.language_reference
| where: 'language', language[1].slug %}
{%- if navigation_posts.size == 1 %}
{%- for navigation_post in navigation_posts %}
{%- assign url = site.baseurl | append: navigation_post.url %}
{%- endfor %}
{%- else %}
{%- assign navigation_pages = site.pages
| where: 'language_reference', site.fallback_page
| where: 'language', language[1].slug %}
{%- for navigation_page in navigation_pages %}
{%- assign url = site.baseurl | append: navigation_page.url %}
{%- endfor %}
{%- endif %}
The second block of code behaves akin to the first, with the only difference that we manipulate an array of posts (navigation_posts
) rather than one of pages (navigation_pages
).
else
{%- else %}
{%- assign navigation_pages = site.pages
| where: 'language_reference', site.fallback_page
| where: 'language', language[1].slug %}
{%- for navigation_page in navigation_pages %}
{%- assign url = site.baseurl | append: navigation_page.url %}
{%- endfor %}
The third block of code runs in the remote eventuality in which both the first and second blocks of code are not run, so that we make sure, again, to serve a fallback page to our web surfers.
Fallback Page
How can we be sure that the fallback page truly works?
In this basic site, not all the pages and posts are translated into all the supported languages—on purpose.
Pages
English | Italian |
---|---|
preface.html | prefazione.html |
stories.html | storie.html |
postface.html | — |
If you go to the English page Postface and press on Italian in the language switch, you can see that you are indeed redirected to the Italian page Storie.
Posts
English | Italian |
---|---|
hello-world.markdown | ciao-mondo.markdown |
hello-mars.markdown | ciao-marte.markdown |
— | ciao-giove.markdown |
Similarly, if you go to the Italian post Ciao Giove and press on English in the language switch, you can see that you are indeed redirected to the English page Stories.
title.html
The include title.html
generates the title of this basic site.
{%- if page.language == site.default_language %}
{%- assign url = site.baseurl | append: '/'%}
{%- else %}
{%- assign navigation_pages = site.pages
| where: 'language_reference', site.fallback_page
| where: 'language', page.language %}
{%- for navigation_page in navigation_pages %}
{%- assign url = site.baseurl | append: navigation_page.url %}
{%- endfor %}
{%- endif %}
<h1>
<a href="{{ url }}" {%- if page.url == '/' %} class="current"{%- endif %}>{{ site.title }}</a>
</h1>
Again, we have two different code blocks that are run only if specific conditions are met.
We run the first code block when the language of the current page (page.language
) is equal to the default language (site.default_language
) defined in the _config.yml
file. Through it we create a new variable named url
by combining the site.baseurl
(defined in the _config.yml
file) and /
, that is, the domain name of the site. Web surfers who browse the site in the default language are directed to the main page when they press on the title.
Else, we run the second code block to provide the usual fallback page already discussed above (read the section language-switch.html for more details). Web surfers who browse the site in a language different than the default one are directed to the fallback page in their current language when they press on the title.
localizations.html
The include localizations.html
adds <link rel="alternate" … />
tags in the <head/>
tag of a page to tell search engines if there are multiple versions of the page for different languages or regions.
{%- if page.layout == 'page' %}
{%- assign localized_pages = site.pages
| where: 'language_reference', page.language_reference
| sort: 'language' %}
{%- for localized_page in localized_pages %}
<link rel="alternate" hreflang="{{ localized_page.language }}" href="{{ site.baseurl }}{{ localized_page.url }}" />
{%- endfor %}
{%- elsif page.layout == 'post' %}
{%- assign localized_posts = site.posts
| where: 'language_reference', page.language_reference
| sort: 'language' %}
{%- for localized_post in localized_posts %}
<link rel="alternate" hreflang="{{ localized_post.language }}" href="{{ site.baseurl }}{{ localized_post.url }}" />
{%- endfor %}
{%- elsif page.layout == 'index' %}
{%- assign localized_pages = site.pages
| where: 'language_reference', site.fallback_page
| sort: 'language' %}
{%- for localized_page in localized_pages %}
<link rel="alternate" hreflang="{{ localized_page.language }}" href="{{ site.baseurl }}{{ localized_page.url }}" />
{%- endfor %}
{%- endif %}
Again, we have three different code blocks that are run only if specific conditions are met (read the section language-switch.html for more details).
Multilingual Sitemaps
To serve a multilingual sitemap, we need to create a Sitemap index file and list a Sitemap file for each language we support.
Sitemap Index File
We place the page named sitemap.html
in the root directory of the site. It points to the other localized sitemaps in the respective language subfolders.
---
layout: none
sitemap:
excluded: true
---
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{%- assign pages = site.pages | where: 'language_reference', 'sitemap' %}
{%- for page in pages %}
<sitemap>
<loc>{{ site.absoluteurl }}{{ page.url | remove: 'index.html' }}</loc>
{%- if page.sitemap.lastmod %}
{%- assign lastmod = page.sitemap.lastmod | date: '%Y-%m-%d' %}
{%- elsif page.date %}
{%- assign lastmod = page.date | date_to_xmlschema %}
{%- else %}
{%- assign lastmod = site.time | date_to_xmlschema %}
{%- endif %}
<lastmod>{{ lastmod }}</lastmod>
</sitemap>
{%- endfor %}
</sitemapindex>
By setting the following variables in the front matter of the Sitemap index file:
sitemap:
excluded: true
we make sure to exclude it from the list of pages returned in the other Sitemap files.
Sitemap Files
---
layout: none
title: English Sitemap
language: en
language_reference: sitemap
sitemap:
excluded: true
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{%- assign posts = site.posts | sort: 'date' | where: 'language', page.language | where: 'published', true %}
{%- for post in posts reversed %}
{%- unless post.sitemap.excluded == true %}
<url>
<loc>{{ site.absoluteurl }}{{ post.url }}</loc>
{%- if post.sitemap.lastmod %}
{%- assign lastmod = post.sitemap.lastmod | date: '%Y-%m-%d' %}
{%- elsif post.date %}
{%- assign lastmod = post.date | date_to_xmlschema %}
{%- else %}
{%- assign lastmod = site.time | date_to_xmlschema %}
{%- endif %}
<lastmod>{{ lastmod }}</lastmod>
{%- if post.sitemap.changefreq %}
{%- assign changefreq = post.sitemap.changefreq %}
{%- else %}
{%- assign changefreq = 'monthly' %}
{%- endif %}
<changefreq>{{ changefreq }}</changefreq>
{%- if post.sitemap.priority %}
{%- assign priority = post.sitemap.priority %}
{%- else %}
{%- assign priority = 0.5 %}
{%- endif %}
<priority>{{ priority }}</priority>
</url>
{%- endunless %}
{%- endfor %}
{%- assign pages = site.pages | where: 'language', page.language %}
{%- for page in pages %}
{%- unless page.sitemap.excluded == true %}
<url>
<loc>{{ site.absoluteurl }}{{ page.url | remove: 'index.html' }}</loc>
{%- if post.sitemap.lastmod %}
{%- assign lastmod = page.sitemap.lastmod | date: '%Y-%m-%d' %}
{%- elsif post.date %}
{%- assign lastmod = page.date | date_to_xmlschema %}
{%- else %}
{%- assign lastmod = site.time | date_to_xmlschema %}
{%- endif %}
<lastmod>{{ lastmod }}</lastmod>
{%- if page.sitemap.changefreq %}
{%- assign changefreq = page.sitemap.changefreq %}
{%- else %}
{%- assign changefreq = 'monthly' %}
{%- endif %}
<changefreq>{{ changefreq }}</changefreq>
{%- if page.sitemap.priority %}
{%- assign priority = page.sitemap.priority %}
{%- else %}
{%- assign priority = 0.3 %}
{%- endif %}
<priority>{{ priority }}</priority>
</url>
{%- endunless %}
{%- endfor %}
</urlset>
---
…
sitemap:
lastmod: true
changefreq: 'monthly'
priority: ''
---
RSS Feed
Coming soon…
404 Page Not Found
Coming soon…
Resources
Afterword
If you feel like adding something to the subject and/or you have spotted something worth fixing, please feel free to either drop me a line or create an issue on GitHub: thoughts, critiques, suggestions are all more than welcomed.
Thank you!