ivanhofer/typesafe-i18n-demo-sveltekit

Potential solution for "missing cross-links between your language slugs"


Localization SEO is missing currently. It's even noted in the README file:

What is missing:

  • opinion how to localize slugs
    this highly depends where your data comes from. This will probably differ from project to project.
  • cross-links between your language slugs
    like mentioned above, this will differ from project to project. You can find useful resources here:

This can be solved by implementing Link headers.

This is fairly simple to do, and can be done in the hooks.ts file. Here's what I came up with.

// src/hooks.ts
export const handle: Handle = async ({ event, resolve }) => {
    const response = await resolve(event);

    const [, lang, ...crumbs] = event.url.pathname.split('/')

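    // Use the full origin (scheme + host) so every alternate link is an absolute URL,
    // and spread `crumbs` so the segments are joined with "/" instead of ",".
    const origin = event.url.origin;
    response.headers.set(
        "Link",
        locales.map(
            (locale: Locales) =>
                `<${[origin, locale, ...crumbs].join("/")}>; rel="alternate"; hreflang="${locale}"`
        ).join(", ")
    );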

    const body = await response.text();
    return new Response(body.replace('<html lang="en">', `<html lang="${lang}">`), response)
};

It works without a problem in my testing.

Hi @emreozcan, thanks for providing a snippet on how you handle this!

I know it is not that hard, and it will work for exactly this example. But for a highly SEO-optimized site, your slugs would also change depending on the locale ('/en/company' => '/de/unternehmen'). And here the complexity begins.
I didn't want to include a solution in the example that would be wrong in those cases. But after seeing you open this issue, I have changed my mind 😅. It will probably be fine to add this as an example and write a warning comment on top of the implementation.
I will add it to this example when I find some time.

Just two points I would change about your implementation:

  • You should also use the hreflang="x-default" to mark the default/fallback version of the site.
  • Even though adding the links via headers is supported, I personally would set them inside the head tag of the HTML. I'm not sure if other search engines besides Google also support those headers.
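
Both points can be sketched together. The snippet below is a minimal, hypothetical helper (`buildAlternateLinkTags` and `escapeAttribute` are illustrative names, not part of typesafe-i18n or SvelteKit) that renders one `<link rel="alternate">` tag per locale plus an `x-default` entry pointing at the base locale. Like the snippet in the issue, it assumes the slugs are identical in every locale.

```typescript
// Hypothetical helper: none of these names exist in typesafe-i18n or SvelteKit.

// Escape characters that would break out of a double-quoted HTML attribute.
const escapeAttribute = (value: string): string =>
    value.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");

const buildAlternateLinkTags = (
    origin: string,              // e.g. "https://example.com" (no trailing slash)
    locales: readonly string[],  // all supported locales
    baseLocale: string,          // locale used for the x-default entry
    crumbs: readonly string[],   // path segments after the locale
): string => {
    const hrefFor = (locale: string): string => [origin, locale, ...crumbs].join("/");
    const tagFor = (hreflang: string, locale: string): string =>
        `<link rel="alternate" hreflang="${escapeAttribute(hreflang)}" href="${escapeAttribute(hrefFor(locale))}">`;

    const tags = locales.map((locale) => tagFor(locale, locale));
    // x-default marks the fallback version for visitors whose language matches no locale.
    tags.push(tagFor("x-default", baseLocale));
    return tags.join("\n");
};
```

The returned markup could then be placed in the document head, for example from the layout component.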

Thank you very much for the kind words and the explanation.

Even though it is a hard task, it can be solved. Let's split it into 2 different problems and solve each one.

  1. Resolving requests made to localized slugs.
  2. Locale switching. (Solving this also comes with interlingual links, free of charge 😄)

I have omitted the import/exports of the files I give as an example for the sake of brevity.
Also, I do not know German, so I'll be giving examples in Turkish but hear me out 😅

I will use the following i18n files in my examples:

/*************************
 * src/i18n/en/index.ts
 *************************/
const en: BaseTranslation = {
    slugs: {
        blog: {
            _index: "diary",
            list: "list",
        },
        anotherpage: "yet-another-page",
    }
}

And the Turkish locale,

/*************************
 * src/i18n/tr/index.ts
 *************************/
const tr: Translation = {
    slugs: {
        blog: {
            _index: "gunluk",
            list: "liste",
        },
        anotherpage: "bir-diger-sayfa",
    }
}

1. Resolving requests made to localized slugs.

I was able to solve this by "faking" the request event URL's path. Here's what I came up with:

/*************************
 * src/utils.ts
 *************************/

// ...
// eslint-disable-next-line @typescript-eslint/no-empty-interface
interface Slugs extends Record<string, Slugs | string> { }

export const findCanonicalPath = (localizedSlugs: Slugs, localizedCrumbs: Array<string>): Array<string> => {
    let leafPath: Array<string> = [];
    for (let i = 0; i < localizedCrumbs.length; i++) {
        // We can't use (const ... of ...) style `for` loop because
        // we need the current index for the potential fallback operation.
        const crumb = localizedCrumbs[i];

        // Traverse the slug tree up until the current path
        let leaf: Slugs | string = localizedSlugs;
        for (const key of leafPath) {
            leaf = (<Slugs>leaf)[key];
        }

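        // A string leaf is a page without children, and `_index` only holds the
        // section's own slug, so neither may be matched as a child route.
        const leafKeys = typeof leaf === "string"
            ? []
            : Object.keys(leaf).filter((key) => key !== "_index");
        let keyFound = false;
        for (const leafKey of leafKeys) {
            const innerLeaf = (<Slugs>leaf)[leafKey];
            // A page's slug is the string itself; a section's slug lives under `_index`.
            const slug = typeof innerLeaf === "string" ? innerLeaf : innerLeaf["_index"];
            if (slug === crumb) {
                leafPath.push(leafKey);
                keyFound = true;
                break;
            }
        }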
        // If a suitable match wasn't found, fallback to the original path.
        if (!keyFound) {
            leafPath = leafPath.concat(localizedCrumbs.slice(i));
            break;
        }
    }
    return leafPath;
}
// ...
/*************************
 * src/hooks.ts
 *************************/

// eslint-disable-next-line @typescript-eslint/no-empty-interface
interface Slugs extends Record<string, Slugs | string> {}

export const handle: Handle = async ({ event, resolve }) => {
    // eslint-disable-next-line prefer-const
    let [, lang, ...crumbs] = event.url.pathname.split('/');

    await loadLocaleAsync(<Locales>lang);
    // Internally redirect the request, converting the localized url path to the "canonical" i18n keys.
    const canonicalPath = findCanonicalPath(loadedLocales[<Locales>lang].slugs, crumbs);
    event.url.pathname = [lang, ...canonicalPath].join("/");

    let response = await resolve(event);


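    // Use the full origin (scheme + host) so every alternate link is an absolute URL,
    // and spread `crumbs` so the segments are joined with "/" instead of ",".
    // Note: this still reuses the current locale's crumbs for every locale; with
    // localized slugs each link needs its own translated path.
    const origin = event.url.origin;
    response.headers.set(
        "Link",
        locales.map(
            (locale: Locales) =>
                `<${[origin, locale, ...crumbs].join("/")}>; rel="alternate"; hreflang="${locale}"`
        ).join(", ")
    );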

    response = new Response((await response.text()).replace('<html lang="en">', `<html lang="${lang}">`), response)

    return response;
};

Because we modify the request event's URL path, everything should
"just work", and everything does "just work" in my testing, but I
probably missed some edge cases.

Here's my testing:

  • /en/diary and /tr/gunluk resolved to the route /[lang]/blog
  • /en/diary/list and /tr/gunluk/liste resolved to the route /[lang]/blog/list
  • /en/yet-another-page and /tr/bir-diger-sayfa resolved to the route /[lang]/anotherpage
  • If a suitable case isn't found, that particular crumb isn't touched. For example, /en/diary/latest and /tr/gunluk/en-son resolve to /en/blog/latest and /tr/blog/en-son respectively. (Which is the expected behavior.)
  • Continuing from the last one, anything that comes after an unknown slug isn't touched either. For example, /tr/gunluk/liste/4/8/ becomes /tr/blog/list/4/8. So route parameters etc. aren't broken! ✨✨

So, it is a solved problem in my opinion.

2. Locale switching.

Since we have a way to find the i18n keys of localized URLs, writing the inverse function of this is pretty trivial.

We need to replace the replaceLocaleInUrl = (path: string, locale: string): string function. I came up with this:

/*************************
 * src/utils.ts
 *************************/

// ...
export const localizePath = (toLocaleSlugs: Slugs, canonicalCrumbs: Array<string>): Array<string> => {
    let newPath: Array<string> = [];
    let pathLeaf = toLocaleSlugs;
    for (let index = 0; index < canonicalCrumbs.length; index++) {
        const canonicalCrumb = canonicalCrumbs[index];

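        // Own-property check so crumbs such as "constructor" don't match inherited keys;
        // `_index` holds the section's own slug and is never a route segment itself.
        if (canonicalCrumb !== "_index" && Object.prototype.hasOwnProperty.call(pathLeaf, canonicalCrumb)) {
            const innerLeaf = pathLeaf[canonicalCrumb];
            if (typeof innerLeaf === "string") {
                // A string is a page without children: translate it and keep
                // everything after it (route parameters etc.) untouched.
                newPath = newPath.concat(innerLeaf, canonicalCrumbs.slice(index + 1));
                break;
            } else {
                newPath.push(<string>innerLeaf["_index"]);
                pathLeaf = innerLeaf;
                continue;
            }
        } else {
            newPath = newPath.concat(canonicalCrumbs.slice(index));
            break;
        }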
    }
    return newPath;
}
// ...
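
With both directions available, the cross-links from the issue title can be built per locale. The following is a self-contained sketch, not the code above: `toCanonical` and `fromCanonical` are condensed stand-ins for `findCanonicalPath` and `localizePath`, and `SlugTree`, `translateCrumbs` and `buildLinkHeader` are hypothetical names. It assumes the English section slug is `diary`, as in the testing list, and that the slug trees of all locales are already loaded.

```typescript
// A string is a page; an object is a section whose own slug is stored under `_index`.
interface SlugTree { [key: string]: SlugTree | string }
type SlugNode = SlugTree | string;

const enSlugs: SlugTree = { blog: { _index: "diary", list: "list" }, anotherpage: "yet-another-page" };
const trSlugs: SlugTree = { blog: { _index: "gunluk", list: "liste" }, anotherpage: "bir-diger-sayfa" };

// Slug under which a node is reachable, or undefined if it has none.
const slugOf = (node: SlugNode | undefined): string | undefined => {
    if (node === undefined || typeof node === "string") return node;
    const index = node["_index"];
    return typeof index === "string" ? index : undefined;
};

// Localized crumbs -> canonical i18n keys (stand-in for findCanonicalPath).
const toCanonical = (slugs: SlugTree, crumbs: readonly string[]): string[] => {
    const path: string[] = [];
    let node: SlugNode = slugs;
    for (let i = 0; i < crumbs.length; i++) {
        const entry: [string, SlugNode] | undefined = typeof node === "string"
            ? undefined
            : Object.entries(node).find(([key, child]) => key !== "_index" && slugOf(child) === crumbs[i]);
        // Unknown crumb: keep it and everything after it untouched.
        if (entry === undefined) return path.concat(crumbs.slice(i));
        path.push(entry[0]);
        node = entry[1];
    }
    return path;
};

// Canonical i18n keys -> localized crumbs (stand-in for localizePath).
const fromCanonical = (slugs: SlugTree, keys: readonly string[]): string[] => {
    const path: string[] = [];
    let node: SlugNode = slugs;
    for (let i = 0; i < keys.length; i++) {
        const key = keys[i];
        const child: SlugNode | undefined =
            typeof node !== "string" && key !== "_index" && Object.prototype.hasOwnProperty.call(node, key)
                ? node[key]
                : undefined;
        const slug = slugOf(child);
        // Unknown key: keep it and everything after it untouched.
        if (child === undefined || slug === undefined) return path.concat(keys.slice(i));
        path.push(slug);
        node = child;
    }
    return path;
};

// Translate a localized path from one locale to another via the canonical keys.
const translateCrumbs = (from: SlugTree, to: SlugTree, crumbs: readonly string[]): string[] =>
    fromCanonical(to, toCanonical(from, crumbs));

// Build a Link header value with one correctly localized alternate URL per locale.
const buildLinkHeader = (
    origin: string,
    slugsByLocale: Record<string, SlugTree>,
    currentLocale: string,
    crumbs: readonly string[],
): string => {
    // Unknown locales fall back to an empty tree, so the crumbs are treated as already canonical.
    const from: SlugTree = slugsByLocale[currentLocale] ?? {};
    return Object.entries(slugsByLocale)
        .map(([locale, to]) => {
            const localized = translateCrumbs(from, to, crumbs);
            return `<${[origin, locale, ...localized].join("/")}>; rel="alternate"; hreflang="${locale}"`;
        })
        .join(", ");
};
```

For a request to /tr/gunluk/liste/4/8, `buildLinkHeader("https://example.com", { en: enSlugs, tr: trSlugs }, "tr", ["gunluk", "liste", "4", "8"])` would point the English alternate at /en/diary/list/4/8 while leaving the trailing route parameters alone.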

Here's how I integrated it in the locale switcher:

/*************************
 * src/lib/components/LocaleSwitcher.svelte
 *************************/

// Inside the <script lang="ts"> tag:
const switchLocale = async (newLocale: Locales, updateHistoryState = true) => {
    if (!newLocale || $locale === newLocale) return
    const oldLocale = $locale;
    // load new dictionary from server
    await loadLocaleAsync(newLocale)
    // select locale
    setLocale(newLocale)
    // update `lang` attribute
    document.querySelector('html').setAttribute('lang', newLocale)
    const [, , ...crumbs] = location.pathname.split("/");
    if (updateHistoryState) {
        // update url to reflect locale changes; the leading "/" keeps the
        // new URL from being resolved relative to the current page
        history.pushState(
            { locale: newLocale },
            '',
            '/' + [
                newLocale,
                ...localizePath(
                    loadedLocales[newLocale].slugs,
                    findCanonicalPath(loadedLocales[oldLocale].slugs, crumbs)
                )
            ].join("/")
        )
    }
}

Of course, we need to update the src/routes/__layout.svelte redirector:

/*************************
 * src/routes/__layout.svelte
 *************************/

// Inside the <script context="module" lang="ts"> tag, in the load function:

// redirect to base locale if language is not present
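if (!locales.includes(lang)) {
    // `url` is assumed to be taken from the load function's input;
    // `location` is not available while rendering on the server.
    const [, , ...crumbs] = url.pathname.split("/");
    return {
        status: 302,
        // the leading "/" keeps the redirect from being resolved relative to the current path
        redirect: "/" + [baseLocale, ...crumbs].join("/"),
    };
}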

Conclusion

All of this would work nicely, if only SvelteKit rendered the response using the url.pathname property of the event passed to the resolve() callback in the handle() hook.

Here's the definition of the resolve() function.
https://github.com/sveltejs/kit/blob/88f8682/packages/kit/src/runtime/server/index.js#L173-L290

It uses the render_page() internal function to render routes.
https://github.com/sveltejs/kit/blob/88f8682/packages/kit/src/runtime/server/index.js#L227
(And render_endpoint() for endpoints, but we'll just ignore that.)

As can be seen, the route passed into the render_page() is determined outside the resolve() function, even before the event object is constructed!
https://github.com/sveltejs/kit/blob/88f8682/packages/kit/src/runtime/server/index.js#L81

Basically, SvelteKit uses the original request URL, not the one we pass into the resolve() callback function. I think this is not the intended behaviour, as practically the resolve() function should be doing the rendering, which it is not doing.

We could file an issue in the SvelteKit issue tracker, but until this gets resolved, this approach is not possible in SvelteKit.

What do you think?

Hi @emreozcan, thanks for your detailed response and the code snippets that demonstrate what such a solution could look like.
The provided solution looks good. You should post your findings in the official Svelte issue that deals with i18n topics: sveltejs/kit#553. Maybe also some others will find it useful.

I'm sorry if it was unclear why I didn't include a solution in this repo initially.
Using the translations file for the routes is one way of doing it, but some other solutions may load all pages and content from a CMS, and there a solution would look very different. There are certainly a lot of ways to implement cross-links, but each of them will only work for its specific use case. Providing a one-size-fits-all solution is hard, so I didn't really bother.

Maybe we should add some general utility functions to the main typesafe-i18n repo that everyone can use to implement their own solution, and also some kind of general step-by-step guide on what is needed to implement those cross-links.