github/cmark-gfm

Multi language page in GFM

shawnthompson opened this issue · 16 comments

Is there a way to markup when creating a multi language page in GFM?

The Government of Canada is working a lot in the open these days but we have to supply all our documentation in both English and French.

See our gc-da11yn.github.io/README.md for an example.

We also have to make all content accessible to all users, including those who use a screen reader, which needs to read the French text on our Markdown pages in the proper language.

I guess we could put a <div lang="fr"> around the text but it would be nice to have a way to do it in markdown.

Ideally we would like to see this change go as far up the Markdown chain as possible. Maybe I should open the issue in the cmark repo.

Thanks for all the great work you all do!

Shawn Thompson, WAS
Web Accessibility Technical Advisor

I'm thinking this is something that just needs to be done in HTML like you mentioned (<div lang="fr">).
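As a rough illustration of the HTML approach, here is a minimal Python sketch that wraps a Markdown fragment in a `lang` div. The helper name `wrap_in_lang_div` is made up for this example. It relies on CommonMark's HTML block rule: a blank line after the opening tag ends the HTML block, so the content inside is still parsed as Markdown.

```python
import html


def wrap_in_lang_div(markdown_text: str, lang: str) -> str:
    """Wrap a Markdown fragment in <div lang="..."> (hypothetical helper).

    The blank lines after the opening tag and before the closing tag keep
    the wrapped content parseable as Markdown rather than raw HTML.
    """
    body = markdown_text.strip("\n")
    safe_lang = html.escape(lang, quote=True)
    return f'<div lang="{safe_lang}">\n\n{body}\n\n</div>\n'


# Build a small bilingual README: English first, French wrapped in the div.
readme = "\n".join([
    "# Project",
    "",
    "English description.",
    "",
    wrap_in_lang_div("## Projet\n\nDescription en français.", "fr"),
])
print(readme)
```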

Just curious what type of syntax would you like to see?

Not sure what type of syntax.

I like the use of : in [links], but how would you use that with blocks of text?

This could be a problem if or when the GitHub interface starts supporting other languages (again). A lot of .md files are in English. If the <html> element for the site is set to another language, the Markdown content will inherit that language and not be read properly.
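To make the inheritance concern concrete, here is a small sketch using Python's standard `html.parser`. It walks a page and reports which language each piece of text ends up with. The class `LangTracker`, the function `effective_langs`, and the sample page are all invented for illustration, and the handling is simplified (it ignores implied end tags and treats an empty `lang=""` as "inherit").

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class LangTracker(HTMLParser):
    """Record (effective_lang, text) for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self._stack = []  # (tag, effective lang) for each open element
        self.runs = []

    def _current_lang(self):
        return self._stack[-1][1] if self._stack else None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # An element's own lang wins; otherwise it inherits from its parent.
        lang = dict(attrs).get("lang") or self._current_lang()
        self._stack.append((tag, lang))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open element, if there is one.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((self._current_lang(), text))


def effective_langs(page_html):
    tracker = LangTracker()
    tracker.feed(page_html)
    tracker.close()
    return tracker.runs


# Hypothetical page: the site chrome is French, the README is English.
page = (
    '<html lang="fr"><body><article>'
    "<p>This README is in English.</p>"
    '<div lang="en"><p>This part is marked as English.</p></div>'
    "</article></body></html>"
)
print(effective_langs(page))
```

In this sample the first paragraph inherits `fr` from `<html>` even though it is English, while the paragraph inside the `lang="en"` div is reported as `en`.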

Please keep in mind it's not only non-sighted users who use a reader. Many other people with different disabilities can benefit from using a reader.

I think if you are looking for accessibility, markdown is the wrong format. There are no aria attributes in the markdown spec and the markdown spec moves much slower than the html spec so any new accessibility attributes won't be adopted before the html spec.

Luckily all html is valid markdown.

I completely agree with you but the lang attribute has been a valid HTML component for quite some time now.

I wouldn't want to put aria in markdown.

lang - HTML: HyperText Markup Language - MDN Web Docs

Extensions to markdown are generally not discussed here btw, but in CommonMark. GFM extends CommonMark. GH isn’t actively adding things to GFM.
But, having some experience with CM/GFM, I don’t see this changing. Markdown is mostly frozen.

I figured as much but I was hoping to get more support from GFM. I saw some talk about supporting Right to Left (RTL) languages on the CM issues but nothing on the actual language of the text.

Maybe I should move this thread to the CommonMark repo.

I found this thread on Abbreviations (and acronyms), which is similar to what I'm asking.

I don't think it was ever implemented so as @wooorm said, Markdown seems frozen.

Added a discussion on CommonMark's board:

Adding lang=“lang” syntax

GitHub should absolutely be able to parse a BCP-47 code (e.g. .fr) when used as a supplementary language-specifier file extension before the type-specifier file extension (usually .md). This meta information should then be used in the generated HTML.

That's interesting indeed. We can use language specific .md files instead of doing multiple languages on the page. Are you saying this is available now?

It's not supported now. It could be cool though.

It might be somewhat complex for GH, though, to make this work on all Markdown files (is 'es.md' Spanish? Is 'this.is.it.md' Italian?)
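A rough Python sketch of what such filename detection could look like, and where it gets ambiguous. This is not something GitHub does. The function `lang_from_filename`, the `known_langs` allowlist, and the simplified tag pattern are all assumptions made for this example.

```python
import re
from pathlib import PurePosixPath
from typing import Optional, Set

# Simplified pattern: a 2-3 letter primary language subtag with an optional
# 2-letter region (e.g. "fr", "fr-CA"). Full BCP 47 tags allow more than this.
_LANG_RE = re.compile(r"^[a-z]{2,3}(-[A-Za-z]{2})?$")


def lang_from_filename(filename: str,
                       known_langs: Optional[Set[str]] = None) -> Optional[str]:
    """Return the language segment of '<stem>.<lang>.<ext>', or None."""
    parts = PurePosixPath(filename).name.split(".")
    # Need at least "<stem>.<lang>.<ext>"; "es.md" has no language segment.
    if len(parts) < 3:
        return None
    candidate = parts[-2]
    if not _LANG_RE.match(candidate):
        return None
    # Optional allowlist of primary subtags the repository actually uses.
    if known_langs is not None and candidate.split("-")[0] not in known_langs:
        return None
    return candidate


print(lang_from_filename("README.fr.md"))                 # fr
print(lang_from_filename("es.md"))                        # None
print(lang_from_filename("this.is.it.md"))                # it (a false positive)
print(lang_from_filename("this.is.it.md", {"en", "fr"}))  # None
```

The pattern alone cannot tell "it" the word from "it" the language, which is why this sketch leans on an explicit allowlist. An opt-in setting per repository would be another way to limit false positives.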

But you can support that on your site?

I'm more interested in the GitHub web interface. We are using 11ty to build our site, so it can easily be done with that.

The lang attribute takes a specific value based on the BCP 47 language code.

I'm thinking what could happen is that if the file name were README.fr.md, GH could add lang="fr" to the <article> element it already creates, giving <article class="markdown-body entry-content container-lg" itemprop="text" lang="fr">.
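On the rendering side, the idea amounts to something like the following Python sketch. It only illustrates the proposal and says nothing about how GitHub actually builds its pages. The function name `wrap_article` is hypothetical, and the class list is copied from the element quoted above.

```python
import html
from typing import Optional

# Class list taken from the <article> element quoted in the comment above.
ARTICLE_CLASS = "markdown-body entry-content container-lg"


def wrap_article(rendered_html: str, lang: Optional[str] = None) -> str:
    """Wrap already-rendered Markdown in the article element.

    The lang attribute is only emitted when a language was detected, so
    files without a language segment render exactly as before.
    """
    lang_attr = f' lang="{html.escape(lang, quote=True)}"' if lang else ""
    return (
        f'<article class="{ARTICLE_CLASS}" itemprop="text"{lang_attr}>'
        f"{rendered_html}</article>"
    )


print(wrap_article("<p>Bonjour</p>", "fr"))
print(wrap_article("<p>Hello</p>"))
```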

In our organization we can promote using .lang.md files instead of doing bilingual README.md files. Stick to using HTML as a last resort in the content.

Yes, this is exactly what I meant. Smart i18n support for Readme files isn’t there yet, but it absolutely should be.

@shawnthompson

GitHub should absolutely be able to parse a BCP-47 code (e.g. .fr) when used as a supplementary language-specifier file extension before the type-specifier file extension (usually .md).

In our organization we can promote using .lang.md files instead of doing bilingual README.md files.

I suggested this in my reply a few days ago. As I mentioned, a lot of publishing tools support it. I never got a clear answer as to what you were trying to do, but from your comment above, I know now it's the GitHub web interface that matters. You have no choice, then, but to use <div lang="fr">, which to be honest is such a simple thing it wouldn't make sense to wait until (and if) GitHub adds some other mechanism. As I mentioned, <div lang="fr"> is in no way a hack.

It would be nice if GitHub added support for file/directory-name-based language indicators. It could do so in a backward-compatible way via settings in .github or other dot-file customization. Even better would be if a standard were established for "assemblies" of structured text content, something I started to work on but had to put on the back burner. I'm willing to pick it back up, and even take on collaborators, if there is interest.

@vassudanagunta I completely agree with you that using <div lang="fr"> is not a hack, and that's the beauty of Markdown: being able to use HTML elements inside it. We are going to add the div to our .md files for now. Hopefully in the future there will be another way to standardize the language of a .md file.

I took a look at your project. It's an interesting idea, not sure if I would be able to contribute but I'll definitely pass it around to some colleagues who work in that field.

Thanks again.

@shawnthompson I'd appreciate that. I have much more than what's visible in that placeholder repo. If someone is interested, I can move it back onto a front burner and publish more.

Right now I'm developing a way for people to extend Markdown, RST or even define their own structured plaintext formats without writing any code. It's called Plain Text Style Sheets.