Language Tags
TimidRobot opened this issue · 1 comments
1) Use IETF BCP 47 language tags instead of ISO 639-2 language codes
The documentation recommends ISO 639-2 language codes:
contributor_covenant/README.md
Lines 60 to 61 in b6b8445
However, I believe those are technically insufficient. Instead, I recommend using an IETF BCP 47 language tag. Thankfully, it is based on ISO 639-2, so the existing codes need no changes. It also provides additional information when needed. For example, if there is a translation into Serbian (ISO 639-2 language code sr), you need to specify whether the Latin or Cyrillic script is used: sr-latn or sr-cyrl.
IETF language tag - Wikipedia:
To distinguish language variants for countries, regions, or writing systems (scripts), IETF language tags combine subtags from other standards such as ISO 639, ISO 15924, ISO 3166-1 and UN M.49.
RFC 5646 - Tags for Identifying Languages provides a public specification.
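To illustrate how such a tag breaks down into subtags, here is a minimal Python sketch. It assumes a simplified language[-script][-region] shape only; the full RFC 5646 grammar also allows variants, extensions, and private-use subtags, which this does not handle. The name split_tag is hypothetical and not part of this project.

```python
import re

# Simplified pattern: language[-script][-region].
# Assumption: this covers only the common cases discussed in this issue,
# not the complete RFC 5646 grammar.
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"              # ISO 639 language subtag
    r"(?:-(?P<script>[A-Za-z]{4}))?"             # ISO 15924 script subtag
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"   # ISO 3166-1 or UN M.49 region
)


def split_tag(tag):
    """Return the subtags present in a simple language tag as a dict."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unsupported or malformed tag: {tag!r}")
    # Keep only the subtags that actually appear in the tag.
    return {name: value for name, value in match.groupdict().items() if value is not None}


print(split_tag("sr"))
print(split_tag("sr-latn"))
print(split_tag("fa-IR"))
```

A bare sr yields only a language subtag, while sr-latn adds the script, which is the extra information the plain language code cannot carry.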
2) Documentation leaves case ambiguous
The configuration file (config.toml (permalink)) currently contains only lowercase language codes, with one exception: fa-IR فارسی (ایران) [Persian (Iran)]. To prevent confusion and unnecessary redirects, I recommend explicitly stating that lowercase language tags should be used.
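A check along these lines could flag tags that break a lowercase-only convention. This is only a sketch: the function name non_lowercase_tags and the sample list are hypothetical, and it does not read the project's actual config.toml. Note that RFC 5646 treats tags as case-insensitive, so lowercasing changes only the spelling, not the meaning.

```python
def non_lowercase_tags(tags):
    """Return the tags that contain any uppercase characters."""
    return [tag for tag in tags if tag != tag.lower()]


# Hypothetical sample; the real list would come from the site configuration.
sample = ["en", "fa-IR", "sr-latn"]
print(non_lowercase_tags(sample))
```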
3) Region vs Script
(I have the least confidence in this last recommendation.) It is my understanding that script codes better serve the global community than region codes (e.g. zh-cn ➡️ zh-hans and zh-tw ➡️ zh-hant).
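If this recommendation were adopted, the change amounts to a small lookup from region-based tags to script-based tags. The sketch below covers only the two Chinese pairs mentioned in this issue; the names REGION_TO_SCRIPT and to_script_tag are hypothetical, and whether the mapping is appropriate is exactly the open question here.

```python
# Assumption: only the two pairs named in this issue are mapped.
REGION_TO_SCRIPT = {
    "zh-cn": "zh-hans",  # Simplified Chinese
    "zh-tw": "zh-hant",  # Traditional Chinese
}


def to_script_tag(tag):
    """Map a region-based tag to its script-based form, if one is listed.

    Tags without a listed mapping are returned lowercased and otherwise unchanged.
    """
    lowered = tag.lower()
    return REGION_TO_SCRIPT.get(lowered, lowered)


print(to_script_tag("zh-CN"))
print(to_script_tag("sr-latn"))
```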
Additional context for 3) Region vs Script: #18419 (Language code is not correct for Chinese) – Django