ruby-i18n/ruby-cldr

Feature request: A way to get the list of locales

Closed · 2 comments

It seems that ruby-cldr figures out the list of locales by iterating over the filenames:

Dir["#{dir}/main/*.xml"].map { |path| path =~ /([\w_-]+)\.xml/ && $1 }

This is fine.

However, users of ruby-cldr cannot do this, since the exported directory also contains non-locale directories (e.g., transforms).
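To illustrate the fragility, here is a rough sketch of what a consumer ends up writing today. It assumes the exported output has one subdirectory per locale next to directories such as transforms; the NON_LOCALE_DIRS constant and the guess_locales helper are hypothetical names, not part of ruby-cldr.

```ruby
# Hypothetical, hand-maintained list of exported directories that are not locales.
# It silently goes stale if the export gains another non-locale directory.
NON_LOCALE_DIRS = %w[transforms].freeze

# Returns the sorted names of all subdirectories of export_dir,
# minus the ones listed in NON_LOCALE_DIRS.
def guess_locales(export_dir)
  Dir.children(export_dir)
     .select { |name| File.directory?(File.join(export_dir, name)) }
     .reject { |name| NON_LOCALE_DIRS.include?(name) }
     .sort
end
```

Any new non-locale directory in a future export would show up in the result as if it were a locale, which is the problem this request is about.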

Create a mechanism that allows users of ruby-cldr to reliably get the list of locales. Perhaps this could be as simple as generating a locales.yml file from Cldr::Export::Data#locales.

This would allow ruby-cldr to do whatever it would like with the file structure, and give confidence to users that they aren't accidentally including non-locales in their list of regions.
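A minimal sketch of what the proposal might look like, assuming the locale list is already available as an array (for example from Cldr::Export::Data#locales). The helper names, the file name, and the flat-list YAML layout are all assumptions made for illustration; ruby-cldr does not currently provide them.

```ruby
require "yaml"

# Exporter side (hypothetical): write the locale names as a flat, sorted YAML list.
def write_locales_file(locales, path)
  File.write(path, locales.map(&:to_s).sort.to_yaml)
end

# Consumer side (hypothetical): read the list back without touching directory names.
def read_locales_file(path)
  YAML.safe_load(File.read(path))
end
```

With a file like this, a consumer would call read_locales_file("locales.yml") and never need to know how the rest of the export is laid out.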


Is there an officially supported way to get the list of locales?

Hey @movermeyer,

I did a little bit of digging into the CLDR data and couldn't find exactly what you're looking for. I have a hunch that's because locales can be represented in a few different ways. At their most specific, locales specify language, script, and region. For example, English spoken in the US can be represented by en, but the full version is really en-Latn-US. The CLDR data set contains both en and en-US (en is what's known as a parent locale of en-US), but omits the script; really, the script is inferred because US English is only ever written in the Latin alphabet.

In other words, CLDR contains support for all three of these locales: en, en-US, and en-Latn-US. So the question is, should ruby-cldr export a resource that lists all of these locales? I'm not sure that's useful, and it would yield a very long list.

You'd probably be better off determining if a certain locale exists algorithmically by combining parent locale data and likely subtags. That's essentially what TwitterCLDR does:

TwitterCldr::Shared::Locale.parse('en').maximize.to_s  # => "en-Latn-US"

Since only a subset of CLDR locales are supported in TwitterCLDR, there's a mechanism for grabbing a supported locale too:

TwitterCldr::Shared::Locale.parse('en').max_supported.to_s  # => "en-US"
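Independently of TwitterCLDR, the general idea can be sketched in a few lines of plain Ruby. This is a simplified illustration, not the CLDR algorithm or TwitterCLDR's implementation: the LIKELY_SUBTAGS and AVAILABLE tables are toy stand-ins for the real CLDR data, the maximize and resolve helpers are hypothetical, and only bare language tags are handled.

```ruby
# Toy stand-ins for CLDR's likely-subtags data and the set of available locales.
LIKELY_SUBTAGS = { "en" => "en-Latn-US", "sr" => "sr-Cyrl-RS" }.freeze
AVAILABLE = %w[en en-US sr sr-Cyrl].freeze

# Expands a bare language tag to language-script-region using the toy table;
# tags missing from the table are returned unchanged.
def maximize(tag)
  LIKELY_SUBTAGS.fetch(tag, tag)
end

# Tries the maximized tag first, then progressively shorter forms,
# and returns the first one found in AVAILABLE (or nil if none match).
def resolve(tag)
  lang, script, region = maximize(tag).split("-")
  candidates = [[lang, script, region], [lang, region], [lang, script], [lang]]
  candidates.map { |parts| parts.compact.join("-") }
            .find { |candidate| AVAILABLE.include?(candidate) }
end

resolve("en")  # => "en-US"
resolve("sr")  # => "sr-Cyrl"
resolve("xx")  # => nil
```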

> So the question is, should ruby-cldr export a resource that lists all of these locales? I'm not sure that's useful, and it would yield a very long list.

That's actually what I was looking for. I'm looking for a way of knowing all the locales (for example, as output by Cldr::Export::Data#locales), without having to parse the directory names and then exclude the non-locale directories like transforms.

I'm trying to avoid the messiness/fragility of parsing file-system names and hard-coding directories to ignore.