twitter/twitter-cldr-rb

TwitterCldr doesn't recognize lowercase locales like 'en-gb'

KL-7 opened this issue · 4 comments

KL-7 commented

TwitterCldr expects 'en-gb' and other locales that include a region name to be capitalized as 'en-GB'. That's not always convenient, and converting lowercase or uppercase strings into this format is a bit tricky. I think we should convert locale names in our resources to lowercase so we can accept locale names with any capitalization.

Agreed :) I wonder if this hasn't bitten us much yet because of Mac OS X's infuriating case-insensitive file system.

KL-7 commented

Oh, right, that explains why I was getting different errors locally and on the server. On the server it was failing while trying to find the resource file and locally it was failing because it couldn't find any data in the file using the lowercase key.
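
To illustrate the local failure, here is a minimal sketch of the key lookup. The `RESOURCE` hash is a made-up stand-in for data loaded from a locale resource file, not the real twitter-cldr structure:

```ruby
# Hypothetical stand-in for the contents of a loaded locale resource file.
# Keys use the canonical CLDR casing, with the region in upper case.
RESOURCE = { :"en-GB" => { :name => "British English" } }.freeze

# Hash keys are compared case-sensitively, so only the exact key matches.
canonical = RESOURCE[:"en-GB"]
lowercase = RESOURCE[:"en-gb"]

puts canonical.inspect
puts lowercase.inspect
```

On a case-insensitive filesystem the file itself is found either way, so the lowercase lookup only fails at this later step.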

KL-7 commented

Alright, I did a little bit of digging in twitter-cldr and ruby-cldr and here are my thoughts. Initially I was going to convert all resources and everything else in twitter-cldr to lower-case locales and then downcase locales that we get from users, but I no longer think that's the best approach.

First, even though the standard doesn't require it, it looks like region codes are most commonly written in upper case, so that should be the default expectation.

Second, going full lower-case makes it tricky to retrieve data from ruby-cldr on a case-sensitive filesystem, because all directories and files in the original CLDR data use upper-case region names. E.g., if you ask ruby-cldr to export resources for "en-gb", it'll find them only on a case-insensitive filesystem, and I'd rather not make that assumption. Besides, although this part is trivial, it requires an additional option in ruby-cldr to downcase all locales during the export (in directory names, hash keys, etc.).

With all that said, I still want to make twitter-cldr user-friendly in terms of locale casing, so I'm going to add a small 'hack' to lib/twitter_cldr.rb that will map lower-case locale names to standard CLDR names, similar to how we map "twitter" locale names to CLDR locale names right now. It'll add a bit more work to our TwitterCldr.convert_locale method, but it will be a small local change in twitter-cldr rather than a whole bunch of changes in both libraries and in all our locale resources.
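
Roughly, the idea could be sketched like this. It is only an illustration: `SUPPORTED_LOCALES`, `LOWERCASE_LOCALES_MAP` and `normalize_locale` are hypothetical names, and the locale list is a small made-up sample rather than the real set of resources:

```ruby
# Hypothetical sample of canonical CLDR locale names (not the real list).
SUPPORTED_LOCALES = [:en, :"en-GB", :"pt-BR", :"zh-Hant"].freeze

# Lookup table from the downcased name to the canonical CLDR name,
# e.g. "en-gb" => :"en-GB". Built once so each lookup is a single hash access.
LOWERCASE_LOCALES_MAP = SUPPORTED_LOCALES.each_with_object({}) { |locale, map|
  map[locale.to_s.downcase] = locale
}.freeze

# Accepts a String or Symbol in any casing and returns the canonical
# Symbol when the locale is known. Unknown locales are returned as a
# Symbol with their casing untouched, so later code can report them.
def normalize_locale(locale)
  LOWERCASE_LOCALES_MAP.fetch(locale.to_s.downcase) { locale.to_sym }
end

puts normalize_locale("en-gb").inspect
puts normalize_locale(:"ZH-HANT").inspect
```

The resources and ruby-cldr keep their canonical names; only the user-supplied value is normalized before any file or hash lookup happens.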

Does it make sense?

Yes, that makes perfect sense, thanks for writing up such a thorough explanation of the problem, @KL-7. It seems much less error-prone to handle casing logic in convert_locale than in a bunch of other, potentially case-insensitive places (i.e. the filesystem).