Not supporting en_001, en_150 or en_US_POSIX locale
mattjgalloway opened this issue · 3 comments
I'm starting to use Boost.Locale in a project and I've hit up against a problem when the system locale is en_001. And I believe it will also be a problem with en_150 and en_US_POSIX as well.
The problem is best described with the following test case:
#include <boost/locale.hpp>
#include <iostream>
int main(int argc, char** argv) {
boost::locale::generator gen;
std::locale loc(gen(""));
std::locale::global(loc);
std::cout.imbue(loc);
std::cout << "LOCALE NAME: " << std::use_facet<boost::locale::info>(loc).name() << std::endl;
std::cout << "LOCALE LANG: " << std::use_facet<boost::locale::info>(loc).language() << std::endl;
std::cout << "LOCALE COUNTRY: " << std::use_facet<boost::locale::info>(loc).country()
<< std::endl;
std::cout << "LOCALE ENCODING: " << std::use_facet<boost::locale::info>(loc).encoding()
<< std::endl;
std::cout << "LOCALE UTF8: " << std::use_facet<boost::locale::info>(loc).utf8() << std::endl;
return 0;
}
If the system is in en_001, which on Windows is the "English (World)" region name, then the following will be output:
LOCALE NAME: en_001.UTF-8
LOCALE LANG: en
LOCALE COUNTRY:
LOCALE ENCODING: us-ascii
LOCALE UTF8: 0
I would expect the output to be:
LOCALE NAME: en_001.UTF-8
LOCALE LANG: en
LOCALE COUNTRY: 001
LOCALE ENCODING: utf-8
LOCALE UTF8: 1
It's coming from the fact that in boost::locale::util::locale_data::parse_from_country
, we are assuming the country needs to contain only 'a' to 'z' or 'A' to 'Z'. But en_001
(and en_150
) are valid locales. Probably en_US_POSIX
should be handled separately as it's special.
Hi @mattjgalloway . I'm currently working on getting your PR fixing this into the next release (no worries about the conflicts, I'll resolve those)
I was wondering whether you had any expectation on how en_US_POSIX
is handled?
As far as I've understood this is basically the "C"
locale in C++, aka "POSIX"
. So I think it makes sense to treat it as an alias so the output would be:
LOCALE NAME: C
LOCALE LANG:
LOCALE COUNTRY:
LOCALE ENCODING:
LOCALE UTF8: 0
You could run into this on Linux or when using boost::locale::generator("en_US_POSIX")
as the WinAPI will not return that as the "system locale".
@Flamefire Thanks for coming back on this!
Yes I think that would be right for en_US_POSIX
.
Thanks @Flamefire for getting this merged!