boostorg/locale

Failing tests

Flamefire opened this issue · 5 comments

With ./b2 toolset=gcc-10 libs/locale/test cxxstd=11 -j4 variant=debug I see many failing tests with recent ICU

One of those is "USD 1,345.00" instead of "USD1,345.00", other look more confusing.
See https://github.com/Flamefire/locale/runs/3748013166?check_suite_focus=true or https://github.com/Flamefire/locale/actions in general

@artyom-beilis As you wrote the tests I'm not sure if the tests are wrong or the results.

I need to test myself see what happens.

The test is problematic when ICU changes localization. Somethimes it is ICU bug. I'll check

What is ICU version? I can't find in the logs

On my machine:

/** The current ICU library version as a dotted-decimal string. The patchlevel
 *  only appears in this string if it non-zero.
 *  This value will change in the subsequent releases of ICU
 *  @stable ICU 2.4
 */
#define U_ICU_VERSION "66.1"

Ok, in recent ICU versions (71.1 for me) the word-break after e.g. "Windows7" gets assigned UBRK_WORD_NUMBER only, not number&word as the test expects. The only thing related to this what I found is tc39/proposal-intl-segmenter#80 (comment) where others also observed that any word ending in a number will be classified as a number.

Not sure if that has always been the case. Do you have an ICU version where test_boundary succeeds?

I found a reason why test_formatting is failing sometimes after some refactoring:

case 'c': // DateTile Full
{
if(cache)
return cache->date_time_format(format_len::Medium, format_len::Medium);
return strftime_to_icu_full(
icu::DateFormat::createDateTimeInstance(icu::DateFormat::kFull,icu::DateFormat::kFull,locale),
"yyyy-MM-dd HH:mm:ss"
);
}

I.e. if cache is given, then it used the "medium" length, otherwise the "full" length, see

icu::DateFormat::kMedium,
Also the fallback "yyyy-MM-dd HH:mm:ss" doesn't look like "full" (not sure what that actually is though)

And last but not least I found only 1 callsite of strftime_to_icu_symbol and that doesn't pass the cache hence it is always unset. So it looks like the cache was forgotten to be passed down after explicitly checking it at

{
if( !cache.date_format_[1].isEmpty()
&& !cache.time_format_[1].isEmpty()
&& !cache.date_time_format_[1][1].isEmpty())
{
icu_std_converter<CharType> cvt_(encoding);
std::basic_string<CharType> const &f=info.date_time_pattern<CharType>();
pattern = strftime_to_icu(cvt_.icu(f.c_str(),f.c_str()+f.size()),locale);
}
}

@artyom-beilis Do you remember the intended semantics? I struggle to make sense out of that... Especially the check for (only!) one of the cache entries above. What's its purpose?

I'm referring to failures e.g. seen in https://github.com/boostorg/locale/actions/runs/3074715644/jobs/4967667878

Test: %c
Error in line 452: 
---- [1970-02-05 15:33:13] != [Thursday, February 5, 1970 at 3:33:13 PM Greenwich Mean Time]
Test: %x
Error in line 452: 
---- [1970-02-05] != [Feb 5, 1970]
Test: %X
Error in line 452: 
---- [15:33:13] != [3:33:13 PM]
Test: %c
Error in line 452: 
---- [1970-02-05 15:33:13] != [Thursday, February 5, 1970 at 3:33:13 PM Greenwich Mean Time]
Test: %x
Error in line 452: 
---- [1970-02-05] != [Feb 5, 1970]
Test: %X
Error in line 452: 
---- [15:33:13] != [3:33:13 PM]