tc39/ecma402

Intl.DateTimeFormat does not support 'und' locale

Opened this issue · 10 comments

This seems wrong. Apparently 'und' falls back to 'en', which differs from ICU4C's behavior.

Examples:

Welcome to Node.js v18.19.1.
Type ".help" for more information.
> dt = new Intl.DateTimeFormat('und', {"month":"short","weekday":"narrow","day":"numeric","calendar":"gregory","numberingSystem":"latn"})
DateTimeFormat [Intl.DateTimeFormat] {}
> dt.format()
'T, Apr 30'

> Intl.DateTimeFormat.supportedLocalesOf(["und"])
[]
> Intl.DateTimeFormat.supportedLocalesOf(["und", "en"])
[ 'en' ]
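
For anyone wanting to check this in another engine, here is a minimal sketch that reports how 'und' gets resolved. The helper name `describeUndResolution` is made up for illustration; it only uses `resolvedOptions()` and `supportedLocalesOf()`, and the result depends on the engine and its default locale.

```javascript
// Sketch: report how the current engine resolves the 'und' locale tag.
// describeUndResolution is a hypothetical helper, not part of any API.
function describeUndResolution() {
  const requested = 'und';
  // Locale the engine actually picked for the request.
  const resolved = new Intl.DateTimeFormat(requested).resolvedOptions().locale;
  // Locale used when no locale argument is passed at all.
  const defaultLocale = new Intl.DateTimeFormat().resolvedOptions().locale;
  return {
    requested,
    resolved,
    defaultLocale,
    // True only if the engine lists 'und' among its supported locales.
    supported: Intl.DateTimeFormat.supportedLocalesOf([requested]).length > 0,
    // True if 'und' resolved to the same locale as an absent argument.
    fellBackToDefault: resolved === defaultLocale,
  };
}

console.log(describeUndResolution());
```

If `supported` is false and `fellBackToDefault` is true, the engine is treating 'und' the same way as an omitted locale argument.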
 
sffc commented

I thought that "und" was supported in engines, but I guess not?

CC @anba @FrankYFTang @gibson042 @eemeli

und is not supported in browsers. Supporting it would probably fix some of the use cases of the Stable Formatting proposal, but not all.

The reason is very simple: there are no locale resources defined for "und".
https://github.com/unicode-org/cldr/blob/main/common/main/und.xml
returns a 404.

Also ref https://tc39.es/ecma402/#available-locales-list

sffc commented

The resources for "und" are stored in root.xml in CLDR.

In V8, internally we call

uloc_openAvailableByType(ULOC_AVAILABLE_WITH_LEGACY_ALIASES, &status);

to find out which locales are available. Neither "und" nor "root" is enumerated.

The documentation at
https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uloc_8h.html#a1d61e1cb6a0d2ad60dc3cd78c931e551
says:
"Gets a list of available locales according to the type argument, allowing the user to access different sets of supported locales in ICU."

If "und" and "root" are not reported by ICU as "available locales", then V8 will not treat them as supported.
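
To make the consequence concrete, here is a simplified model of the lookup that `supportedLocalesOf` performs against an engine's available-locales list. This is a sketch, not V8's actual code: it follows the shape of the BestAvailableLocale and LookupSupportedLocales operations in ECMA-402 but skips Unicode extension stripping, and the two locale sets are hypothetical stand-ins for whatever ICU reports.

```javascript
// Simplified model of BestAvailableLocale: truncate subtags from the right
// until the candidate is found in the available set.
function bestAvailableLocale(availableLocales, locale) {
  let candidate = locale;
  while (true) {
    if (availableLocales.has(candidate)) return candidate;
    let pos = candidate.lastIndexOf('-');
    if (pos === -1) return undefined; // nothing left to truncate
    // Also drop a preceding singleton subtag (for example the "u" in "-u-").
    if (pos >= 2 && candidate[pos - 2] === '-') pos -= 2;
    candidate = candidate.slice(0, pos);
  }
}

// Simplified model of LookupSupportedLocales: keep each requested locale
// that has some best-available match.
function lookupSupportedLocales(availableLocales, requestedLocales) {
  return requestedLocales.filter(
    (locale) => bestAvailableLocale(availableLocales, locale) !== undefined
  );
}

// Hypothetical available-locale sets, for illustration only.
const withoutRoot = new Set(['en', 'en-US', 'de']);
const withRoot = new Set([...withoutRoot, 'und']);

console.log(lookupSupportedLocales(withoutRoot, ['und', 'en-GB'])); // [ 'en-GB' ]
console.log(lookupSupportedLocales(withRoot, ['und', 'en-GB']));    // [ 'und', 'en-GB' ]
```

Because the lookup only truncates subtags and never falls back to a root entry, 'und' is reported as supported only if it is literally present in the available set.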

Sorry, I closed this by accident.

sffc commented

I made an upstream issue: https://unicode-org.atlassian.net/browse/ICU-22766

Whether or not ICU decides to start including the root locale in the return value of uloc_openAvailableByType, I think Web engines could decide to include that locale in their own lists of supported locales.

sffc commented

TG2 discussion: https://github.com/tc39/ecma402/blob/main/meetings/notes-2024-08-22.md#intldatetimeformat-does-not-support-und-locale-885

An interesting but potentially unexpected outcome of the discussion was the realization that "und" is defined by BCP-47 as simply an absent locale, so it is not semantically incorrect for ECMA-402 to have the current web reality behavior of making "und" basically an alias for undefined.

We want a way to actually get root behavior, but this might be better handled by the null locale proposal (Stable Formatting).

> In V8, internally we call
>
> uloc_openAvailableByType(ULOC_AVAILABLE_WITH_LEGACY_ALIASES, &status);
>
> to find out which locales are available. Neither "und" nor "root" is enumerated.
>
> The documentation at
> https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uloc_8h.html#a1d61e1cb6a0d2ad60dc3cd78c931e551
> says:
> "Gets a list of available locales according to the type argument, allowing the user to access different sets of supported locales in ICU."
>
> If "und" and "root" are not reported by ICU as "available locales", then V8 will not treat them as supported.

This is not correct. Root is structurally required. The available-locales list is the list to show to users. If the ICU docs don't make that clear, an issue should be filed upstream.

V8 is wrong to filter on the available list and not include root. The better way would be to query ICU for the locale's actual status.

Internally, root is included in the manifest for the locales. I don't remember the details; it's possible root is simply excluded here.

sffc commented

I don't think we should change the Web Reality behavior until TG2 has reached a consensus on this issue, so I don't want V8 or other engines to start doing something different with "und" in the meantime.