Wrong transforms for Cyrillic letters
serggi opened this issue · 2 comments
Hi, thanks for the bundle.
Can you help me understand why Cyrillic letters (I've checked Ukrainian and Russian) aren't transformed as defined here:
https://github.com/unicode-org/cldr/blob/master/common/transforms/Ukrainian-Latin-BGN.xml
$slugGenerator->generate('щ', ['locale' => 'uk'])
gives me s
instead of shch
########################################################################
#
# BGN Page 94 Rule 3.6
#
# шч becomes sh·ch
#
########################################################################
#
ШЧ → SH·CH ; # CYRILLIC CAPITAL LETTER SHA
Шч → Sh·ch ; # CYRILLIC CAPITAL LETTER SHA
шч → sh·ch ; # CYRILLIC SMALL LETTER SHA
Ш} $lower → Sh ; # CYRILLIC CAPITAL LETTER SHA
Ш → SH ; # CYRILLIC CAPITAL LETTER SHA
ш → sh ; # CYRILLIC SMALL LETTER SHA
Щ} $lower → Shch ; # CYRILLIC CAPITAL LETTER SHCHA
Щ → SHCH ; # CYRILLIC CAPITAL LETTER SHCHA
щ → shch ; # CYRILLIC SMALL LETTER SHCHA
Hi @serggi, congrats on your first GitHub issue 🎉🚀
It looks like the SlugGenerator doesn’t find the correct transform for the uk locale. It uses the rule uk-Latn instead of uk-uk_Latn/BGN.
Probably we should add
$locale.'-'.$locale.'_'.$rule,
\Locale::getPrimaryLanguage($locale).'-'.\Locale::getPrimaryLanguage($locale).'_'.$rule,
to slug-generator/src/SlugGenerator.php, lines 287 to 289 in 87a661a.
Or it might be better to handle all script rules like Latn correctly by parsing the locale and setting the script via Intl’s Locale class.
I’m not sure if we can use the /BGN variant for all languages; we would have to test the impact of adding it.
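To make the suggestion above more concrete, here is a rough sketch of how the list of candidate transform IDs could be built. It is not the library's actual code: the function name candidateTransformIds and its $variant parameter are hypothetical, and the order of the candidates is an assumption. It needs ext-intl for \Locale::getPrimaryLanguage().

```php
<?php
// Hypothetical helper, NOT the actual code in SlugGenerator.php: lists the
// transform IDs to try for a locale, most specific first.
function candidateTransformIds(string $locale, string $rule, string $variant = ''): array
{
    // Primary language subtag, e.g. "uk" for "uk_UA" (requires ext-intl).
    $language = \Locale::getPrimaryLanguage($locale);

    // Append the optional variant, e.g. "Latn" + "BGN" gives "Latn/BGN".
    $suffix = $variant === '' ? $rule : $rule.'/'.$variant;

    $candidates = [
        $locale.'-'.$locale.'_'.$suffix,     // e.g. uk-uk_Latn/BGN
        $language.'-'.$language.'_'.$suffix, // same pattern, language only
        $locale.'-'.$rule,                   // e.g. uk-Latn
        $language.'-'.$rule,
    ];

    // Drop duplicates (locale and language are often identical) and reindex.
    return array_values(array_unique($candidates));
}

print_r(candidateTransformIds('uk', 'Latn', 'BGN'));
```

Whether a generated ID actually exists still depends on the ICU data that is installed, so each candidate would have to be checked before it is used.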
Until this issue is fixed in the library itself, you can use the preTransforms option in your project like so:
$slugGenerator->generate('щ', ['locale' => 'uk', 'preTransforms' => ['uk-uk_Latn/BGN']]);
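If the workaround has no effect, it may help to check whether your ICU build knows the transform ID at all. The following is a minimal sketch that uses PHP's Transliterator class directly, independent of the slug generator. It assumes ext-intl is installed, and the variable names are only illustrative.

```php
<?php
// Requires ext-intl. Transliterator::create() returns null when the
// installed ICU data does not know the given transform ID.
$id = 'uk-uk_Latn/BGN';
$transliterator = \Transliterator::create($id);

if ($transliterator === null) {
    echo "ICU does not provide the transform {$id}\n";
} else {
    // Prints the BGN romanization of the letter as ICU produces it.
    echo $transliterator->transliterate('щ'), "\n";
}
```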
Thanks, Martin, for your help and advice. It works well.