[BUG] I18n.l returns incorrectly encoded strings
skonotopovMarketer commented
What I tried to do
I18n.locale = 'nb-NO'
I18n.t('date.day_names')
#=> ["søndag", "mandag", "tirsdag", "onsdag", "torsdag", "fredag", "lørdag"]
I18n.t('date.day_names')[0].codepoints
#=> [115, 248, 110, 100, 97, 103] # 248 is the correct codepoint for "ø"
date_time = DateTime.parse('2024-06-15T15:00:00Z')
#=> Sat, 15 Jun 2024 15:00:00 +0000
time_with_zone = date_time.in_time_zone('Europe/Oslo')
#=> Sat, 15 Jun 2024 17:00:00.000000000 CEST +02:00
time_with_zone.class
#=> ActiveSupport::TimeWithZone
localized_with_date_time = I18n.l(date_time, format: :calendar_list_weekday)
#=> "lørdag"
localized_with_date_time.codepoints
#=> [108, 248, 114, 100, 97, 103] #=> correct
localized_with_time_with_zone = I18n.l(time_with_zone, format: :calendar_list_weekday)
#=> "lørdag"
localized_with_time_with_zone.codepoints
#=> [108, 195, 184, 114, 100, 97, 103] # incorrect; 195 (0xC3) and 184 (0xB8) are "Ã" and "¸", i.e. the two UTF-8 bytes of "ø" read as separate codepoints
localized_with_time_with_zone.encoding
#=> #<Encoding:UTF-8>
# workaround:
localized_with_time_with_zone.force_encoding('UTF-8').codepoints
#=> [108, 248, 114, 100, 97, 103]
What I expected to happen
I18n.l, when used with an ActiveSupport::TimeWithZone, returns a string with exactly the same codepoints as in the locale file.
What actually happened
I18n.l returned a string in which the single character "ø" (U+00F8) is split into two codepoints, 0xC3 and 0xB8, which are the two bytes of its UTF-8 encoding. This can lead to unexpected behavior when the resulting text is compared, searched, or rendered elsewhere.
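For illustration only, here is a minimal plain-Ruby sketch (no I18n or ActiveSupport involved) of one way the reported codepoints can arise. It assumes the bytes of the string are correct and only the encoding label is wrong, which is a guess consistent with the `force_encoding` workaround above, not a confirmed diagnosis of `I18n.l`. The variable names are hypothetical.

```ruby
# Hypothetical illustration; this does not reproduce the I18n.l code path.
expected = "l\u00F8rdag" # "lørdag", encoded as UTF-8

# Same bytes, but labelled as binary (ASCII-8BIT): each byte is then
# reported as its own codepoint.
mislabelled = expected.b

p expected.codepoints                 # "ø" is the single codepoint 248
p mislabelled.codepoints              # "ø" appears as 195 and 184
p mislabelled.bytes == expected.bytes # the underlying bytes are identical

# Relabelling the bytes as UTF-8 (no transcoding) restores the codepoints,
# which mirrors the workaround in the report.
relabelled = mislabelled.dup.force_encoding(Encoding::UTF_8)
p relabelled.codepoints
p relabelled == expected
```

If the bytes themselves had been transcoded twice, `force_encoding` alone would not repair the string, so the fact that the workaround succeeds suggests a labelling problem and not corrupted data.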
Versions of i18n, rails, and anything else you think is necessary
I18n::VERSION
#=> "1.14.5"