open-i18n/rust-unic

Upgrade to Unicode 11.0

Description

Update external data and all modules to Unicode 11.0. For changes in Unicode 11.0, see: https://www.unicode.org/versions/Unicode11.0.0/

Tasks

These tasks are roughly planned out based on the changes in Unicode 11.0 and their potential impact on rust-unic's implementation.

Segmentation

  • Implement new grapheme cluster breaking rules
    • Implement the new Extended_Pictographic property
    • Remove GB10 implementation
    • Implement the new GB11 rule
  • Implement new word breaking rules
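
For reference, Unicode 11.0 drops the E_Base/E_Modifier rule (GB10) and redefines GB11 as `\p{Extended_Pictographic} Extend* ZWJ × \p{Extended_Pictographic}`. Below is a minimal sketch of that check in isolation. The `Class` enum and the `gb11_no_break` function are hypothetical names for illustration only, not rust-unic's or unicode-segmentation's actual API. It also assumes each code point has already been mapped to one of these simplified classes.

```rust
/// Simplified per-code-point classes relevant to GB11.
/// Hypothetical enum for illustration; a real implementation would use the
/// full Grapheme_Cluster_Break property plus Extended_Pictographic.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Class {
    ExtendedPictographic,
    Extend,
    Zwj,
    Other,
}

/// Returns true if GB11 forbids a break between `classes[i - 1]` and
/// `classes[i]`, i.e. the context matches
/// `Extended_Pictographic Extend* ZWJ × Extended_Pictographic`.
fn gb11_no_break(classes: &[Class], i: usize) -> bool {
    // There is no boundary "between" code points at the edges of the slice.
    if i == 0 || i >= classes.len() {
        return false;
    }
    // Right-hand side must be Extended_Pictographic, immediately preceded by ZWJ.
    if classes[i] != Class::ExtendedPictographic || classes[i - 1] != Class::Zwj {
        return false;
    }
    // Walk backwards from the ZWJ over any run of Extend code points.
    let mut j = i - 1; // index of the ZWJ
    while j > 0 && classes[j - 1] == Class::Extend {
        j -= 1;
    }
    // The code point before the Extend* run must be Extended_Pictographic.
    j > 0 && classes[j - 1] == Class::ExtendedPictographic
}
```

For example, with the classes `[ExtendedPictographic, Extend, Zwj, ExtendedPictographic]`, `gb11_no_break(&classes, 3)` reports that no break is allowed before the last element. A production implementation would more likely track this as state while iterating forward rather than scanning backwards.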

Emoji

  • Add .rsv table for the new Extended_Pictographic property (the rest of the work is covered by UCD)

UCD

  • [Optional] Implement the new Equivalent_Unified_Ideograph property

IDNA

  • Update IDNA conformance test to the new format

Related Issues, Pull Requests, Forks

unicode-rs/unicode-segmentation#43 is tracking the segmentation algorithm updates. I think we can keep the code in sync with unicode-segmentation, regardless of where the implementation lands first.