unicode-precis

Preparation, Enforcement, and Comparison of Internationalized Strings in Application Protocols

Primary language: Raku. License: Artistic License 2.0.


PRECIS Framework: Preparation, Enforcement, and Comparison of Internationalized Strings in Application Protocols

Many tests are based on the Unicode® database as well as on the Unicode tools from Perl 6. Not all methods and functions are in place; for example, uniprop() is not yet available on the JVM. Perl 6 also seems to be based on Unicode version 8.0.0, with 9.0.0 scheduled, although parts of version 9.0.0 already work. Not yet available on the JVM are uniprop, uniprop-bool, uniprop-int and uniprop-str.

Synopsis

use Unicode::PRECIS;
use Unicode::PRECIS::Identifier::UsernameCaseMapped;

# Create a profile object for case-mapped usernames
my Unicode::PRECIS::Identifier::UsernameCaseMapped $uname-profile .= new;

my Str $username = "نجمة-الصباح";

# enforce() returns a Str when the name is accepted, a Bool otherwise
my TestValue $tv = $uname-profile.enforce($username);
if $tv ~~ Str {
  say "Username $username accepted but converted to $tv";
}
elsif $tv ~~ Bool {
  say "Username not accepted";
}

RFCs and program documentation

Module documentation

Base information for the modules

I started by studying rfc4013 for SASLprep, then recognized that it is a profile based on Stringprep, which is specified in rfc3454. These are obsoleted by rfc7613 and rfc7564 respectively, because they are tied to Unicode version 3.2. The newer RFCs are specified to be independent of any particular Unicode version.

Further needed information

From unicode.org

Perl 6

Perl 6 uses graphemes as the basis for the Str string type. These are the visible entities that show as a single symbol and are counted as such by the Str.chars method. From a string, the Unicode names and numeric values of its characters can be obtained with the methods uniname, uninames, unival and univals, and normal forms can be generated with NFC, NFD, NFKC and NFKD. Furthermore, strings can be encoded to utf-8.
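As a small illustration of the difference between graphemes and code points, the following sketch uses only built-in Str methods and does not depend on this module. The sample string is an arbitrary choice made for this example.

```raku
# "e" followed by COMBINING ACUTE ACCENT: typed as two code points,
# but displayed and counted as a single grapheme.
my Str $s = "e\x[301]";

say $s.chars;                  # number of graphemes
say $s.NFC.elems;              # number of code points in composed form
say $s.NFD.elems;              # number of code points in decomposed form
say $s.uninames;               # Unicode names of the code points in the string
say $s.encode('utf-8').bytes;  # size of the utf-8 encoding in bytes
```

The string counts as one character, while its decomposed normal form holds two code points. PRECIS profiles prescribe a normalization step as part of enforcement, which is why these methods matter here.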

Versions of Perl 6 and MoarVM

This project is tested with the latest Rakudo built on MoarVM, implementing Perl v6.c.

Implementation track

First, the basis of the PRECIS framework will be built. A profile for usernames and passwords follows as soon as possible, since this is my first need. When this functions well enough, other profiles can be added. Much of it is now implemented.

Naming of modules:

  • Unicode::PRECIS using rfc7564
  • Unicode::PRECIS::Identifier using rfc7564
  • Unicode::PRECIS::Identifier::UsernameCaseMapped using rfc7613
  • Unicode::PRECIS::Identifier::UsernameCasePreserved using rfc7613
  • Unicode::PRECIS::Freeform using rfc7564
  • Unicode::PRECIS::Freeform::OpaqueString using rfc7613

Authors

Marcel Timmerman: translation of the modules for Perl 6

Contact

MARTIMM on GitHub