
Haskell implementation of RFC 5051, simple Unicode collation.

Primary language: Haskell. License: BSD 3-Clause "New" or "Revised" (BSD-3-Clause).

rfc5051 - Simple Unicode collation

This library implements the simple, locale-insensitive Unicode collation algorithm described in RFC 5051. Full Unicode collation is available through text-icu, but that is a heavy dependency because it binds to the large ICU C library. rfc5051 may be a better fit when a small dependency matters more than full, locale-aware collation.

Here is a list of lines sorted by the standard sort function from Data.List, which orders strings by code point:

    Abe
    Oeb
    abe
    abé
    oeb
    Ábe
    Äbe
    Ôeb
    ábe
    äbe
    ôeb
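
The ordering above follows from the default Ord instance for String, which compares strings one code point at a time. The sketch below uses only base and prints each word next to its code points, which shows why every ASCII capital sorts before every lowercase letter and why the accented letters come last. The name `names` and the shuffled input order are illustrative choices, not part of the library.

```haskell
import Data.Char (ord)
import Data.List (sort)
import System.IO (hSetEncoding, stdout, utf8)

-- The sample words from this README, in an arbitrary input order.
names :: [String]
names =
  [ "abe", "Ábe", "oeb", "Abe", "abé", "Ôeb"
  , "ábe", "Oeb", "äbe", "Äbe", "ôeb" ]

main :: IO ()
main = do
  -- Force UTF-8 so the accented characters print under any locale.
  hSetEncoding stdout utf8
  -- sort uses the Ord instance for String: plain code point order.
  mapM_ (\w -> putStrLn (w ++ "  " ++ show (map ord w))) (sort names)
```

For example, 'O' is code point 79 and 'a' is 97, so "Oeb" lands before "abe", while 'Á' is 193 and therefore follows every unaccented word.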

Here is the same list sorted by sortBy compareUnicode:

    Abe
    abe
    abé
    Ábe
    ábe
    Äbe
    äbe
    Oeb
    oeb
    Ôeb
    ôeb
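
A minimal usage sketch for the second listing. It rests on two assumptions that should be checked against the installed version: that `compareUnicode` is exported from a module named `Data.RFC5051` (inferred from the data module's name), and that it compares `Text` values. If your release takes `String` instead, drop the `T.pack` conversion and print with `putStrLn`.

```haskell
import Data.List (sortBy)
import Data.RFC5051 (compareUnicode) -- assumed module name and export
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.IO (hSetEncoding, stdout, utf8)

-- The sample words from this README, in code point order.
names :: [T.Text]
names = map T.pack
  [ "Abe", "Oeb", "abe", "abé", "oeb", "Ábe"
  , "Äbe", "Ôeb", "ábe", "äbe", "ôeb" ]

main :: IO ()
main = do
  -- Force UTF-8 so the accented characters print under any locale.
  hSetEncoding stdout utf8
  -- sortBy is stable, so words that compare equal keep their input order.
  mapM_ TIO.putStrLn (sortBy compareUnicode names)
```

Because the comparison ignores case, "Abe" and "abe" compare as equal and stay in their input order. This is why the input order can affect where case variants end up relative to each other.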

The library's data module, Data.RFC5051.UnicodeData, is generated from the data file UnicodeData.txt. To regenerate it, use the Makefile or:

runghc MkUnicodeData.hs > src/Data/RFC5051/UnicodeData.hs