This project has two parts. One is a library that provides access to some of the data contained in the Unicode Character Database through a portable .NET assembly. The other is a small WPF application for inspecting the Unicode code points that make up a given text.
Supported data versions: Unicode 10.0.0 and Emoji 5.0.
Launch the application, then type or paste some text into the text box at the top of the window.
The code points are displayed in the list on the left side. Select one of them to display its associated information in the bottom-right pane.
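
The README does not say how the application splits text into code points. The following is only a minimal sketch of one way to do it with standard .NET APIs. The class and method names (`CodePointEnumeration`, `EnumerateCodePoints`) are hypothetical and are not part of the library.

```csharp
using System;
using System.Collections.Generic;

internal static class CodePointEnumeration
{
    // Hypothetical helper: yields each Unicode code point of a UTF-16 string,
    // combining well-formed surrogate pairs into a single value.
    public static IEnumerable<int> EnumerateCodePoints(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsHighSurrogate(text[i])
                && i + 1 < text.Length
                && char.IsLowSurrogate(text[i + 1]))
            {
                yield return char.ConvertToUtf32(text[i], text[i + 1]);
                i++; // Skip the low surrogate that was just consumed.
            }
            else
            {
                // BMP character, or a lone surrogate returned as-is.
                yield return text[i];
            }
        }
    }

    private static void Main()
    {
        // "A", U+221E INFINITY, and U+1F600 GRINNING FACE.
        foreach (int codePoint in EnumerateCodePoints("A\u221E\U0001F600"))
        {
            Console.WriteLine("U+" + codePoint.ToString("X4"));
        }
        // Prints:
        // U+0041
        // U+221E
        // U+1F600
    }
}
```

Each value produced this way is an `int` code point of the kind that `UnicodeInfo.GetCharInfo` accepts in the example further down.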

UnicodeRadicalStrokeCount.StrokeCount is now of type System.SByte instead of type System.Byte.
Grab the latest version of the package on NuGet: https://www.nuget.org/packages/UnicodeInformation/. Once the library is installed in your project, you will find everything you need in the System.Unicode namespace.
Let's see a simple example:
```csharp
using System;
using System.Text;
using System.Unicode;

namespace Example
{
    internal static class Program
    {
        private static void Main()
        {
            // Use a Unicode output encoding so non-ASCII characters can be written to the console.
            Console.OutputEncoding = Encoding.Unicode;

            PrintCodePointInfo('A');
            PrintCodePointInfo('∞');
            PrintCodePointInfo(0x1F600);
        }

        private static void PrintCodePointInfo(int codePoint)
        {
            // Look up the data stored for this code point.
            var charInfo = UnicodeInfo.GetCharInfo(codePoint);

            // Text to display for the character (usually the character itself).
            Console.WriteLine(UnicodeInfo.GetDisplayText(charInfo));
            Console.WriteLine("U+" + codePoint.ToString("X4"));
            // Fall back to the old name when the code point has no name.
            Console.WriteLine(charInfo.Name ?? charInfo.OldName);
            Console.WriteLine(charInfo.Category);
        }
    }
}
```

This example shows a few uses of the library. It gets the information for a specific code point, asks the library for the text to display for that character (usually the character itself), and prints the character's name and category.
In its current state, the project is written in C# 6, compiles with Roslyn, and targets .NET Framework 4.5. The core of the project, UnicodeInformation.dll, is a portable class library usable in either regular .NET or Windows 8 applications. The library includes a subset of the official Unicode Character Database, stored in a custom file format. The following UCD data is included:
- Name
- General_Category
- Canonical_Combining_Class
- Bidi_Class
- Decomposition_Type
- Decomposition_Mapping
- Numeric_Type (See also kAccountingNumeric/kOtherNumeric/kPrimaryNumeric. Those will set Numeric_Type to Numeric.)
- Numeric_Value
- Bidi_Mirrored
- Unicode_1_Name
- Simple_Uppercase_Mapping
- Simple_Lowercase_Mapping
- Simple_Titlecase_Mapping
- Name_Alias
- Block
- ASCII_Hex_Digit
- Bidi_Control
- Dash
- Deprecated
- Diacritic
- Extender
- Hex_Digit
- Hyphen
- Ideographic
- IDS_Binary_Operator
- IDS_Trinary_Operator
- Join_Control
- Logical_Order_Exception
- Noncharacter_Code_Point
- Other_Alphabetic
- Other_Default_Ignorable_Code_Point
- Other_Grapheme_Extend
- Other_ID_Continue
- Other_ID_Start
- Other_Lowercase
- Other_Math
- Other_Uppercase
- Pattern_Syntax
- Pattern_White_Space
- Quotation_Mark
- Radical
- Soft_Dotted
- STerm
- Terminal_Punctuation
- Unified_Ideograph
- Variation_Selector
- White_Space
- Lowercase
- Uppercase
- Cased
- Case_Ignorable
- Changes_When_Lowercased
- Changes_When_Uppercased
- Changes_When_Titlecased
- Changes_When_Casefolded
- Changes_When_Casemapped
- Alphabetic
- Default_Ignorable_Code_Point
- Grapheme_Base
- Grapheme_Extend
- Grapheme_Link
- Math
- ID_Start
- ID_Continue
- XID_Start
- XID_Continue
- Unicode_Radical_Stroke (This is actually kRSUnicode from the Unihan database)
- Code point cross references extracted from NamesList.txt
NB: The UCD property ISO_Comment will never be included, since it is empty in all recent Unicode versions.
The following properties from the Unihan database are also included:
- kAccountingNumeric
- kOtherNumeric
- kPrimaryNumeric
- kRSUnicode
- kDefinition
- kMandarin
- kCantonese
- kJapaneseKun
- kJapaneseOn
- kKorean
- kHangul
- kVietnamese
- kSimplifiedVariant
- kTraditionalVariant
The project UnicodeInformation.Builder takes care of generating a file named ucd.dat. This file contains Unicode data compressed with .NET's Deflate implementation, and should be embedded in UnicodeInformation.dll at compile time.
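
To illustrate the compression step, here is a minimal sketch of a Deflate round trip using `System.IO.Compression.DeflateStream`. It does not reproduce the actual layout of ucd.dat, which is a custom format not described here. The class name `DeflateSketch`, its methods, and the sample payload are all assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

internal static class DeflateSketch
{
    // Compresses a byte array with Deflate (hypothetical helper, not the builder's code).
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // The DeflateStream must be disposed before reading the result,
            // so that all buffered data is flushed to the underlying stream.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, leaveOpen: true))
            {
                deflate.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // Decompresses a Deflate-compressed byte array.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }

    private static void Main()
    {
        // Placeholder payload; the real file stores data in its own binary format.
        byte[] original = Encoding.UTF8.GetBytes("LATIN CAPITAL LETTER A;Lu");
        byte[] roundTripped = Decompress(Compress(original));
        Console.WriteLine(Encoding.UTF8.GetString(roundTripped));
    }
}
```

In a real build, the compressed bytes would be written to a file and embedded in the assembly, and the library would decompress the same kind of stream when loading its data.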