Doc Version: 1.1.1-20200531
🇺🇸 English | 🇨🇳 简体中文
- Converts predefined User Defined Phrases (UDP) into the formats supported by the Win10 Pinyin IME, the macOS Pinyin IME (plus iOS/iPadOS), and QQPinyin. It also generates HTML and JSON files for further use.
- Takes a `.json` file as input and converts it to the other formats.
# Quick Start
python3 run_parser.py
|-- Phrasers/ # Parser classes that decode a target format into a Python dict and encode a Python dict into a target format.
|   |-- phraser.py # Base class for all phraser classes.
|   |-- jsonphraser.py # Parses `.json` files.
|   |-- tomlphraser.py # Parses `.toml` files.
|   |-- macphraser.py # Parses macOS `.plist` files.
|   |-- msphraser.py # Parses Win10 Pinyin IME `.dat` files.
|   |-- txtphraser.py # Parses QQPinyin `.ini` files.
|   |-- htmlpharser.py # Generates `.html` files.
|   |-- htmlphraser_tpl.py # Template for `.html` file generation.
|-- Phrases/ # User Defined Phrases in JSON or TOML format, used as the input to conversions.
|   |-- UDP-*.json # User Defined Phrases in JSON format.
|   |-- UDP-*.toml # User Defined Phrases in TOML format.
|-- GeneratedUDP/ # Holds the generated files. They can be deleted at any time and regenerated.
|-- user_defined_phraser.py # Main entry point. Converts `.json` or `.toml` files to other formats.
- Every phrase, both as a Python dict and in JSON, has this form:
{ 'phrase': "<PHRASE>", 'shortcut': "<SHORTCUT>" }
- `*Phraser` classes provide `to_file()`, `from_file()`, `to_format*()`, and `from_format*()` functions. They read and write files and formatted strings, respectively.
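To make that interface concrete, here is a minimal sketch of what a JSON phraser could look like. The class name `JsonPhraserSketch` and the unsuffixed `to_format()`/`from_format()` method names are assumptions for illustration only; the repository's real classes may be organized differently.

```python
import json
from pathlib import Path


class JsonPhraserSketch:
    """Hypothetical phraser: converts between a list of phrase dicts and JSON text."""

    def __init__(self, phrases=None):
        # Each item has the shape {'phrase': ..., 'shortcut': ...}.
        self.phrases = list(phrases or [])

    def to_format(self):
        # ensure_ascii=False keeps Chinese characters readable in the output.
        return json.dumps(self.phrases, ensure_ascii=False, indent=2)

    def from_format(self, text):
        # Replace the current phrases with the ones parsed from the JSON text.
        self.phrases = json.loads(text)
        return self.phrases

    def to_file(self, path):
        Path(path).write_text(self.to_format(), encoding="utf-8")

    def from_file(self, path):
        return self.from_format(Path(path).read_text(encoding="utf-8"))
```

Keeping the string conversion (`to_format`/`from_format`) separate from file I/O means each target format only has to implement the string half.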
- System Settings → Time and Languages → Region and Languages → Chinese → Preferences → Microsoft Pinyin → Preferences
- Lexicon and self-learning → Add or Edit User Defined Phrases → Clear
- System Settings → Time and Languages → Region and Languages → Chinese → Preferences → Microsoft Pinyin → Preferences
- Lexicon and self-learning → Add or Edit User Defined Phrases → Import `UserDefinedPhrase.dat`
- File suffix: `.dat` or `.lex`.
- The file uses the `mschxudp` format, which may change with system updates.
# win10 1703
#           proto8                   unknown_X   version
# 00000000  6d 73 63 68 78 75 64 70  02 00 60 00 01 00 00 00  |mschxudp..`.....|
#           phrase_offset_start
#                       phrase_start phrase_end  phrase_count
# 00000010  40 00 00 00 48 00 00 00  98 00 00 00 02 00 00 00  |@...H...........|
#           timestamp
# 00000020  49 4e 06 59 00 00 00 00  00 00 00 00 00 00 00 00  |IN.Y............|
# 00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
#                                                      candidate2
#           phrase_offsets[]         magic_X     phrase_offset2
# 00000040  00 00 00 00 24 00 00 00  10 00 10 00 18 00 06 06  |....$...........|
#           phrase_unknown8_X        pinyin
# 00000050  00 00 00 00 96 0a 99 20  61 00 61 00 61 00 00 00  |....... a.a.a...|
#           phrase                               magic_X
# 00000060  61 00 61 00 61 00 61 00  61 00 00 00 10 00 10 00  |a.a.a.a.a.......|
#                       phrase_unknown8_X
#                 candidate2
#           offset2                              pinyin
# 00000070  1a 00 07 06 00 00 00 00  a6 0a 99 20 62 00 62 00  |........... b.b.|
#                             phrase
# 00000080  62 00 62 00 00 00 62 00  62 00 62 00 62 00 62 00  |b.b...b.b.b.b.b.|
# 00000090  62 00 62 00 62 00 00 00                           |b.b.b...|
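- `proto8`: `'mschxudp'`
- `phrase_offset_start + 4 * phrase_count == phrase_start`
- `phrase_start + phrase_offsets[N]` is the position of entry N's magic value (0x00080008; the dump above shows `10 00 10 00`, i.e. 0x00100010).
- `pinyin` and `phrase`: UTF-16-LE strings.
- `hanzi_offset = 8 + len(pinyin)` (in the dump above the entry header is 16 bytes, so the stored offset equals `16 + len(pinyin)`, counting bytes including the terminator).
- `phrase_offsets[N] + offset + len(phrase) == phrase_offsets[N+1]`
- `candidate2`: the first byte represents the phrase's candidate position.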
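The layout notes above can be checked against the sample bytes with a short reader. This is only a sketch based on the win10 1703 dump shown here: the function names are hypothetical, and the 16-byte entry header (`ENTRY_HEADER_SIZE`) is an assumption read off this particular dump, so other Windows builds may differ.

```python
import struct

# Bytes transcribed from the annotated win10 1703 dump above.
DUMP = bytes.fromhex(
    "6d 73 63 68 78 75 64 70 02 00 60 00 01 00 00 00 "
    "40 00 00 00 48 00 00 00 98 00 00 00 02 00 00 00 "
    "49 4e 06 59 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 24 00 00 00 10 00 10 00 18 00 06 06 "
    "00 00 00 00 96 0a 99 20 61 00 61 00 61 00 00 00 "
    "61 00 61 00 61 00 61 00 61 00 00 00 10 00 10 00 "
    "1a 00 07 06 00 00 00 00 a6 0a 99 20 62 00 62 00 "
    "62 00 62 00 00 00 62 00 62 00 62 00 62 00 62 00 "
    "62 00 62 00 62 00 00 00"
)

# Assumption from this dump: magic(4) + phrase offset(2) + candidate2(2) + unknown(8).
ENTRY_HEADER_SIZE = 16


def read_utf16z(data, pos):
    """Read a NUL-terminated UTF-16-LE string that starts at pos."""
    end = pos
    while end + 2 <= len(data) and data[end:end + 2] != b"\x00\x00":
        end += 2
    return data[pos:end].decode("utf-16-le")


def parse_mschxudp(data):
    """Return a list of phrase dicts decoded from a mschxudp byte string."""
    if data[:8] != b"mschxudp":
        raise ValueError("not a mschxudp file")
    offset_start, phrase_start, phrase_end, count = struct.unpack_from("<4I", data, 0x10)
    # Constraint from the notes: the offset table sits directly before the entries.
    if offset_start + 4 * count != phrase_start:
        raise ValueError("unexpected offset table size")
    offsets = struct.unpack_from("<%dI" % count, data, offset_start)
    entries = []
    for off in offsets:
        base = phrase_start + off
        # Skip the 4-byte magic, then read the 2-byte offset of the phrase text.
        (phrase_off,) = struct.unpack_from("<H", data, base + 4)
        entries.append({
            "phrase": read_utf16z(data, base + phrase_off),
            "shortcut": read_utf16z(data, base + ENTRY_HEADER_SIZE),
            "position": data[base + 6],  # first byte of candidate2
        })
    return entries


if __name__ == "__main__":
    for entry in parse_mschxudp(DUMP):
        print(entry)
```

The reader locates the phrase through the stored per-entry offset instead of recomputing it from the pinyin length, so it does not depend on which header size a given file uses for that field.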
- System Preferences → Keyboard → Text
- Select any entry, press ⌘A, then click - or press ⌫/delete.
- System Preferences → Keyboard → Text
- Drag the `*.plist` files into the window (one by one).
- Existing phrases are not duplicated.
- A `.plist` file in XML format.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
    <dict>
        <key>phrase</key>
        <string>[word]</string>
        <key>shortcut</key>
        <string>[spell]</string>
    </dict>
    <dict>
        <key>phrase</key>
        <string>[word]</string>
        <key>shortcut</key>
        <string>[spell]</string>
    </dict>
</array>
</plist>
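A property list with this shape can be produced with Python's standard `plistlib` module. The sketch below is not necessarily how `macphraser.py` does it; the helper names `to_plist` and `from_plist` and the sample phrases are made up for illustration.

```python
import plistlib

# Sample phrases in the shared dict format; the values are placeholders.
phrases = [
    {"phrase": "你好", "shortcut": "nh"},
    {"phrase": "谢谢", "shortcut": "xx"},
]


def to_plist(items):
    """Serialize phrase dicts to XML plist bytes (an <array> of <dict>)."""
    # Keys are sorted by default, so 'phrase' is written before 'shortcut'.
    return plistlib.dumps(items, fmt=plistlib.FMT_XML)


def from_plist(data):
    """Parse XML plist bytes back into a list of phrase dicts."""
    return plistlib.loads(data)


if __name__ == "__main__":
    print(to_plist(phrases).decode("utf-8"))
```

Writing the returned bytes to a `.plist` file gives a file of the kind described above.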
- QQPinyin → Settings → Lexicon → User Defined Phrases::Settings
- Multi-select: hold Ctrl and click the entries one by one.
- Click "Delete".
- QQPinyin → Settings → Lexicon → User Defined Phrases::Settings
- Click "Import" and select the `*.txt` file.
- `.txt` format:
[spell]=[position],[word]
[spell]=[position],[word]
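Each line of this format maps directly onto the shared phrase dict, with `[spell]` as the shortcut and `[word]` as the phrase. The following sketch shows one way to convert in both directions. The function names and the default `position=1` are assumptions for illustration, and the file encoding QQPinyin expects is not covered here.

```python
def to_qq_line(entry, position=1):
    """Format one phrase dict as '[spell]=[position],[word]'."""
    return "{}={},{}".format(entry["shortcut"], position, entry["phrase"])


def from_qq_line(line):
    """Parse '[spell]=[position],[word]' into a phrase dict plus its position."""
    shortcut, rest = line.strip().split("=", 1)
    # Split only on the first comma so commas inside the phrase survive.
    position, phrase = rest.split(",", 1)
    return {"phrase": phrase, "shortcut": shortcut, "position": int(position)}


if __name__ == "__main__":
    line = to_qq_line({"phrase": "你好", "shortcut": "nh"})
    print(line)
    print(from_qq_line(line))
```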