
Identification and conversion functions for Chinese text processing

Primary language: Python. License: MIT.

Dragon Mapper


Dragon Mapper is a Python library that provides identification and conversion functions for Chinese text processing.

Features

  • Convert between Chinese characters, Pinyin, Zhuyin, and the International Phonetic Alphabet.
  • Identify a string as Traditional or Simplified Chinese, Pinyin, Zhuyin, or the International Phonetic Alphabet.
  • Output HTML of characters with Pinyin attached to them.
>>> from dragonmapper import hanzi
>>> s = '我是一个美国人。'
>>> hanzi.is_simplified(s)
True
>>> hanzi.to_pinyin(s)
'wǒshìyīgèměiguórén。'
>>> hanzi.to_pinyin(s, all_readings=True)
'[wǒ][shì/shi/tí][yī][gè/ge/gě/gàn][měi][guó][rén/ren]。'
>>> from dragonmapper import transcriptions as trans
>>> s = 'Wǒ shì yīgè měiguórén.'
>>> trans.is_pinyin(s)
True
>>> trans.pinyin_to_zhuyin(s)
'ㄨㄛˇ ㄕˋ ㄧ ㄍㄜˋ ㄇㄟˇ ㄍㄨㄛˊ ㄖㄣˊ.'
>>> trans.pinyin_to_ipa(s)
'wɔ˧˩˧ ʂɨ˥˩ i˥ kɤ˥˩ meɪ˧˩˧ kwɔ˧˥ ʐən˧˥.'
>>> from dragonmapper import transcriptions as trans
>>> from dragonmapper import hanzi
>>> from dragonmapper import html
>>> s = "我是加拿大人"
>>> zh = hanzi.to_zhuyin(s)
>>> pi = trans.zhuyin_to_pinyin(zh).split(' ')
>>> pi
['wǒ', 'shì', 'jiā', 'ná', 'dà', 'rén']
>>> h = html.to_html(s, top=pi)
>>> print(h)
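  • The intermediate conversion to Zhuyin is there because of spacing: the result can be split on spaces to give one reading per character. Alternatively, you can space out the characters yourself.
  • Note: only top is available right now, as browsers do not currently support placing the annotation elsewhere.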
https://s25.postimg.org/4s44wylcv/Screenshot_from_2016_08_03_15_59_03.png
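
The example above pairs each character with one reading before producing HTML. As a sketch of that pairing step, the following self-contained snippet uses HTML `<ruby>` and `<rt>` elements, a common way to place a reading above a character. The `annotate` helper is hypothetical and the markup is an assumption made for illustration; it is not necessarily what `html.to_html` emits.

```python
from html import escape


def annotate(chars, readings):
    """Hypothetical helper: wrap each character in <ruby> with its reading in <rt>."""
    if len(chars) != len(readings):
        raise ValueError("need exactly one reading per character")
    return "".join(
        "<ruby>{}<rt>{}</rt></ruby>".format(escape(c), escape(r))
        for c, r in zip(chars, readings)
    )


# Same characters and readings as the example above.
print(annotate("我是加拿大人", ["wǒ", "shì", "jiā", "ná", "dà", "rén"]))
```

The length check shows why the readings have to be split into one item per character first: if the list and the string are different lengths, the annotations would drift out of alignment.
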

Getting Started