Dragon Masher provides access to Chinese word/character data.

Primary language: Python. License: BSD 2-Clause "Simplified" License (BSD-2-Clause).

Dragon Masher

NOTE: Dragon Masher was never released and is no longer maintained.

Dragon Masher is a Python library that helps you build customizable Chinese-language datasets.

Features

Dragon Masher currently supports the following data sources:

  • CC-CEDICT
  • HSK vocabulary list
  • TOCFL vocabulary list
  • Unihan
  • SUBTLEX-CH word frequency data
  • Leiden Weibo Corpus word frequency data
  • Jun Da character frequency data
  • 《现代汉语常用字表》(Modern Chinese Commonly-used Characters)

Planned features:

  • Command-line interface
  • Zhuyin (Bopomofo) support
  • SUBTLEX-CH character frequency data
  • CJK Decomposition Data
  • HanDeDict
  • CFDICT
  • CC-ChEDICC

You can also easily create your own classes for local or remote data sources that aren't listed above.
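
As a rough illustration of what a custom data source class might look like, here is a minimal sketch. It does not use Dragon Masher's actual API: the names `DataSource`, `CedictTextSource`, `read_lines`, `parse_line`, and `load` are all hypothetical and chosen only for this example. The sketch assumes the standard CC-CEDICT line layout (`Traditional Simplified [pinyin] /definition/definition/`) and reads from an in-memory string; a local file or remote URL reader could take its place by overriding `read_lines`.

```python
import re
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Hypothetical base class; NOT part of Dragon Masher's real API."""

    @abstractmethod
    def read_lines(self):
        """Return an iterable of raw text lines from the source."""

    @abstractmethod
    def parse_line(self, line):
        """Return a dict for one line, or None if the line should be skipped."""

    def load(self):
        """Read every line and collect the records that parse successfully."""
        entries = []
        for line in self.read_lines():
            record = self.parse_line(line)
            if record is not None:
                entries.append(record)
        return entries


# CC-CEDICT line layout: Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/
CEDICT_LINE = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/$")


class CedictTextSource(DataSource):
    """Hypothetical source that parses CC-CEDICT-formatted text held in memory."""

    def __init__(self, text):
        self.text = text

    def read_lines(self):
        return self.text.splitlines()

    def parse_line(self, line):
        line = line.strip()
        # Skip blank lines and the '#' comment lines found in CC-CEDICT files.
        if not line or line.startswith("#"):
            return None
        match = CEDICT_LINE.match(line)
        if match is None:
            return None
        traditional, simplified, pinyin, glosses = match.groups()
        return {
            "traditional": traditional,
            "simplified": simplified,
            "pinyin": pinyin,
            "definitions": glosses.split("/"),
        }


SAMPLE = "# sample data\n中國 中国 [Zhong1 guo2] /China/Middle Kingdom/\n"
entries = CedictTextSource(SAMPLE).load()
```

Splitting the work into `read_lines` and `parse_line` keeps the transport (file, URL, string) separate from the format, so either half can be swapped without touching the other.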

Bug/Issues Tracker

Dragon Masher uses its GitHub Issues page to track bugs, feature requests, and support questions.

License

Dragon Masher is released under the OSI-approved BSD 2-Clause "Simplified" License. See the LICENSE file for more information.