Jamdict

Jamdict is a Python 3 library for manipulating Jim Breen's JMdict, KanjiDic2, JMnedict and kanji-radical mappings.

Documentation: https://jamdict.readthedocs.io/

Main features

  • Supports querying several Japanese language resources
    • Japanese-English dictionary JMdict
    • Kanji dictionary KanjiDic2
    • Kanji-radical and radical-kanji maps KRADFILE/RADKFILE
    • Japanese Proper Names Dictionary (JMnedict)
  • Fast lookup (dictionaries are stored in SQLite databases)
  • Command-line lookup tool (Example)

Contributors are welcome! 🙇 If you want to help, please see the Contributing page.

Try Jamdict out

Jamdict is used in Jamdict-web, a free and open-source web-based Japanese reading assistant. Please try out the demo instance online at:

https://jamdict.herokuapp.com/

There is also a demo Jamdict virtual machine on Repl.it for trying out Jamdict Python code online:

https://replit.com/@tuananhle/jamdict-demo

Installation

Jamdict and the Jamdict database are both available on PyPI and can be installed using pip:

pip install --upgrade jamdict jamdict-data

Sample jamdict Python code

from jamdict import Jamdict
jam = Jamdict()

# use wildcard matching to find anything that starts with 食べ and ends with る
result = jam.lookup('食べ%る')

# print all word entries
for entry in result.entries:
    print(entry)

# [id#1358280] たべる (食べる) : 1. to eat ((Ichidan verb|transitive verb)) 2. to live on (e.g. a salary)/to live off/to subsist on
# [id#1358300] たべすぎる (食べ過ぎる) : to overeat ((Ichidan verb|transitive verb))
# [id#1852290] たべつける (食べ付ける) : to be used to eating ((Ichidan verb|transitive verb))
# [id#2145280] たべはじめる (食べ始める) : to start eating ((Ichidan verb))
# [id#2449430] たべかける (食べ掛ける) : to start eating ((Ichidan verb))
# [id#2671010] たべなれる (食べ慣れる) : to be used to eating/to become used to eating/to be accustomed to eating/to acquire a taste for ((Ichidan verb))
# [id#2765050] たべられる (食べられる) : 1. to be able to eat ((Ichidan verb|intransitive verb)) 2. to be edible/to be good to eat ((pre-noun adjectival (rentaishi)))
# [id#2795790] たべくらべる (食べ比べる) : to taste and compare several dishes (or foods) of the same type ((Ichidan verb|transitive verb))
# [id#2807470] たべあわせる (食べ合わせる) : to eat together (various foods) ((Ichidan verb))

# print all related characters
for c in result.chars:
    print(repr(c))

# 食:9:eat,food
# 喰:12:eat,drink,receive (a blow),(kokuji)
# 過:12:overdo,exceed,go beyond,error
# 付:5:adhere,attach,refer to,append
# 始:8:commence,begin
# 掛:11:hang,suspend,depend,arrive at,tax,pour
# 慣:14:accustomed,get used to,become experienced
# 比:4:compare,race,ratio,Philippines
# 合:6:fit,suit,join,0.1
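
The `%` in the query above reads like a SQL `LIKE` wildcard (any run of characters, including none). As a rough illustration of that matching rule only, and assuming the wildcard does follow `LIKE` semantics, the sketch below runs the same pattern against a throwaway in-memory SQLite table. The table name `words`, its column, the sample rows, and the helper `like_lookup` are all made up for this example; they are not Jamdict's actual schema or API.

```python
import sqlite3

# Toy in-memory table; "words" and "text" are hypothetical names, not Jamdict's schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (text TEXT)")
conn.executemany(
    "INSERT INTO words (text) VALUES (?)",
    [("食べる",), ("食べ過ぎる",), ("食べ物",), ("飲む",)],
)

def like_lookup(pattern):
    """Return every stored word matching a LIKE pattern ('%' = any run of characters)."""
    rows = conn.execute(
        "SELECT text FROM words WHERE text LIKE ? ORDER BY rowid", (pattern,)
    )
    return [text for (text,) in rows]

# Same pattern as the Jamdict query above: starts with 食べ, ends with る.
matches = like_lookup("食べ%る")
print(matches)
```

Only the rows that both start with 食べ and end with る are returned; 食べ物 is skipped because it does not end with る.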

Command line tools

To make sure that jamdict is configured properly, try looking up a word from the command line:

python3 -m jamdict lookup 言語学
========================================
Found entries
========================================
Entry: 1264430 | Kj:  言語学 | Kn: げんごがく
--------------------
1. linguistics ((noun (common) (futsuumeishi)))

========================================
Found characters
========================================
Char: 言 | Strokes: 7
--------------------
Readings: yan2, eon, 언, Ngôn, Ngân, ゲン, ゴン, い.う, こと
Meanings: say, word
Char: 語 | Strokes: 14
--------------------
Readings: yu3, yu4, eo, 어, Ngữ, Ngứ, ゴ, かた.る, かた.らう
Meanings: word, speech, language
Char: 学 | Strokes: 8
--------------------
Readings: xue2, hag, 학, Học, ガク, まな.ぶ
Meanings: study, learning, science

No name was found.

Using KRAD/RADK mapping

Jamdict has built-in support for KRAD/RADK (i.e. kanji-radical and radical-kanji mappings). The radical/component terminology used by Jamdict may differ from that used elsewhere.

  • A radical in Jamdict is a principal component, each character has only one radical.
  • A character may be decomposed into several writing components.

By default jamdict provides two maps:

  • jam.krad is a Python dict that maps each character to a list of its components.
  • jam.radk is a Python dict that maps each available component to the set of characters that contain it.

# Find all writing components (often called "radicals") of the character 雲
print(jam.krad['雲'])
# ['一', '雨', '二', '厶']

# Find all characters with the component 鼎
chars = jam.radk['鼎']
print(chars)
# {'鼏', '鼒', '鼐', '鼎', '鼑'}

# look up the characters info
result = jam.lookup(''.join(chars))
for c in result.chars:
    print(c, c.meanings())
# 鼏 ['cover of tripod cauldron']
# 鼒 ['large tripod cauldron with small']
# 鼐 ['incense tripod']
# 鼎 ['three legged kettle']
# 鼑 []
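
Conceptually, the two maps point in opposite directions: one goes from a character to its components, the other from a component to the characters that contain it. The sketch below shows how a radk-style map could be derived from a krad-style map in plain Python. It is an illustration of the data shape only, not how Jamdict builds its maps. The helper `invert_krad` and the `toy_krad` data are hypothetical; the 雲 entry is copied from the example above, while the 鼐 and 鼒 entries are deliberately partial and list only the 鼎 component.

```python
from collections import defaultdict

# Toy character -> components map in the shape of jam.krad.
# The 雲 entry comes from the example above; the 鼐 and 鼒 entries are
# partial on purpose (only the 鼎 component is listed) and are assumptions.
toy_krad = {
    "雲": ["一", "雨", "二", "厶"],
    "鼐": ["鼎"],
    "鼒": ["鼎"],
}

def invert_krad(krad):
    """Build a component -> set-of-characters map from a character -> components map."""
    radk = defaultdict(set)
    for char, components in krad.items():
        for component in components:
            radk[component].add(char)
    return dict(radk)

toy_radk = invert_krad(toy_krad)
print(sorted(toy_radk["鼎"]))  # characters in the toy data that contain 鼎
print(toy_radk["雨"])          # characters in the toy data that contain 雨
```

Using sets for the values keeps each character unique per component and makes it cheap to intersect several components when searching for a character by its parts.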

Finding name entities

# Find all names with 鈴木 inside
result = jam.lookup('%鈴木%')
for name in result.names:
    print(name)

# [id#5025685] キューティーすずき (キューティー鈴木) : Kyu-ti- Suzuki (1969.10-) (full name of a particular person)
# [id#5064867] パパイヤすずき (パパイヤ鈴木) : Papaiya Suzuki (full name of a particular person)
# [id#5089076] ラジカルすずき (ラジカル鈴木) : Rajikaru Suzuki (full name of a particular person)
# [id#5259356] きつねざきすずきひなた (狐崎鈴木日向) : Kitsunezakisuzukihinata (place name)
# [id#5379158] こすずき (小鈴木) : Kosuzuki (family or surname)
# [id#5398812] かみすずき (上鈴木) : Kamisuzuki (family or surname)
# [id#5465787] かわすずき (川鈴木) : Kawasuzuki (family or surname)
# [id#5499409] おおすずき (大鈴木) : Oosuzuki (family or surname)
# [id#5711308] すすき (鈴木) : Susuki (family or surname)
# ...

Exact matching

Use exact matching for faster search.

Find the word 花火 by idseq (1194580)

>>> result = jam.lookup('id#1194580')
>>> print(result.entries[0])
[id#1194580] はなび (花火) : fireworks ((noun (common) (futsuumeishi)))

Find an exact name 花火 by idseq (5170462)

>>> result = jam.lookup('id#5170462')
>>> print(result.names[0])
[id#5170462] はなび (花火) : Hanabi (female given name or forename)
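
One plausible reason exact matching is faster, given that the dictionaries are stored in SQLite, is that an equality test on an indexed column can use the index, while a pattern with a leading wildcard generally forces a full scan. The sketch below illustrates that general SQLite behaviour on a toy table. The table `entry`, its columns, the index names, and the helper `query_plan` are hypothetical and do not describe Jamdict's internals.

```python
import sqlite3

# Toy schema; table, column, and index names are hypothetical, not Jamdict's.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (idseq INTEGER, text TEXT)")
conn.execute("CREATE INDEX idx_entry_idseq ON entry (idseq)")
conn.execute("CREATE INDEX idx_entry_text ON entry (text)")
conn.execute("INSERT INTO entry VALUES (1194580, '花火')")

def query_plan(sql, params):
    """Return the human-readable detail lines of SQLite's query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # each row is (id, parent, notused, detail)

# Equality on an indexed column versus a LIKE pattern with a leading wildcard.
exact_plan = query_plan("SELECT * FROM entry WHERE idseq = ?", (1194580,))
wildcard_plan = query_plan("SELECT * FROM entry WHERE text LIKE ?", ("%花火%",))
print(exact_plan)
print(wildcard_plan)
```

The exact wording of the plan varies between SQLite versions, but the equality lookup should be reported as an index `SEARCH` and the wildcard lookup as a `SCAN`.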

See jamdict_demo.py and jamdict/tools.py for more information.

Useful links

Contributors