/jdict2db

Create database representations of various Japanese-English dictionaries that are originally provided as XML files.

Primary LanguagePythonOtherNOASSERTION

This project is primarily for personal use and experimentation. For a more 
robust database implementation of dictionaries maintained by the EDRDG,
look into the JMdictDB project:
http://www.edrdg.org/wiki/index.php/JMdictDB_Project

This project creates database representations of various Japanese-English
dictionaries that are originally distributed as XML files.
 
The dictionaries used are the work of Jim Breen. More information can be found
at their respective pages:

KANJIDIC2 page: http://www.csse.monash.edu.au/~jwb/kanjidic2/
JMdict page: http://www.csse.monash.edu.au/~jwb/jmdict.html

License information can be found in the file COPYING.

Presently, KANJIDIC2 and JMdict are supported by this project. JMnedict
(Japanese Proper Names Dictionary) will be added soon.

Requirements
---------------------
python 2.x
sqlalchemy (only tested with 0.6.7)

TODO
---------------------
Insertions are reliant on correct (list) ordering, which could change at any
time. I need to fix this. Performance can be enhanced significantly with proper
use of transactions, but it's a very low priority goal right now.
    
Usage
---------------
The XML dictionary files are automatically downloaded and extracted for you.

Simply run jdict2db/jmdict.py or jdict2db/kanjidic.py with a python2 interpreter.
E.g., from the directory of this file, do:

$ python jdict2db/jmdict.py

to generate a database named jdict.sqlite in the current directory. For
kanjidic, do

$ python jdict2db/kanjidic.py

to generate a database named kanjidic.sqlite in the current directory.