A script to convert the JMdict_e gzip file into a sqlite3 relational database.
- python3
- JMdict_e.gz file (available from edrdg.org)
1. Clone this repository to your computer.
2. Download the latest JMdict_e.gz file (see below).
3. Copy JMdict_e.gz into the JMDict2SQL directory.
4. Run ./setup.sh to create an sqlite3 database (JMdict_e.db)*
5. Find the database file, JMdict_e.db, in the JMdict2SQL directory.
* You might need to run chmod +x setup.sh first.
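Once setup.sh has finished, a quick check can confirm that the database was created and populated with tables. The following is a minimal sketch using Python's standard sqlite3 module; the helper name list_tables is hypothetical and not part of this repository.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite database at db_path."""
    # Note: sqlite3.connect() creates an empty file if db_path does not exist,
    # so an empty result may simply mean the path is wrong.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling list_tables("JMdict_e.db") from the JMdict2SQL directory should return the table names described in the database format section, such as entry, kanji and kana.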
The latest JMdict_e.gz file is available from edrdg.org.
JMdict_e.db is a fully relational database. The format for the database tables is as follows:
- entry [ (PK) id ]
  - id: a unique ID for each entry
- kanji [ (PK) id, (FK) entry_id, value ]
  - id: a unique ID for each kanji
  - entry_id: foreign key from entry table
  - value: kanji value for the entry
- kanji_tags [ (PK) id, (FK) kanji_id, value ]
  - id: a unique ID for each kanji_tag record
  - kanji_id: foreign key from kanji table
  - value: info related to the associated kanji
- kanji_common [ (PK) id, (FK) kanji_id, value ]
  - id: a unique ID for each kanji_common record
  - kanji_id: foreign key from kanji table
  - value: denotes how common a kanji is
- kana [ (PK) id, (FK) entry_id, value, no_kanji ]
  - id: a unique ID for each kana
  - entry_id: foreign key from entry table
  - value: kana value for the entry
  - no_kanji: if 0, the kana is not the true reading of the kanji
- kana_tags [ (PK) id, (FK) kana_id, value ]
  - id: a unique ID for each kana_tag record
  - kana_id: foreign key from kana table
  - value: info related to the associated kana
- kana_common [ (PK) id, (FK) kana_id, value ]
  - id: a unique ID for each kana_common record
  - kana_id: foreign key from kana table
  - value: denotes how common a kana is
- kana_applies_to_kanji [ (PK) id, (FK) kana_id, value ]
  - id: a unique ID for each kana_applies_to_kanji record
  - kana_id: foreign key from kana table
  - value: denotes that the kanji applies to the current kana
- sense [ (PK) id, (FK) entry_id ]
  - id: a unique ID for each sense
  - entry_id: foreign key from entry table
- sense_applies_to_kanji [ (PK) id, (FK) sense_id, value ]
  - id: a unique ID for each sense_applies_to_kanji record
  - sense_id: foreign key from sense table
  - value: denotes that the sense applies to the current kanji
- sense_applies_to_kana [ (PK) id, (FK) sense_id, value ]
  - id: a unique ID for each sense_applies_to_kana record
  - sense_id: foreign key from sense table
  - value: denotes that the sense applies to the current kana
- part_of_speech [ (PK) id, (FK) sense_id, value ]
  - id: a unique ID for each record
  - sense_id: foreign key from sense table
  - value: denotes the part of speech of the sense (e.g. noun, adjective)
- cross_reference [ (PK) id, (FK) sense_id, value ]
  - id: a unique ID for each record
  - sense_id: foreign key from sense table
  - value: references another entry with a similar meaning
- antonym [ (PK) id, (FK) sense_id, value ]
  - id: a unique ID for each record
  - sense_id: foreign key from sense table
  - value: references another entry that is the antonym of the current sense
- field [ (PK) id, (FK) sense_id, value ]
  - id: a unique ID for each record
  - sense_id: foreign key from sense table
  - value: information about the field of application
- misc [ (PK) id, (FK) sense_id, value ]
  - id: a unique ID for each record
  - sense_id: foreign key from sense table
  - value: miscellaneous information about the sense
- sense_info [ (PK) id, (FK) sense_id, value ]
  - id: a unique ID for each record
  - sense_id: foreign key from sense table
  - value: indicates the level of currency of the sense, its regional variations, etc.
- lang_source [ (PK) id, (FK) sense_id, origin, lang, type, wasei ]
  - id: a unique ID for each record
  - sense_id: foreign key from sense table
  - origin: where the entry originates from (can be NULL)
  - lang: the language of the origin
  - type: describes whether the sense fully or partially describes the source word
  - wasei: denotes "Japanese-language expressions based on English words, or parts of word combinations, that do not exist in standard English or whose meanings differ from the words from which they were derived." Check Wasei-eigo.
- dialect [ (PK) id, (FK) sense_id, value ]
  - id: a unique ID for each record
  - sense_id: foreign key from sense table
  - value: the dialect of the entry (Kansai-ben, Hokkaido-ben, etc.)
- definition [ (PK) id, (FK) sense_id, value, lang, type ]
  - id: a unique ID for each record
  - sense_id: foreign key from sense table
  - value: definition of the current entry
  - lang: language of the definition
  - type: denotes literal (lit), figurative (fig), explanation (expl)... of the sense
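To illustrate how the tables relate, here is a minimal sketch of a lookup that joins kanji, kana, sense and definition through their foreign keys. It runs against a small in-memory database so that it is self-contained: the column types in SCHEMA, the sample row and the 'eng' language code are assumptions made for the example, and the helper names (lookup, demo_connection) are hypothetical. Only the table and column names come from the list above. The query also ignores the restrictions stored in kana_applies_to_kanji, sense_applies_to_kanji and sense_applies_to_kana.

```python
import sqlite3

# Assumed subset of the schema; column types are guesses, names follow the list above.
SCHEMA = """
CREATE TABLE entry (id INTEGER PRIMARY KEY);
CREATE TABLE kanji (id INTEGER PRIMARY KEY,
                    entry_id INTEGER REFERENCES entry(id), value TEXT);
CREATE TABLE kana (id INTEGER PRIMARY KEY,
                   entry_id INTEGER REFERENCES entry(id), value TEXT, no_kanji INTEGER);
CREATE TABLE sense (id INTEGER PRIMARY KEY,
                    entry_id INTEGER REFERENCES entry(id));
CREATE TABLE definition (id INTEGER PRIMARY KEY,
                         sense_id INTEGER REFERENCES sense(id),
                         value TEXT, lang TEXT, type TEXT);
"""

# Every reading and definition of the entries that contain a given kanji value.
LOOKUP = """
SELECT kana.value, definition.value
FROM kanji
JOIN kana ON kana.entry_id = kanji.entry_id
JOIN sense ON sense.entry_id = kanji.entry_id
JOIN definition ON definition.sense_id = sense.id
WHERE kanji.value = ?
ORDER BY kana.id, definition.id
"""


def lookup(conn, kanji_value):
    """Return (reading, definition) pairs for entries written with kanji_value."""
    return conn.execute(LOOKUP, (kanji_value,)).fetchall()


def demo_connection():
    """Build an in-memory database holding one made-up sample entry."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO entry (id) VALUES (1)")
    conn.execute("INSERT INTO kanji (id, entry_id, value) VALUES (1, 1, '猫')")
    conn.execute(
        "INSERT INTO kana (id, entry_id, value, no_kanji) VALUES (1, 1, 'ねこ', 0)"
    )
    conn.execute("INSERT INTO sense (id, entry_id) VALUES (1, 1)")
    conn.execute(
        "INSERT INTO definition (id, sense_id, value, lang, type) "
        "VALUES (1, 1, 'cat', 'eng', NULL)"
    )
    return conn
```

To run the same query against the real database, replace demo_connection() with sqlite3.connect("JMdict_e.db") and pass that connection to lookup.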
(ER Diagram created using DbVisualizer)