一箇年 and 堪へる are missing kana_text, causing internal server error
AnyhowStep opened this issue · 2 comments
AnyhowStep commented
Links:
- https://ichi.moe/cl/word/?q=%E4%B8%80%E7%AE%87%E5%B9%B4
- https://ichi.moe/cl/word/?q=%E5%A0%AA%E3%81%B8%E3%82%8B
I'm fairly sure the server is otherwise OK, because other searches work fine.
I looked at the DB dump in the latest release and noticed that those two words are missing kana_text rows.
So there was probably a bug in parsing JMdict.
The entry content clearly shows that there should be kana.
一箇年 (いっかねん)
<?xml version="1.0" encoding="UTF-8"?>
<entry>
  <ent_seq>1161240</ent_seq>
  <k_ele>
    <keb>一箇年</keb>
  </k_ele>
  <r_ele>
    <reb>いっかねん</reb>
    <re_inf>ok</re_inf>
  </r_ele>
  <sense>
    <pos>n</pos>
    <gloss xml:lang="eng">one year</gloss>
  </sense>
</entry>
堪へる (たへる)
<?xml version="1.0" encoding="UTF-8"?>
<entry>
  <ent_seq>2209300</ent_seq>
  <k_ele>
    <keb>堪へる</keb>
  </k_ele>
  <r_ele>
    <reb>たへる</reb>
    <re_inf>ok</re_inf>
  </r_ele>
  <sense>
    <pos>v1</pos>
    <pos>vi</pos>
    <pos>vt</pos>
    <xref>堪える・1</xref>
    <gloss xml:lang="eng">to bear</gloss>
    <gloss xml:lang="eng">to stand</gloss>
    <gloss xml:lang="eng">to endure</gloss>
    <gloss xml:lang="eng">to put up with</gloss>
  </sense>
</entry>
There might be other inconsistencies in the database (for example, entry.n_kana, entry.n_kanji, or kana_text.nokanji being wrong), but I didn't check.
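A check for one such inconsistency could look roughly like the sketch below. It uses an in-memory SQLite database as a stand-in for the real dump, and everything apart from the names entry.n_kana and kana_text is an assumption: the toy schema, the `seq` join column, the n_kana values, and the third entry (1000000 / れい) are made up for illustration and are not taken from the actual ichiran schema or data.

```python
import sqlite3

# Toy stand-in for the real DB dump. The schema is an ASSUMPTION:
# only the names entry.n_kana and kana_text come from this issue.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entry (seq INTEGER PRIMARY KEY, n_kana INTEGER NOT NULL);
CREATE TABLE kana_text (
    id   INTEGER PRIMARY KEY,
    seq  INTEGER NOT NULL,   -- hypothetical link back to entry.seq
    text TEXT NOT NULL
);
-- Two entries from this issue (no kana_text rows) plus one made-up healthy entry.
INSERT INTO entry VALUES (1161240, 1), (2209300, 1), (1000000, 1);
INSERT INTO kana_text (seq, text) VALUES (1000000, 'れい');
""")

# Entries whose recorded n_kana differs from the number of kana_text rows.
MISMATCH_QUERY = """
SELECT e.seq, e.n_kana, COUNT(k.id) AS actual
FROM entry e
LEFT JOIN kana_text k ON k.seq = e.seq
GROUP BY e.seq, e.n_kana
HAVING COUNT(k.id) <> e.n_kana
ORDER BY e.seq
"""

mismatches = conn.execute(MISMATCH_QUERY).fetchall()
for seq, expected, actual in mismatches:
    print(f"entry {seq}: n_kana={expected}, kana_text rows={actual}")
```

With the toy data above, only the two entries without kana_text rows are reported. The same LEFT JOIN / HAVING shape should carry over to the real database if the column names are adjusted.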
tshatrov commented
Thanks for spotting this. It seems that in JMdict the only kana spelling of each of these words is tagged [ok], which means "outdated or obsolete kana usage", and ichiran filters such spellings out. That leaves both entries with no kana at all. I'll add these spellings manually, I guess.
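To find other entries that could be affected in the same way, something like the following sketch might help. It is not ichiran's actual code (ichiran is not written in Python); it only illustrates the condition described above, using the entry content as pasted in this issue, where the tag appears as the literal text `ok`. The function names `readings` and `only_obsolete_readings` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Entry content as quoted in this issue (the re_inf tag appears as plain "ok").
ENTRY_XML = """<entry>
<ent_seq>1161240</ent_seq>
<k_ele><keb>一箇年</keb></k_ele>
<r_ele><reb>いっかねん</reb><re_inf>ok</re_inf></r_ele>
<sense><pos>n</pos><gloss xml:lang="eng">one year</gloss></sense>
</entry>"""


def readings(entry):
    """Return a (reading, set_of_re_inf_tags) pair for every r_ele in the entry."""
    return [
        (r.findtext("reb"), {inf.text for inf in r.findall("re_inf")})
        for r in entry.findall("r_ele")
    ]


def only_obsolete_readings(entry, obsolete_tag="ok"):
    """True if the entry has readings and every one carries the obsolete tag.

    Such an entry ends up with no kana at all once tagged readings are dropped.
    """
    rs = readings(entry)
    return bool(rs) and all(obsolete_tag in tags for _, tags in rs)


entry = ET.fromstring(ENTRY_XML)
if only_obsolete_readings(entry):
    print(entry.findtext("ent_seq"), "would lose all of its kana readings")
```

Running this over every entry in a JMdict file would list the entries that need a manually added spelling, assuming the tag is represented the same way there.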
tshatrov commented
fixed