jehling/jisho_flashcard_scraper

"Usually written using kana alone" Edge Case

Closed this issue · 2 comments

Certain terms in Japanese (あの、どこ、アメリカ) are usually written without kanji. For these terms, Jisho usually changes the normal furigana HTML element to use furigana_justify, with two child elements.

Modify the script to detect this case and distinguish terms that get the normal conversion from terms that should be left as kana.

The sense-tag tag-tag class appears to be consistently tied to the phrase "usually written using kana alone". It may be possible to pivot off this class instead of doing a string compare.

Layout
<span class="supplemental_info"> -> <span class="sense-tag tag-tag">Usually...
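
A minimal sketch of the class-based check, not the code from the fixing commit. It assumes the entry HTML follows the layout above and, as this issue suggests, that the sense-tag tag-tag class pair only appears with the "usually written using kana alone" tag. The names KanaAloneDetector and is_usually_kana are hypothetical, and the sample markup is reconstructed from the layout above rather than copied from Jisho. It uses only the Python standard library.

```python
from html.parser import HTMLParser


class KanaAloneDetector(HTMLParser):
    """Flags HTML that contains a <span class="sense-tag tag-tag"> element."""

    def __init__(self):
        super().__init__()
        self.kana_alone = False

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        # A valueless class attribute yields None, so fall back to "".
        classes = (dict(attrs).get("class") or "").split()
        # Both class names must be present, in any order.
        if "sense-tag" in classes and "tag-tag" in classes:
            self.kana_alone = True


def is_usually_kana(entry_html: str) -> bool:
    """Return True if the entry carries the sense-tag tag-tag span."""
    detector = KanaAloneDetector()
    detector.feed(entry_html)
    detector.close()
    return detector.kana_alone


# Sample markup assumed from the layout described in this issue.
kana_entry = (
    '<span class="supplemental_info">'
    '<span class="sense-tag tag-tag">Usually written using kana alone</span>'
    "</span>"
)
plain_entry = '<span class="supplemental_info"></span>'

print(is_usually_kana(kana_entry))
print(is_usually_kana(plain_entry))
```

Checking the class avoids a string compare against the tag text, at the cost of depending on Jisho keeping these class names. If the class pair turns out to be shared with other tags, the string compare would still be needed as a second check.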

Fixed in commit 941bb7