Not using correct right_id to calculate cost?
BlueGreenMagick opened this issue · 1 comments
Is there a reason this crate only uses a word entry's left_id to calculate cost? left_id() and right_id() both return entry.cost_id, which is the second field in lex.csv, i.e. the left cost id. The third field, which is the right cost id, is not used at all.
lindera/lindera-core/src/word_entry.rs
Lines 34 to 40 in 8c30108
lindera/lindera-unidic-builder/src/unidic_builder.rs
Lines 213 to 228 in 8c30108
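To make the field layout concrete, here is a minimal sketch of reading both IDs from one row. It assumes the MeCab-style column order described above (surface, left id, right id, cost); the function name `parse_context_ids` and the sample row are made up for illustration and are not taken from the lindera builders. It also splits naively on commas, so it would not handle quoted surfaces.

```rust
/// Hypothetical helper (not lindera's API): extract (left_id, right_id, cost)
/// from one lex.csv-style row, assuming the columns are
/// surface, left id, right id, cost, followed by feature columns.
fn parse_context_ids(row: &str) -> Option<(u16, u16, i16)> {
    let mut fields = row.split(',');
    let _surface = fields.next()?; // column 1: surface form (ignored here)
    let left_id: u16 = fields.next()?.trim().parse().ok()?; // column 2
    let right_id: u16 = fields.next()?.trim().parse().ok()?; // column 3
    let cost: i16 = fields.next()?.trim().parse().ok()?; // column 4
    Some((left_id, right_id, cost))
}

fn main() {
    // Made-up row whose left and right IDs differ.
    let row = "example,5142,5146,-1000,some,feature,columns";
    match parse_context_ids(row) {
        Some((left, right, cost)) => {
            println!("left_id={left} right_id={right} cost={cost}")
        }
        None => println!("malformed row"),
    }
}
```

If only the second column is stored, the difference between the two IDs in a row like this is lost.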
@BlueGreenMagick
In IPADIC, the right context ID and the left context ID are the same value, so storing only one of them is a trick to reduce the binary size of the dictionary as much as possible.
Other dictionaries can have different values for the two IDs, so this code should be corrected.
Thanks for your comment. 👍
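For readers following along, a rough sketch of what a corrected lookup could look like. Everything here is hypothetical: `WordEntry`, `ConnectionMatrix`, the row-major layout and the sample costs are assumptions for illustration, not lindera's actual types or data. The one thing taken from the MeCab dictionary model is that the connection cost is looked up with the right context ID of the preceding word and the left context ID of the following word.

```rust
/// Hypothetical word entry that stores both context IDs
/// (not lindera's actual struct).
#[derive(Debug, Clone, Copy)]
struct WordEntry {
    left_id: u16,
    right_id: u16,
    word_cost: i16,
}

/// Hypothetical connection-cost matrix. Assumed layout: row-major,
/// one row per right ID of the preceding word.
struct ConnectionMatrix {
    backward_size: usize, // number of distinct left IDs (row width)
    costs: Vec<i16>,
}

impl ConnectionMatrix {
    /// Cost of placing a word with `next_left_id` after a word with `prev_right_id`.
    fn cost(&self, prev_right_id: u16, next_left_id: u16) -> i32 {
        let idx = prev_right_id as usize * self.backward_size + next_left_id as usize;
        i32::from(self.costs[idx])
    }
}

/// Cost of the lattice edge prev -> next: connection cost plus the word cost of `next`.
fn edge_cost(matrix: &ConnectionMatrix, prev: &WordEntry, next: &WordEntry) -> i32 {
    matrix.cost(prev.right_id, next.left_id) + i32::from(next.word_cost)
}

fn main() {
    // Made-up 2x2 matrix and entries whose left and right IDs differ.
    let matrix = ConnectionMatrix { backward_size: 2, costs: vec![0, 10, 20, 30] };
    let prev = WordEntry { left_id: 0, right_id: 1, word_cost: 5 };
    let next = WordEntry { left_id: 0, right_id: 1, word_cost: 7 };
    println!("edge cost = {}", edge_cost(&matrix, &prev, &next));
}
```

With these made-up numbers, substituting `prev.left_id` for `prev.right_id` would select a different matrix cell, which is why a single shared ID is only safe for dictionaries such as IPADIC where the two values coincide. The trade-off is one extra ID per entry in the dictionary binary.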