WorksApplications/Sudachi

`抑える` is normalized to `押さえる`.

i10416 opened this issue · 2 comments

抑える is normalized to 押さえる. Is this expected behavior?

  • Sudachi 0.7.0
  • sudachi-dictionary-20220729/system_core.dic
{
    "systemDict" : "/dbfs/FileStore/sudachi/system_core.dic",
    "oovProviderPlugin" : [
        { "class" : "com.worksap.nlp.sudachi.SimpleOovProviderPlugin",
          "oovPOS" : [ "名詞", "普通名詞", "一般", "*", "*", "*" ]}
    ]
}

This result is as intended.
In Japanese dictionaries, 「押さえる」 and 「抑える」 are treated as the same word. Since 「押さえる」 is considered the more basic spelling, both are normalized to 「押さえる」 in SudachiDict.

SudachiDict/src/main/text/small_lex.csv:抑える,1080,1080,10832,抑える,動詞,一般,*,*,下一段-ア行,終止形-一般,オサエル,押さえる,453307,A,*,*,*,*
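
To make the row above easier to read, here is a minimal sketch that splits it into named fields. The column positions (surface at index 0, reading at index 11, normalized form at index 12) are an assumption inferred from the 19-field row quoted here, and `parse_lex_row` is a hypothetical helper, not part of Sudachi.

```python
import csv

# Assumed column positions, inferred from the row quoted in this thread.
SURFACE_COL = 0
POS_COLS = slice(5, 11)   # six part-of-speech fields
READING_COL = 11
NORMALIZED_COL = 12


def parse_lex_row(line: str) -> dict:
    """Split one lexicon CSV row into a few named fields (hypothetical helper)."""
    fields = next(csv.reader([line]))
    return {
        "surface": fields[SURFACE_COL],
        "pos": fields[POS_COLS],
        "reading": fields[READING_COL],
        "normalized_form": fields[NORMALIZED_COL],
    }


# The row from small_lex.csv, without the file-name prefix.
ROW = (
    "抑える,1080,1080,10832,抑える,動詞,一般,*,*,"
    "下一段-ア行,終止形-一般,オサエル,押さえる,453307,A,*,*,*,*"
)

entry = parse_lex_row(ROW)
print(entry["surface"], "->", entry["normalized_form"])  # prints: 抑える -> 押さえる
```

Under that assumption, the surface 抑える and the normalized form 押さえる are separate fields of the same entry, which is why the tokenizer reports 押さえる as the normalized form.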

Thank you for the quick response!