Unify similar symbols
iLeonidze opened this issue · 2 comments
iLeonidze commented
Currently there is a problem: Russian sentences contain many characters that look alike but are actually different code points. It would be very nice to be able to convert several similar symbols into a single one.
For example we have symbols:
-
U+002D : HYPHEN-MINUS {hyphen or minus sign}
‐
U+2010 : HYPHEN
‑
U+2011 : NON-BREAKING HYPHEN
–
U+2013 : EN DASH
—
U+2014 : EM DASH
―
U+2015 : HORIZONTAL BAR {quotation dash}
−
U+2212 : MINUS SIGN
which should all be converted to a single
-
U+002D : HYPHEN-MINUS {hyphen or minus sign}
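To make the request concrete, here is a minimal sketch of the conversion in Python. It is only an illustration, not this project's implementation: the names `DASH_LIKE`, `DASH_TABLE` and `unify_dashes` are hypothetical, and the mapping covers only the code points listed above.

```python
# Hypothetical sketch: map every dash-like code point listed in this issue
# to U+002D HYPHEN-MINUS. Names here are illustrative, not project API.
DASH_LIKE = (
    "\u2010"  # HYPHEN
    "\u2011"  # NON-BREAKING HYPHEN
    "\u2013"  # EN DASH
    "\u2014"  # EM DASH
    "\u2015"  # HORIZONTAL BAR
    "\u2212"  # MINUS SIGN
)

# str.maketrans accepts a dict of single characters to replacement strings.
DASH_TABLE = str.maketrans({ch: "-" for ch in DASH_LIKE})


def unify_dashes(text: str) -> str:
    """Return text with all listed dash variants replaced by '-'."""
    return text.translate(DASH_TABLE)


if __name__ == "__main__":
    sample = "кто\u2011то \u2014 что\u2013то"
    print(unify_dashes(sample))
```

A lookup table like this keeps the rule easy to extend to other groups of look-alike characters (quotes, for instance) without changing the function.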
MichaelKohler commented
If we implement #9 in a generic way, I think this could be done through that as well. Do you agree?
iLeonidze commented
Yeah, it seems it will be possible to add a conversion rule via this feature.