Small Strings
etm opened this issue · 0 comments
CharlockHolmes::EncodingDetector.detect_all("Timeout: 2")
results in
{:type=>:text, :encoding=>"IBM424_ltr", :ruby_encoding=>"binary", :confidence=>27, :language=>"he"},
{:type=>:text, :encoding=>"UTF-8", :ruby_encoding=>"UTF-8", :confidence=>15},
....
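In general, the detector seems to try too hard on small strings: it often ranks esoteric (wrong) encodings above obvious ones. Here a 10-byte, plain ASCII string comes back as IBM424_ltr (Hebrew) with confidence 27, ahead of UTF-8 with confidence 15.
Is it possible to tweak this, or is this behavior intended?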
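A possible caller-side workaround, as a minimal sketch only: for short inputs, prefer the UTF-8 candidate whenever the bytes are valid UTF-8, and fall back to the highest-confidence candidate otherwise. The helper name `pick_encoding` and the 64-byte cutoff are assumptions made for illustration; neither is part of charlock_holmes. The candidate list below is copied from the output above so the snippet runs without the gem; in real use it would come from `CharlockHolmes::EncodingDetector.detect_all(str)`.

```ruby
# Hypothetical helper, not part of charlock_holmes.
# The 64-byte cutoff is an arbitrary assumption; tune it for your data.
SHORT_INPUT_BYTES = 64

# Returns one candidate hash from `candidates` (same shape as detect_all's
# results), or nil if the list is empty.
def pick_encoding(str, candidates, short_limit: SHORT_INPUT_BYTES)
  utf8 = candidates.find { |c| c[:encoding] == "UTF-8" }
  # Reinterpret a copy of the bytes as UTF-8 without modifying the caller's string.
  as_utf8 = str.dup.force_encoding(Encoding::UTF_8)

  if utf8 && str.bytesize <= short_limit && as_utf8.valid_encoding?
    # Short input that is valid UTF-8: trust the obvious answer.
    utf8
  else
    # Otherwise keep the detector's ranking by confidence.
    candidates.max_by { |c| c[:confidence] }
  end
end

# Candidates as reported above for "Timeout: 2".
candidates = [
  { type: :text, encoding: "IBM424_ltr", ruby_encoding: "binary", confidence: 27, language: "he" },
  { type: :text, encoding: "UTF-8", ruby_encoding: "UTF-8", confidence: 15 }
]

best = pick_encoding("Timeout: 2", candidates)
puts best[:encoding]
```

This only papers over the ranking on the caller's side; it does not change how the detector scores small strings, so the original question still stands.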