language detection issues
dom1nga commented
"i hate you".language # => "norwegian"
"i hate you so much".language # => "english"
"i love you".language # => "czech"
"kiss me".language # => "finnish"
"talk to me".language # => "italian"
hashwin commented
@Laykou @dom1nga This library is based on textcat, which detects a language from character n-gram statistics rather than from any particular language's dictionary. A very short input gives it too few n-grams to work with, so the result is unreliable in those cases.
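To get a feel for how little signal a short string carries, here is a rough sketch that counts character trigrams. It is only an illustration of the n-gram idea, not this library's actual code; the method name `trigram_counts` and the space-padding convention are assumptions made for the example.

```ruby
# Illustrative only: count character trigrams in a string.
# `trigram_counts` is a hypothetical helper, not part of the library.
def trigram_counts(text)
  # Pad with spaces so word boundaries show up in the trigrams.
  padded = " #{text.downcase.strip} "
  counts = Hash.new(0)
  # Slide a 3-character window across the padded string.
  (0..padded.length - 3).each { |i| counts[padded[i, 3]] += 1 }
  counts
end

# "kiss me" yields only a handful of trigrams to compare against
# each language profile.
p trigram_counts("kiss me").keys
```

With so few trigrams, several language profiles can score almost the same, which would explain guesses like the ones reported above.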
I would suggest trusting the result only if the input text is at least 5 words long, or 10 to be on the safe side.
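One way to apply that rule of thumb is a small guard around the detection call. This is a minimal sketch: `trusted_language` and `MIN_WORDS` are hypothetical names, and the detector is passed in as a block so the example does not assume anything about the library beyond the `String#language` call shown in the report.

```ruby
# Hypothetical threshold taken from the suggestion above; raise it to 10
# to be on the safe side.
MIN_WORDS = 5

# Hypothetical helper: returns nil for inputs that are too short to trust,
# otherwise returns whatever the given detector block returns.
def trusted_language(text, min_words: MIN_WORDS)
  return nil if text.split.size < min_words
  yield text
end

# Usage with the library loaded (assumed, as in the report above):
#   trusted_language("i hate you") { |t| t.language }          # too short, returns nil
#   trusted_language("i hate you so much") { |t| t.language }  # runs the detector
```

Returning `nil` for short input keeps the "no reliable answer" case distinct from a wrong guess, so callers can fall back to a default language.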