A Ruby gem that identifies the languages (scripts) used in a string. "Prose" literally means written or spoken language in its ordinary form, without metrical structure.
gem install prose
require 'prose'
"אודם".prose # will return ['hebrew']
"Ruby".prose # will return ['latin']
"हिन्दी".prose # will return ['devanagari']
"אודם ruby".prose # will return ['hebrew', 'latin']
"لعربية".arabic? # will return true. This will work only for languages identified by Prose
"Peace".latin? # will return true, but .english? will raise an error
"אודם لعربية".hebrew? # will return true, since the string contains Hebrew.
"אודם لعربية".pure_hebrew? # version 0.2.3
"אודם لعربية".hebrew?(pure = true) # version 0.2.2
# Both will return false, since the string contains Arabic as well.
# They will return true only when the string is pure Hebrew.
"אודם لعربية".percentage_of('hebrew')
# will return an integer: the percentage of characters in the text that are Hebrew.
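To make the percentage idea concrete, here is a minimal sketch of how such a figure could be computed in plain Ruby, without the gem. The helper name `script_percentage` is hypothetical and not part of Prose. Ignoring whitespace and rounding to the nearest integer are assumptions, so the gem's own numbers may differ.

```ruby
# Hypothetical helper, not part of the prose gem's API.
# Assumptions: whitespace is ignored, and the result is rounded to an integer.
def script_percentage(text, script_pattern)
  chars = text.gsub(/\s/, '').chars
  return 0 if chars.empty?

  # Count the characters that belong to the requested Unicode script.
  matching = chars.count { |c| c.match?(script_pattern) }
  (matching * 100.0 / chars.size).round
end

# Ruby regular expressions accept Unicode script properties such as \p{Hebrew}.
puts script_percentage("אודם ruby", /\p{Hebrew}/)
```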
Since the gem works on Unicode, it does not always identify the language itself; instead it identifies the script the text is written in. For example, the English alphabet belongs to the Latin script, and Hindi letters belong to Devanagari. CJK is not included yet.
CJK and other symbols will be recognised in the future.
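The script-versus-language distinction can be illustrated with Ruby's built-in Unicode script properties. This is only a sketch of the general technique, not the gem's actual implementation: `SCRIPTS`, `detect_scripts`, and `pure_script?` are hypothetical names, and treating whitespace as neutral is an assumption.

```ruby
# Hypothetical sketch; these names are illustrative, not the prose gem's internals.
SCRIPTS = {
  'hebrew'     => /\p{Hebrew}/,
  'arabic'     => /\p{Arabic}/,
  'latin'      => /\p{Latin}/,
  'devanagari' => /\p{Devanagari}/
}.freeze

# Returns the name of every script that has at least one character in the text.
def detect_scripts(text)
  SCRIPTS.select { |_name, pattern| text.match?(pattern) }.keys
end

# True only when every non-whitespace character belongs to the given script.
# Assumption: whitespace does not count against purity.
def pure_script?(text, name)
  stripped = text.gsub(/\s/, '')
  !stripped.empty? && stripped.match?(/\A#{SCRIPTS.fetch(name)}+\z/)
end

p detect_scripts("אודם ruby")
p pure_script?("אודם لعربية", 'hebrew')
```

Because the check looks only at the script of each character, "Peace" and "Bonjour" would both be reported as Latin; telling English from French would need a different approach.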