/guess-language

Attempts to determine the natural language of a selection of Unicode (utf-8) text (a clone of http://code.google.com/p/guess-language with package metadata)

Primary LanguagePythonGNU Lesser General Public License v2.1LGPL-2.1

guess-language

Attempts to determine the natural language of a selection of Unicode (utf-8) text.

Based on guesslanguage.cpp by Jacob R Rideout for KDE which itself is based on Language::Guess by Maciej Ceglowski. Original repo is at Google Code; repackaged with package metadata here.

Detects over 60 languages; Greek (el), Korean (ko), Japanese (ja), Chinese (zh) and all the languages listed in the trigrams directory.

Install from Github

I recommend using Pip:

$ pip install -e 'git://github.com/dsc/guess-language.git#egg=guess-language'

If you prefer easy_install:

$ git clone git://github.com/dsc/guess-language.git
$ easy_install guess-language

The old-school way also works:

$ git clone git://github.com/dsc/guess-language.git
$ cd guess-language
$ python setup.py install