myint/language-check

Unable to reproduce example usage in documentation

Issue closed · 2 comments

Hello, when I try to reproduce the example usage in the documentation (using Python 2.7.10 via the Anaconda distribution) I encounter an encoding error:

>>> import language_check
>>> tool = language_check.LanguageTool('en-US')
>>> text = 'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy'
>>> matches = tool.check(text)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/afs/crc.nd.edu/user/d/dduhaime/anaconda/lib/python2.7/site-packages/language_check-0.7.2-py2.7.egg/language_check/__init__.py", line 243, in check
    root = self._get_root(self._url, self._encode(text, srctext))
  File "/afs/crc.nd.edu/user/d/dduhaime/anaconda/lib/python2.7/site-packages/language_check-0.7.2-py2.7.egg/language_check/__init__.py", line 253, in _encode
    params = {u'language': self.language, u'text': text.encode(u'utf-8')}
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 41: ordinal not in range(128)

Does anyone have thoughts on what might be going on?

myint commented

text has a smart quote (’) in it, so the string should be specified as Unicode. In Python 3, you don't have to do anything special, since strings are Unicode by default. In Python 2, you need to either use from __future__ import unicode_literals or prefix your string with a u, as in the following:

import language_check
tool = language_check.LanguageTool('en-US')
text = u'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy'
matches = tool.check(text)
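
To illustrate why an .encode() call produced a *decode* error, here is a minimal sketch that runs without language_check installed. It assumes the Python 2 session stored the literal as UTF-8 bytes (typical for a UTF-8 terminal). Python 2 implicitly decodes such a byte string as ASCII before encoding it, and that step fails on the first byte of the smart quote. ensure_text is a hypothetical helper for this illustration, not part of language_check.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals  # has no effect on Python 3


def ensure_text(value, encoding='utf-8'):
    """Hypothetical helper: return `value` as a Unicode string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# The sentence from the report, with the smart quote written as an escape.
text = 'A sentence with a error in the Hitchhiker\u2019s Guide tot he Galaxy'

# Assumption: these are the bytes a plain (non-u) Python 2 literal would hold
# when the source or terminal encoding is UTF-8.
raw = text.encode('utf-8')

# Reproduce the implicit ASCII decode that Python 2 performs on a byte
# string before it can encode it.
try:
    raw.decode('ascii')
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)
print(ensure_text(raw) == text)
```

The reported position matches the traceback above: the smart quote is character 41 of the sentence, and its UTF-8 encoding starts with byte 0xe2. Decoding the bytes explicitly, as ensure_text does, gives the same Unicode string that the u prefix produces, and that string is safe to pass to tool.check.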

myint commented

Thanks for pointing this out.