UnicodeEncodeError in Python 2.7.3
Closed this issue · 2 comments
jag34 commented
Testing with the article:
http://www.reuters.com/article/2014/08/29/us-syria-crisis-obama-strategy-idUSKBN0GS2KT20140829
produced:
  File "summarize.py", line 30, in u
    return codecs.unicode_escape_decode(s)[0]
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 88: ordinal not in range(128)
Current workaround:
return s.encode('ascii', 'replace')
It seems to crash on the following paragraph:
Representative Tom Price, a Georgia Republican, said on Twitter: "President says "we don’t have a strategy yet" to deal with #ISIS.
Kentucky Senator Mitch McConnell, the top Republican in the Senate, said he thought Obama would have “significant congressional support” if he provided a strategic plan to protect the United States and its allies from the Sunni militants.
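One caveat about the `encode('ascii', 'replace')` workaround mentioned above: it avoids the crash, but it is lossy, since every non-ASCII character becomes `?`. A minimal sketch using a fragment of the paragraph (the variable names `sample` and `replaced` are only for illustration):

```python
# A fragment of the failing paragraph, with the curly apostrophe U+2019.
sample = u"we don\u2019t have a strategy yet"

# The workaround: encode to ASCII, substituting '?' for anything
# that does not fit. The result is a byte string, not unicode.
replaced = sample.encode("ascii", "replace")

# The apostrophe is gone for good; it cannot be recovered from '?'.
print(repr(replaced))
```

So the summary would come out with `don?t` instead of `don’t`, and the curly quotes in the second sentence would be affected the same way.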
vgel commented
Hmm, I'm looking into this. I'm not sure why that codecs call is trying to create an ASCII string at all; the whole point is that it returns a unicode string.
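One possible explanation, assuming `s` is already a `unicode` object by the time it reaches `u()`: on Python 2, `codecs.unicode_escape_decode` expects a byte string, so a `unicode` argument is implicitly encoded with the default ASCII codec first, and that step fails on `u'\u2019'`. A minimal sketch of a guard that only decodes byte strings follows. The `text_type` alias is a name introduced here for illustration, and this is not necessarily how the actual fix works:

```python
import codecs

# unicode on Python 2, str on Python 3 (alias introduced for this sketch).
text_type = type(u"")


def u(s):
    """Return s as a unicode string.

    Text that is already unicode is passed through untouched, so no
    implicit ASCII encode can happen. Only byte strings are run through
    the unicode_escape decoder.
    """
    if isinstance(s, text_type):
        return s
    return codecs.unicode_escape_decode(s)[0]
```

With this guard, `u(u"don\u2019t")` returns its argument unchanged, while a byte string such as `b"don\\u2019t"` still has its escape sequence decoded.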
vgel commented
Sorry for the long lead time, but I just committed a fix for this.