bcicen/wikitables

UnicodeEncodeError: 'ascii' codec can't encode character

Closed this issue · 8 comments

I got this UnicodeEncodeError when calling import_tables('List of cities in Italy').
Googling gave this:

http://stackoverflow.com/questions/9942594/unicodeencodeerror-ascii-codec-cant-encode-character-u-xa0-in-position-20

Replacing this line:
https://github.com/bcicen/wikitables/blob/master/wikitables/models.py#L70
with
vals = [ p.value.encode('utf-8') for p in node.params if _is_int(p.name) ]
fixed the issue. Sorry, no PR.
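
For context, here is a minimal sketch of why that change helps. In Python 2, calling str() on a unicode object encodes it with the default ASCII codec, which fails on any non-ASCII character; encoding explicitly as UTF-8 avoids that. The snippet reproduces the same codec behaviour with explicit .encode() calls so it runs on both Python 2 and 3. The sample value is made up for illustration and is not taken from wikitables.

```python
# -*- coding: utf-8 -*-
# Hypothetical cell value containing a non-breaking space (U+00A0),
# the character named in the linked Stack Overflow question.
value = u'Roma\xa0(RM)'

# What Python 2's str(value) does implicitly: encode with the ASCII codec.
try:
    value.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    print('ascii failed: %s' % exc)

# The workaround from this comment: encode explicitly as UTF-8.
encoded = value.encode('utf-8')
print(repr(encoded))
```

Note that the workaround yields byte strings, so any code that later mixes those values with unicode objects can hit the opposite problem on Python 2.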

Actually... you'll want to replace all the uses of str in there...

Thanks @KKostya, updates for Python 2 compatibility have been included in the latest release.

We're not there yet. In 2.7, wikitables 0.3.1 still gives a UnicodeDecodeError.
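
Note that this is a different exception from the one in the issue title. A UnicodeEncodeError comes from turning text into bytes, while a UnicodeDecodeError comes from turning bytes into text. In Python 2 the latter typically shows up when a byte str holding non-ASCII data is implicitly decoded with the ASCII codec, for example when it is combined with a unicode object. A small sketch of that direction follows, using a made-up byte string rather than anything from wikitables; it is not a claim about where the error occurs in 0.3.1.

```python
# -*- coding: utf-8 -*-
# Hypothetical UTF-8 bytes for u'Pomme d\xe9lice' (not taken from wikitables).
raw = u'Pomme d\xe9lice'.encode('utf-8')

# Python 2 does this implicitly when a byte str meets a unicode object.
try:
    raw.decode('ascii')
    error_name = None
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__

# Decoding with the codec the bytes were actually written in works.
text = raw.decode('utf-8')
print(error_name)
```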

@Hunter-Github are you certain it's still a UnicodeDecodeError and not related to your local default encoding? The below works for me on 2.7:

# -*- coding: utf-8 -*-

import sys
from wikitables import import_tables
print(sys.version)
t = import_tables('İtalya\'daki_şehirler_listesi', 'tr') 
for row in t[0].rows:
    print(row)

output:

2.7.12 (default, Jun 28 2016, 08:31:05) 
[GCC 6.1.1 20160602]
{u'Kom\xfcn': Roma, u'B\xf6lge': Dosya:Flag of Lazio.svg Lazio, u'S\u0131ra No.': 1, u'Nuf\xfcs ki\u015fi': 2.748.809, u'\u0130l': Roma ili}
...
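
To rule out the local encoding mentioned above, it may help to print what the interpreter and the terminal report. This is a generic diagnostic sketch using only the standard library; it does not touch wikitables, and the variable names are just for illustration.

```python
import locale
import sys

# Codec used for implicit str/unicode conversions ('ascii' on a stock Python 2).
default_encoding = sys.getdefaultencoding()

# Encoding of the stream that print writes to; may be None when output is piped.
stdout_encoding = getattr(sys.stdout, 'encoding', None)

# Encoding derived from the locale settings (LANG, LC_ALL, ...).
locale_encoding = locale.getpreferredencoding()

print('default: %s' % default_encoding)
print('stdout:  %s' % stdout_encoding)
print('locale:  %s' % locale_encoding)
```

If the stdout or locale value comes back as an ASCII variant, printing rows with non-ASCII text can fail on Python 2 regardless of what the library does.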

Your example works, mine doesn't, and all I changed was the article and the wiki:

   t1 = import_tables(u'List of apple cultivars', u'en')
   t2 = import_tables('List of apple cultivars', 'en')

Thanks @Hunter-Github, I was able to replicate and have published the fixes in v0.3.2

The apples work lovely now, can close the issue :)

great; thanks for the feedback!