Robpol86/terminaltables

encoding error

0x3h opened this issue · 3 comments

0x3h commented

UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 0: unexpected end of data

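from terminaltables import AsciiTable

# Python 2: "\xe9" is a byte string that is not valid UTF-8 on its own.
arr = []
arr.append([1, 2, "\xe9"])
table_data = [['a', 'b', 'c']]
for item in arr:
    table_data.append(item)
table = AsciiTable(table_data)
print "\n%s" % table.table  # UnicodeDecodeError is raised when the table is rendered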

I suggest changing the decode call at width_and_alignment.py#L28 to string.decode('u8', 'ignore').
As a workaround, decoding explicitly, e.g. "\xe9".decode('u8', 'ignore'), also does the job.

"\xe9".decode('u8', 'ignore') returns u'', which has a length of 0 instead of 1. That would lead to broken or misaligned tables.

0x3h commented

You can calculate the length of the non-UTF-8 characters before decoding, then add the missing characters back afterwards as ? or spaces.
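A minimal sketch of that idea, assuming the cell arrives as a byte string. Instead of counting bytes by hand, it leans on Python's built-in "replace" error handler, which substitutes U+FFFD for undecodable input, and then swaps that marker for a visible placeholder. The helper name to_display_text and its placeholder parameter are hypothetical and not part of terminaltables.

```python
def to_display_text(cell, placeholder=u"?"):
    """Hypothetical helper: decode a cell without raising on bad bytes."""
    if isinstance(cell, bytes):
        # "replace" inserts U+FFFD where the input cannot be decoded,
        # so the bad data still occupies space in the result.
        text = cell.decode("utf-8", "replace")
        # Swap the marker for a plain placeholder such as "?" or " ".
        return text.replace(u"\ufffd", placeholder)
    # Numbers and already-decoded text pass through unchanged.
    return cell


row = [1, 2, to_display_text(b"caf\xe9")]
print(row)
```

One caveat with this approach: a legitimate U+FFFD already present in the input would also be turned into the placeholder.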

I think this is out of scope for terminaltables. The exception is "unexpected end of data", which means a bad Unicode string is being fed to terminaltables. It works on Python 3.5 but not on Python 2.7; 3.5 probably handles bad Unicode better.

I think the proper solution here is to have the caller handle bad Unicode data, correcting it in a try/except block before using table.table.
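A rough sketch of what caller-side cleanup could look like. The names clean_cell and clean_rows are hypothetical, and the Latin-1 fallback is only an assumption: the thread does not say what encoding the bad bytes actually came from (0xe9 happens to be "é" in Latin-1).

```python
def clean_cell(cell, fallback_encoding="latin-1"):
    """Hypothetical helper: return decoded text for byte-string cells."""
    if not isinstance(cell, bytes):
        return cell  # numbers and decoded text are left alone
    try:
        return cell.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed fallback; pick whatever encoding the data really uses.
        return cell.decode(fallback_encoding)


def clean_rows(rows):
    """Apply clean_cell to every cell before building the table."""
    return [[clean_cell(cell) for cell in row] for row in rows]


table_data = clean_rows([[b"a", b"b", b"c"], [1, 2, b"\xe9"]])
# table_data now contains only decoded text and numbers and could be
# handed to AsciiTable(table_data) in place of the raw rows.
```

Because Latin-1 maps every byte to a character, the fallback never raises, and each bad byte still counts as one character when column widths are computed.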