Bug when encountering non-ASCII websites

Question

Opened this issue 9 years ago · 1 comments

Try this link: http://www.ctw.cn/article/article_108380.html
Then get page.content.

It looks perfect, doesn't it?
The problem is that page.content is really bytes, not str.
But the Python 3 interpreter believes it's a str, which means you cannot use decode/encode to fix this problem.

I should be able to decode page.content, but Python thinks it's a str, which has no decode method.

The cause may be a Python 2/3 incompatibility.
I fixed this problem by using StringIO and eval, which is really ugly.

My environment:

Windows 10
Python 3.5
PyQt 4 (latest)
Ghost.py (latest)

Sorry, guys.
I didn't read the API carefully.
It was my mistake, not a bug.

Answer 1 · 2015-08-14T20:29:58.000Z

I have to do something like eval(StringIO(page.content).getvalue()).decode('UTF-8')
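
For anyone landing here later: StringIO(page.content).getvalue() should just hand back page.content unchanged, so the workaround boils down to eval(page.content).decode('UTF-8'). Assuming page.content really is a str holding the text of a bytes literal (that is what the eval trick implies; I have not checked it against the Ghost.py source), ast.literal_eval can do the same conversion without running arbitrary code that came from a web page. The helper name decode_bytes_repr and the sample string below are made up for illustration; sample only stands in for page.content.

```python
import ast


def decode_bytes_repr(text, encoding="utf-8"):
    """Decode a str that contains a bytes literal, e.g. "b'\\xe4\\xb8\\xad'".

    Hypothetical helper: assumes `text` is the repr of a bytes object,
    which is what the eval() workaround in this thread implies.
    """
    # literal_eval only parses Python literals, so it cannot execute code.
    value = ast.literal_eval(text)
    if not isinstance(value, bytes):
        raise TypeError("expected a bytes literal, got %s" % type(value).__name__)
    return value.decode(encoding)


# Stand-in for page.content: a str whose text looks like a bytes literal.
sample = repr("中文".encode("utf-8"))

# A Python 3 str has no decode method, which is the error described above.
print(hasattr(sample, "decode"))   # False
print(decode_bytes_repr(sample))   # 中文
```

If page.content turns out to be ordinary text rather than a bytes literal, literal_eval will raise an exception (or the helper raises TypeError), and no conversion is needed in the first place.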