Bug when encountering non-ASCII websites
Deleted user commented
Try this link http://www.ctw.cn/article/article_108380.html
Then get page.content
It looks perfect, doesn't it?
The problem is that page.content holds bytes, not str.
But the Python 3 interpreter believes it is a str, which means you cannot use decode/encode to fix this problem.
I should be able to decode page.content, but Python thinks it is a str, and str has no decode method.
The cause may be a Python 2/3 incompatibility.
I worked around this problem with StringIO and eval, which is really ugly.
My environment:
Windows 10
Python 3.5
PyQt 4 (latest)
Ghost.py (latest)
Sorry, guys.
I didn't read the API carefully.
It was my mistake, not a bug.
Deleted user commented
I have to do something like eval(StringIO(page.content).getvalue()).decode('UTF-8')
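
For anyone who lands here with the same symptom, here is a rough sketch of that workaround without eval. It assumes page.content is a str that contains the text of a bytes literal (something like "b'\xe4\xb8\xad'"), which is what the eval call above implies but is not confirmed anywhere in this thread. The helper name decode_bytes_repr and the sample value are made up for illustration and are not part of Ghost.py.

```python
import ast


def decode_bytes_repr(text, encoding="utf-8"):
    """Decode a str that holds the text of a bytes literal, e.g. "b'\\xe4\\xb8\\xad'"."""
    # ast.literal_eval only parses Python literals, so unlike eval it will
    # not run arbitrary code that happens to be in the page content.
    value = ast.literal_eval(text)
    if not isinstance(value, bytes):
        raise TypeError("expected the text of a bytes literal")
    return value.decode(encoding)


# Stand-in for page.content (assumption): in Python 3, repr() or str() of a
# bytes object gives a str that starts with b' and has no decode method.
sample = repr("中文".encode("utf-8"))
print(decode_bytes_repr(sample))
```

StringIO(page.content).getvalue() just returns the same string it was given, so that step can be dropped.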