jeanphix/Ghost.py

Bug when encountering non-ASCII websites


Try this link http://www.ctw.cn/article/article_108380.html
Then get page.content

[Screenshot: qq]

It looks perfect, doesn't it?
The problem is that page.content holds bytes, not str.
But the Python 3 interpreter believes it is a str, which means you cannot use decode/encode to fix this problem.

I should be able to decode page.content, but Python thinks it is a str, which has no decode method.
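
A minimal sketch of the symptom, assuming page.content is a str that holds the text form of a bytes literal (the eval workaround in this issue suggests so, but that is not confirmed here). The variable `content` is a hypothetical stand-in, not a real Ghost.py object:

```python
# Hypothetical stand-in for page.content: a str containing the repr of UTF-8 bytes.
raw = "中文".encode("utf-8")
content = repr(raw)

# The value looks like bytes when printed, but its type is str,
# and Python 3 str objects have no decode method.
print(type(content).__name__)
print(hasattr(content, "decode"))
```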

The reason may be a Python 2/3 incompatibility.
I worked around this problem by using StringIO and eval, which is really ugly.

My environment:

Windows 10
Python 3.5
PyQt 4 (latest)
Ghost.py (latest)


Sorry, guys.
I didn't read the API carefully.
It was my mistake, not a bug.

I have to do something like eval(StringIO(page.content).getvalue()).decode('UTF-8')
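
For anyone who lands here with the same symptom, a possibly safer variant of that workaround is sketched below. It assumes the content is a str holding a bytes literal, and it swaps eval for ast.literal_eval, which only parses Python literals instead of executing arbitrary code. The StringIO round trip appears to return the string unchanged, so it is left out. The helper name `decode_bytes_repr` is hypothetical and not part of Ghost.py:

```python
import ast


def decode_bytes_repr(text, encoding="utf-8"):
    """Decode a str that contains the text of a bytes literal."""
    value = ast.literal_eval(text)  # parses literals only, unlike eval
    if not isinstance(value, bytes):
        raise TypeError("expected the text of a bytes literal")
    return value.decode(encoding)


# Hypothetical stand-in for page.content.
sample = repr("中文".encode("utf-8"))
print(decode_bytes_repr(sample))
```

If the content ever comes from an untrusted page, avoiding eval matters, since eval would run whatever expression the string contains.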