BoyInTheSun/wks

UnicodeEncodeError: 'latin-1' codec can't encode character ...

Closed this issue · 12 comments

Environment: Windows 11 and Ubuntu 20
Python: 3
The error is as follows:

Download HTML...Traceback (most recent call last):
  File "main.py", line 90, in <module>
    page = urllib.request.urlopen(request)
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/usr/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 1397, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/lib/python3.8/urllib/request.py", line 1354, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/usr/lib/python3.8/http/client.py", line 1256, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1297, in _send_request
    self.putheader(hdr, value)
  File "/usr/lib/python3.8/http/client.py", line 1229, in putheader
    values[i] = one_value.encode('latin-1')
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2026' in position 512: ordinal not in range(256)

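After checking, it looks like a request header contains non-Latin-1 characters (Chinese characters, for example), which causes the encoding error. This should not happen. Please check whether the cookies.txt you provided contains any Chinese characters.

In particular, check whether an ellipsis was included when you copied the cookies.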

Hello. When copying the cookie, should I copy the entire value of the Cookie header? I copied everything and only deleted the leading `Cookie:`!

Yes, that is the right way to do it. '\u2026' is the Unicode character … (horizontal ellipsis), which should not appear in cookies. Please check for it.
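To locate the offending character before sending a request, a check along these lines might help. It is only a sketch: `find_non_latin1` and the sample cookie value are hypothetical and not part of wks. It relies on the fact, visible in the traceback, that `http.client` encodes header values as latin-1, so any code point above 255 fails.

```python
def find_non_latin1(value: str):
    """Return (index, character) pairs that cannot be encoded as latin-1."""
    return [(i, ch) for i, ch in enumerate(value) if ord(ch) > 255]


# Hypothetical cookie value with an ellipsis pasted in by accident.
cookie = "BAIDUID=ABC\u2026; kunlunFlag=1"

for index, ch in find_non_latin1(cookie):
    print(f"position {index}: {ch!r} (U+{ord(ch):04X})")
```

An empty result means the value can be sent as a header without triggering this particular `UnicodeEncodeError`.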

OK, thanks for the pointers. I'll try again. Much appreciated!

Could you take another look for me? I tried again and it still fails.
url:https://wenku.baidu.com/view/0ff393d025fff705cc1755270722192e45365883

Cookie: viewedPg=63b8ce1f85868762caaedd3383c4bb4cf6ecb708%3D3%7C0%26e37de277afaad1f34693daef5ef7ba0d4b736d16%3D3%7C0%26f8e6d67b0875f46527d3240c844769eae109a35a%3D2%7C0; wkview_gotodaily_tip=1; aplugin=uinfo5; kunlunFlag=1; BAIDUID=A56FB6C9E28B28C64B7690D0315CACD4:FG=1; BIDUPSID=D14A6425ECB0B2208664B24751743CF2; PSTM=1641800768; Hm_lvt_d8bfb560f8d03bbefc9bdecafc4a4bf6=1641954725,1641954997,1641955005,1641956128; _click_param_reader_query_ab=-1; layer_show_times_total_1_6807f15f9de6a7fa67e953896b2985a6=3; layer_show_times_total_8_6807f15f9de6a7fa67e953896b2985a6=2; _click_param_pc_rec_doc_2017_testid=5; BDUSS=2NtVmtRWUxqNDRVT3JNa2tIVTRGb2I1dH5tY1d3YzBWUWJ3LXlsbzBtTlVNVHhpRVFBQUFBJCQAAAAAAQAAAAEAAADML9gQyq-80tevtaXV0M34AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAFSkFGJUpBRiT; layer_show_times_total_1_eccb7dec0932e76c3e4c43232ec07595=3; layer_show_times_total_8_eccb7dec0932e76c3e4c43232ec07595=2; layer_show_times_total_1_eb2f2a481276882210a89546db085583=1; layer_show_times_total_8_eb2f2a481276882210a89546db085583=1; layer_show_times_total_5_eb2f2a481276882210a89546db085583=1; layer_show_times_total_1_fa0c98f85484b748dffce484c700633a=3; layer_show_times_total_8_fa0c98f85484b748dffce484c700633a=2; layer_show_times_total_5_fa0c98f85484b748dffce484c700633a=4; RT="z=1&dm=baidu.com&si=smhk1c51u&ss=l28qe4j0&sl=0&tt=0&bcn=https%3A%2F%2Ffclog.baidu.com%2Flog%2Fweirwood%3Ftype%3Dperf&ul=bb00&hd=bb55"

Then can it download normally without cookies?

On my side, using your cookie, the download works with no errors.

Without a cookie, only the first three pages can be downloaded!

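Problem solved.
It may have been an encoding issue: the cookie could not be read correctly from the cookie.txt file, but passing it with the -c option works fine. Thank you for your patient guidance!
Wishing you success at work and good health!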
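For anyone who hits the same problem with a cookie file, a defensive way to read it might look like the sketch below. How `main.py` actually reads the file is not shown in this thread, so `load_cookie` is a hypothetical helper and the UTF-8 assumption is mine. It drops a byte-order mark if the editor added one, strips a leading `Cookie:` prefix, and fails early with a readable message if the value cannot be sent as a latin-1 header.

```python
from pathlib import Path


def load_cookie(path: str) -> str:
    """Read a cookie string from a text file and validate it for use as a header."""
    # Assumption: the file is UTF-8; "utf-8-sig" also drops a leading BOM if present.
    text = Path(path).read_text(encoding="utf-8-sig").strip()

    # Tolerate a pasted "Cookie:" prefix.
    if text.lower().startswith("cookie:"):
        text = text[len("cookie:"):].strip()

    # http.client encodes header values as latin-1, so check that up front.
    try:
        text.encode("latin-1")
    except UnicodeEncodeError as exc:
        bad = text[exc.start]
        raise ValueError(
            f"cookie contains {bad!r} at position {exc.start}; "
            "copy the cookie again without truncation"
        ) from exc
    return text
```

If the check raises, copying the full header value again (not a shortened view that ends in an ellipsis) is the likely fix.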