UnicodeEncodeError: 'latin-1' codec can't encode character ...
Closed this issue · 12 comments
hrmzone commented
Environment: Windows 11 and Ubuntu 20
Python: 3
The error is as follows:
Download HTML...Traceback (most recent call last):
  File "main.py", line 90, in <module>
    page = urllib.request.urlopen(request)
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/usr/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 1397, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/lib/python3.8/urllib/request.py", line 1354, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/usr/lib/python3.8/http/client.py", line 1256, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1297, in _send_request
    self.putheader(hdr, value)
  File "/usr/lib/python3.8/http/client.py", line 1229, in putheader
    values[i] = one_value.encode('latin-1')
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2026' in position 512: ordinal not in range(256)
BoyInTheSun commented
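After looking into it, the request header appears to contain characters outside Latin-1 (Chinese characters or similar), which causes the encoding error. That should not happen. Please check whether the cookies.txt you provided contains any Chinese characters.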
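To illustrate the failure, here is a minimal sketch that mirrors the `one_value.encode('latin-1')` call at the bottom of the traceback. The cookie values and the helper name `can_send_as_header` are made up for the example; they are not taken from main.py.

```python
# http.client encodes header values as Latin-1 (see putheader in the traceback),
# so any character above U+00FF in the Cookie header raises UnicodeEncodeError.
cookie_ok = "BAIDUID=ABC123:FG=1"   # hypothetical ASCII-only cookie
cookie_bad = "BAIDUID=ABC\u2026"    # hypothetical value cut off with an ellipsis


def can_send_as_header(value: str) -> bool:
    """Return True if the value survives the Latin-1 encoding used for headers."""
    try:
        value.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(can_send_as_header(cookie_ok))
print(can_send_as_header(cookie_bad))
```

Running the same check on the contents of cookies.txt before making the request should show whether the file is the source of the problem.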
BoyInTheSun commented
In particular, check whether the cookies picked up an ellipsis when you copied them.
hrmzone commented
Hello. When copying the cookie, should I copy the entire value of the Cookie header? I copied all of it and only removed the leading `cookie:`.
BoyInTheSun commented
Yes, that is the right way to do it. '\u2026' is the Unicode character … (horizontal ellipsis), which should not appear in cookies. Please check for it.
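One way to find the offending character is to scan the cookie string for code points above U+00FF. This is only a sketch: `find_non_latin1` and the sample cookie are hypothetical. The index it reports should correspond to the "position" given in the UnicodeEncodeError message, since both count characters within the header value.

```python
import unicodedata


def find_non_latin1(text: str):
    """Return (index, character, Unicode name) for each character Latin-1 cannot encode."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 0xFF
    ]


# Hypothetical cookie string with an ellipsis where a value was cut off.
sample = "BAIDUID=ABC\u2026; PSTM=1641800768"
for index, ch, name in find_non_latin1(sample):
    print(index, repr(ch), name)
```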
hrmzone commented
OK, thanks for the pointers. I'll try again. Much appreciated!
hrmzone commented
Could you take another look for me? I tried again and it still fails.
url:https://wenku.baidu.com/view/0ff393d025fff705cc1755270722192e45365883
Cookie: viewedPg=63b8ce1f85868762caaedd3383c4bb4cf6ecb708%3D3%7C0%26e37de277afaad1f34693daef5ef7ba0d4b736d16%3D3%7C0%26f8e6d67b0875f46527d3240c844769eae109a35a%3D2%7C0; wkview_gotodaily_tip=1; aplugin=uinfo5; kunlunFlag=1; BAIDUID=A56FB6C9E28B28C64B7690D0315CACD4:FG=1; BIDUPSID=D14A6425ECB0B2208664B24751743CF2; PSTM=1641800768; Hm_lvt_d8bfb560f8d03bbefc9bdecafc4a4bf6=1641954725,1641954997,1641955005,1641956128; _click_param_reader_query_ab=-1; layer_show_times_total_1_6807f15f9de6a7fa67e953896b2985a6=3; layer_show_times_total_8_6807f15f9de6a7fa67e953896b2985a6=2; _click_param_pc_rec_doc_2017_testid=5; BDUSS=2NtVmtRWUxqNDRVT3JNa2tIVTRGb2I1dH5tY1d3YzBWUWJ3LXlsbzBtTlVNVHhpRVFBQUFBJCQAAAAAAQAAAAEAAADML9gQyq-80tevtaXV0M34AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAFSkFGJUpBRiT; layer_show_times_total_1_eccb7dec0932e76c3e4c43232ec07595=3; layer_show_times_total_8_eccb7dec0932e76c3e4c43232ec07595=2; layer_show_times_total_1_eb2f2a481276882210a89546db085583=1; layer_show_times_total_8_eb2f2a481276882210a89546db085583=1; layer_show_times_total_5_eb2f2a481276882210a89546db085583=1; layer_show_times_total_1_fa0c98f85484b748dffce484c700633a=3; layer_show_times_total_8_fa0c98f85484b748dffce484c700633a=2; layer_show_times_total_5_fa0c98f85484b748dffce484c700633a=4; RT="z=1&dm=baidu.com&si=smhk1c51u&ss=l28qe4j0&sl=0&tt=0&bcn=https%3A%2F%2Ffclog.baidu.com%2Flog%2Fweirwood%3Ftype%3Dperf&ul=bb00&hd=bb55"
BoyInTheSun commented
In that case, does the download work if you don't pass any cookies?
BoyInTheSun commented
On my side, the download works with your cookie and no error is raised.
hrmzone commented
Without a cookie, only the first three pages can be downloaded!
hrmzone commented
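The problem is solved.
It was probably an encoding issue: the cookie could not be read correctly from the cookie.txt file, but passing it with the -c option works fine. Thank you for your patient help!
Wishing you success at work and good health!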
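For anyone who still wants to use the file, here is a rough sketch of reading it defensively. The thread does not show how main.py reads cookie.txt, so `load_cookie` is a hypothetical helper and the explicit `utf-8-sig` encoding is an assumption about what might help. It drops a byte-order mark if an editor added one, tolerates a pasted `Cookie:` prefix, and fails early with the positions of any characters Latin-1 cannot encode.

```python
import os
import tempfile


def load_cookie(path: str) -> str:
    """Read a cookie string from a text file and check it can be sent as a header."""
    # utf-8-sig strips a leading byte-order mark if one is present.
    with open(path, encoding="utf-8-sig") as f:
        cookie = f.read().strip()
    # Tolerate a pasted "Cookie:" prefix.
    if cookie.lower().startswith("cookie:"):
        cookie = cookie[len("cookie:"):].strip()
    # Header values are encoded as Latin-1, so reject anything above U+00FF.
    bad = [(i, ch) for i, ch in enumerate(cookie) if ord(ch) > 0xFF]
    if bad:
        raise ValueError(f"cookie contains characters Latin-1 cannot encode: {bad}")
    return cookie


# Demo with a temporary file standing in for cookie.txt.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8-sig", suffix=".txt", delete=False
) as tmp:
    tmp.write("Cookie: kunlunFlag=1; aplugin=uinfo5\n")
demo_path = tmp.name
demo_cookie = load_cookie(demo_path)
os.remove(demo_path)
print(demo_cookie)
```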
BoyInTheSun commented
:)