Fetching your own Weibo posts
Hylan129 opened this issue · 9 comments
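When I crawl my own Weibo history, the original program never retrieves anything, yet it reports no error; crawling other users' posts works fine.
After some analysis I found that changing the Weibo URL fixes it: append "/profile" to the URL in the following two places.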
1:
def get_weibo_info(self):
    """Get Weibo info."""
    try:
        url = 'https://weibo.cn/%s/profile' % (self.user_config['user_uri'])
2:
def get_one_page(self, page):
    """Get all Weibo posts on page `page`."""
    try:
        url = 'https://weibo.cn/%s/profile?page=%d' % (
            self.user_config['user_uri'], page)
P.S. After switching to the new URL, crawling other users' posts still works.
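Thanks for the feedback.
A very good suggestion, but it conflicts with some current functionality. Currently user_id can be either a real user ID or a custom domain; for example, Hu Ge's Weibo page is https://weibo.cn/hu_ge, where "hu_ge" is the custom domain. With "/profile" appended, the information is fetched correctly when user_id is a real ID, but fetching fails when it is a custom domain. Since many Weibo accounts use a custom domain, the program will stay unchanged for now so that it remains more broadly usable.
Thanks again. If you find other problems, please keep reporting them :smile: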
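One way to reconcile the two comments above is to append "/profile" only when the identifier looks like a real numeric ID. This is a minimal sketch, not the project's code: `build_profile_url` is a hypothetical helper, and it assumes that real user IDs consist only of digits while custom domains such as "hu_ge" do not.

```python
def build_profile_url(user_uri: str, page: int = 1) -> str:
    """Build a weibo.cn page URL for a user (hypothetical helper)."""
    base = 'https://weibo.cn/%s' % user_uri
    # Assumption: a purely numeric identifier is a real user ID, for which
    # '/profile' was reported to work; custom domains keep the plain URL
    # because '/profile' was reported to fail for them.
    if user_uri.isdigit():
        base += '/profile'
    return '%s?page=%d' % (base, page)
```

For example, `build_profile_url('hu_ge')` leaves the custom-domain URL untouched, while a numeric ID gets the "/profile" suffix.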
Here is a fix for pictures that cannot be fetched, for reference:
#first_pic = 'https://weibo.cn/mblog/pic/' + weibo_id + '?rl=0'
first_pic = 'https://weibo.cn/mblog/pic/' + weibo_id + '?rl=1'
@purplepalmdash Thanks, solved!
Where in the code are the two URLs mentioned above? I can see the get_weibo_info function in spider.py, but not where url is assigned.
@Yuuoniy Hi, URL construction currently lives in the parser module:
https://github.com/dataabc/weiboSpider/tree/master/weibo_spider/parser
Each parser corresponds to one class of related URLs.
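The program has been updated and the URLs are referenced differently now. I ran into this problem too: I added profile to the URL-related addresses in index_parser, info_parser, and page_parser under the parser directory, but my own Weibo still cannot be parsed. Judging by the errors, the XPath matches no data. Some of my posts are visible only to me, but after inspecting the page structure I found that the div of a private post is no different from that of any other post, so I don't know at which step it goes wrong.
The error output is as follows: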
```
list index out of range
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/weibo_spider/parser/info_parser.py", line 39, in extract_user_info
    if self.selector.xpath(
IndexError: list index out of range
'NoneType' object has no attribute 'id'
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/weibo_spider/parser/index_parser.py", line 36, in get_user
    self.user.id = user_id
AttributeError: 'NoneType' object has no attribute 'id'
None
'NoneType' object has no attribute 'nickname'
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/weibo_spider/spider.py", line 188, in _get_filepath
    self.user.nickname)
AttributeError: 'NoneType' object has no attribute 'nickname'
expected str, bytes or os.PathLike object, not NoneType
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/weibo_spider/writer/csv_writer.py", line 25, in __init__
    with open(self.file_path, 'a', encoding='utf-8-sig',
TypeError: expected str, bytes or os.PathLike object, not NoneType
'NoneType' object has no attribute 'nickname'
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/weibo_spider/spider.py", line 188, in _get_filepath
    self.user.nickname)
AttributeError: 'NoneType' object has no attribute 'nickname'
'NoneType' object has no attribute 'nickname'
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/weibo_spider/spider.py", line 188, in _get_filepath
    self.user.nickname)
AttributeError: 'NoneType' object has no attribute 'nickname'
'NoneType' object has no attribute 'nickname'
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/weibo_spider/spider.py", line 188, in _get_filepath
    self.user.nickname)
AttributeError: 'NoneType' object has no attribute 'nickname'
'NoneType' object has no attribute '__dict__'
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/weibo_spider/spider.py", line 269, in start
    self.write_user(self.user)
  File "/usr/local/lib/python3.9/site-packages/weibo_spider/spider.py", line 114, in write_user
    writer.write_user(user)
  File "/usr/local/lib/python3.9/site-packages/weibo_spider/writer/txt_writer.py", line 29, in write_user
    [v + ':' + str(self.user.__dict__[k]) for k, v in self.user_desc])
  File "/usr/local/lib/python3.9/site-packages/weibo_spider/writer/txt_writer.py", line 29, in <listcomp>
    [v + ':' + str(self.user.__dict__[k]) for k, v in self.user_desc])
AttributeError: 'NoneType' object has no attribute '__dict__'
```
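In the output above, the first failure (the IndexError in extract_user_info) appears to leave the user object as None, and every later step then fails on it. A fail-fast check would surface the root cause once instead of as a chain of AttributeErrors. The sketch below is illustrative only: `User` and `require_user` are hypothetical stand-ins, not part of weibo_spider.

```python
from typing import Optional


class User:
    """Stand-in for the spider's user object (hypothetical, for illustration)."""

    def __init__(self, id: str, nickname: str):
        self.id = id
        self.nickname = nickname


def require_user(user: Optional[User]) -> User:
    """Return the user, or stop immediately if parsing produced nothing."""
    if user is None:
        # Raise one clear error instead of letting later code touch None.
        raise RuntimeError(
            'User info could not be parsed; the page may be empty, '
            'restricted, or not in the expected layout.')
    return user
```

Calling such a check right after the user info is parsed would stop the run before any file path or writer is built from a missing nickname.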
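@scriptway
This happens because you were crawling too fast and got temporarily rate-limited. Slow the crawl down by making the change described in question 2 of the FAQ.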
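Slowing down generally means pausing for a random interval between page requests. The following is a rough sketch of that idea, not the change described in the FAQ: `throttled_pages` is a hypothetical helper, and the default wait bounds are arbitrary assumptions rather than the project's settings.

```python
import random
import time


def throttled_pages(page_count, min_wait=6.0, max_wait=10.0, sleep=time.sleep):
    """Yield page numbers 1..page_count, pausing randomly between pages.

    The wait bounds are illustrative assumptions; `sleep` is injectable so
    the pacing can be checked without actually waiting.
    """
    for page in range(1, page_count + 1):
        yield page
        if page < page_count:
            # Random delay so that requests are not sent at a fixed rhythm.
            sleep(random.uniform(min_wait, max_wait))
```

A crawl loop could then iterate `for page in throttled_pages(total_pages):` and fetch one page per iteration, with the delay applied between iterations.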