Issues
How can I control the crawler's delay or pause it?
#1129 opened by Mr-LiuDC - 2
How can I implement a database-backed Scheduler?
#1128 opened by Mr-LiuDC - 2
Request: change doCycleRetry to protected access
#1145 opened by hackeryutu - 3
Could the author add support for dynamic retries?
#1141 opened by sparrow-ez - 0
Click "next page" in a loop and set a loop termination condition
#1144 opened by adminjohn - 1
RedisScheduler reports that the rpush method cannot be found; the default Scheduler works fine.
#1143 opened by hackeryutu - 1
How to run multiple spiders sequentially
#1118 opened by getideas - 2
At startup, custom parameters placed after the Request are not received
#1133 opened by yangjinde - 2
When will Playwright be supported?
#1140 opened by 694475668 - 1
The images in the documentation are all broken; please fix them, thanks!
#1132 opened by szRyu666 - 0
How to keep cookies from expiring?
#1130 opened by getideas - 0
HttpClientDownloader download error
#1124 opened by mayn7z - 1
java.lang.OutOfMemoryError: Java heap space
#1123 opened by meikey - 0
There's a code injection vulnerability in `us.codecraft.webmagic.downloader.PhantomJSDownloader`
#1122 opened by LetianYuan - 0
When using a proxy, if the connection to the proxy times out, the Listener ends up counting the request as a success instead of a failure!
#1120 opened by konglquan - 7
HttpClientDownloader should throw errors instead of catching them
#1094 opened by keatonLiu - 1
The domain webmagic.io has expired
#1116 opened by x6770 - 1
XPath tag matching with an "or" expression returns a result set that is not in order
#1115 opened by wanygan83 - 1
Crawling Twitter runs into a "JavaScript is not available" problem.
#1113 opened by wayss000 - 1
RedisScheduler does not work in webmagic 0.75 and 0.8.0
#1111 opened by Bertluo - 0
Want to get the final URL reached after the last redirect
#1110 opened by keatonLiu - 1
No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
#1109 opened by ChenSino - 1
Can Playwright be supported
#1106 opened by holmofy - 1
How to perform a click action inside a PageProcessor's process method?
#1102 opened by Mr-LiuDC - 0
Request: support the jQuery traversal API
#1105 opened by w3l7 - 5
The number of links to crawl is correct, but the number of results after the crawl finishes does not match the number of links
#1104 opened by w3l7 - 6
After using setCharSet(), the page encoding can no longer be detected automatically, resulting in garbled pages
#1101 opened by keatonLiu - 0
Is it possible to define a different request method, parameters, etc. for each URL, as in Scrapy?
#1103 opened by keatonLiu - 4
javax.net.ssl.SSLHandshakeException: Received fatal alert: protocol_version
#1097 opened by keatonLiu - 0
Suggestion: the addTargetRequests method should support any Iterable<String>
#1099 opened by keatonLiu - 0
Integrate URLFrontier as a backend for URL storage
#1098 opened by jnioche - 3
In some cases the crawler inexplicably hangs, but its status is still Running
#1096 opened by keatonLiu - 1
When a Processor uses page.putField() to save an array of objects and the data volume is large, the data never reaches the Pipeline
#1093 opened by Golne - 0
What is the purpose of CountableThreadPool?
#1092 opened by MaLuxray - 3
Modifying the WebDriverPool source to specify ChromeOptions produces org.openqa.selenium.chrome.ChromeOptions.addArguments([Ljava/lang/String;)Lorg/openqa/selenium/chrome/ChromeOptions;
#1091 opened by 694475668 - 1
The core package conflicts with Spring Boot
#1090 opened by MaLuxray - 1
Is there a difference between setting a proxy on the downloader and setting a proxy on the site?
#1081 opened by nesteiner - 0
Is there a difference between setting a proxy on the downloader and setting a proxy on the site?
#1080 opened by nesteiner - 2
Crawling through a proxy IP returns an HTTP 432 error; the problem has been fixed, please update the source code
#1079 opened by song51930 - 1
Can pages that failed to download be put back into the task queue? Why does the number of error pages counted by errorPage never change?
#1078 opened by patience00 - 5
Failed to load class "org.slf4j.impl.StaticLoggerBinder".
#1074 opened by songsh - 4
Cannot compile project
#1077 opened by alexandrujecan - 0
How to do dynamic matching with a POJO
#1076 opened by cyb-start - 1
How to simulate a click event and capture the resulting network traffic
#1073 opened by worldstearly - 1
Cannot crawl the website; a timeout occurs
#1070 opened by ryan701212