CheerioCrawler not working: "allow running single crawler instance multiple times"
distributev opened this issue · 0 comments
### Which package is this bug report for? If unsure which one to select, leave blank
@crawlee/cheerio (CheerioCrawler)
### Issue description
I believe running a single crawler instance multiple times is expected to work (see "allow running single crawler instance multiple times", #1844), but it does not.
If I call `run()` in a loop, the first iteration works fine, but every subsequent iteration processes no requests and only logs the following:
2024-08-26T22:00:05.502Z INFO CheerioCrawler: Starting the crawler.
2024-08-26T22:00:05.576Z INFO CheerioCrawler: All requests from the queue have been processed, the crawler will shut down.
2024-08-26T22:00:05.783Z INFO CheerioCrawler: Final request statistics: {"requestsFinished":0,"requestsFailed":0,"retryHistogram":[],"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":0,"requestTotalDurationMillis":0,"requestsTotal":0,"crawlerRuntimeMillis":451}
### Code sample
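// `crawler` is a single CheerioCrawler instance reused across iterations;
// `urls` is the list of start URLs passed to every run.
for (let i = 0; i < 100; i++) {
    console.time(`RUN (${i}) crawler.run`);
    await crawler.run(urls);
    // Pause for one second between runs.
    await new Promise((resolve) => setTimeout(resolve, 1000));
    console.timeLog(`RUN (${i}) crawler.run`);
}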
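
A possible workaround sketch, under an assumption this report does not confirm: that the later runs are empty because the request queue still treats the same URLs as already handled. The helper `requestsForRun` and the `run-N:` key format are hypothetical and not part of Crawlee. Only the `url` and `uniqueKey` fields of a request object are existing Crawlee options.

```javascript
// Hypothetical helper (not part of Crawlee): builds request objects whose
// uniqueKey differs on every loop iteration, so a queue that deduplicates
// by uniqueKey would see them as new requests on each run.
function requestsForRun(urls, runIndex) {
    return urls.map((url) => ({
        url,
        // `uniqueKey` is the field Crawlee uses to deduplicate requests;
        // the `run-N:` prefix is an arbitrary choice for this sketch.
        uniqueKey: `run-${runIndex}:${url}`,
    }));
}

// Intended usage inside the loop above (assumes `crawler` and `urls` exist):
//   await crawler.run(requestsForRun(urls, i));
console.log(requestsForRun(['https://example.com/'], 1));
```

If the runs are still empty with distinct keys per iteration, deduplication is probably not the cause, which would be worth adding to this report.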
### Package version
3.11.1
### Node.js version
20
### Operating system
_No response_
### Apify platform
- [X] Tick me if you encountered this issue on the Apify platform
### I have tested this on the `next` release
_No response_
### Other context
_No response_