yujiosaka/headless-chrome-crawler

Shuffle queue

davidebaldini opened this issue · 3 comments

Is it possible to shuffle / randomize the order of URLs pending in queue?
This is not discussed in the API documentation.

I don't think there's a way to do it directly from the crawler. Perhaps you could do it through the Redis cache, and you could certainly do it by assigning each URL a random crawl priority.
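A minimal sketch of the random-priority idea, in case it helps. The `shuffle` and `withRandomPriority` helpers are hypothetical names of my own, not part of the library. The commented usage assumes `crawler.queue()` accepts an array of URLs or of option objects with a `priority` field, as I understand the API docs to describe; please check that against the version you are running.

```javascript
// Fisher-Yates shuffle: returns a new array and leaves the input untouched.
// `rng` is injectable so the order can be made reproducible in tests.
function shuffle(items, rng = Math.random) {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1)); // random index in [0, i]
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Hypothetical helper: pair each URL with a random integer priority
// in [0, maxPriority), so a priority queue pops them in random order.
function withRandomPriority(urls, maxPriority = 1000, rng = Math.random) {
  return urls.map((url) => ({
    url,
    priority: Math.floor(rng() * maxPriority),
  }));
}

// Usage sketch (assumes an already launched crawler instance named `crawler`):
//   await crawler.queue(shuffle(urls));
//   await crawler.queue(withRandomPriority(urls));
```

Note that this only randomizes the URLs you queue yourself. Links the crawler follows on its own would still be queued in whatever order it discovers them.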

@davidebaldini Did you consider running several browser instances in parallel? Each one could pick up a target URL and crawl through its subsequent links.

Closing due to inactivity.