BruceDone/awesome-crawler

Add Mercator URL Frontier implementation

moredure opened this issue · 5 comments

Is this a framework? @moredure

Hello @BruceDone, I have some ideas for wrapping it into a framework, but for now it is just a reference implementation of the Mercator crawler's URL frontier in Go. It enables smart, polite crawling (the window between requests to the same host is calculated from the duration of the previous request to that host, or from static values, etc.), and it allows crawling without memory limits on the URL queues, for both seed URLs and URLs collected during crawling.

Does it need to be a complete framework to pass your moderation, @BruceDone, or would a useful reference be enough?

Thanks for your contribution, but this repo only collects crawler frameworks.

Understood, thanks for your reply.