Regarding feat: implement unthrottled concurrency using task queue
wumpus opened this issue · 8 comments
Can you stop attacking the Common Crawl CDX API?
I’m not? This is an open source tool to find archived URLs for a given domain…
Yes, and because it isn't throttled, use of this package harms the target, which is me.
Any progress? I was hoping for rate limiting, honoring 503 and 429 status codes, and exponential backoff.
And not just "unthrottled concurrency".
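To make that request concrete, here is a minimal sketch in Go of a throttled fetch that honors 429 and 503 and backs off exponentially. It is not taken from this project's code: `fetchThrottled`, `backoffDelay`, and the delay and retry constants are all hypothetical names and values chosen for illustration, and only a numeric `Retry-After` header is handled.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"strconv"
	"time"
)

// All three values are assumptions for illustration, not project settings.
const (
	baseDelay  = 1 * time.Second  // wait before the first retry
	maxDelay   = 60 * time.Second // cap on the computed backoff
	maxRetries = 5                // retries after the initial request
)

// backoffDelay returns the wait before retry number attempt (0-based).
// A numeric Retry-After value (seconds) from the server takes precedence;
// otherwise the delay doubles per attempt, capped at maxDelay.
func backoffDelay(attempt int, retryAfter string) time.Duration {
	if secs, err := strconv.Atoi(retryAfter); err == nil && secs > 0 {
		return time.Duration(secs) * time.Second
	}
	d := baseDelay
	for i := 0; i < attempt && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	return d
}

// fetchThrottled sends a GET only after receiving a tick from limiter,
// and retries 429 and 503 responses after waiting backoffDelay.
func fetchThrottled(ctx context.Context, client *http.Client, limiter <-chan time.Time, url string) ([]byte, error) {
	for attempt := 0; attempt <= maxRetries; attempt++ {
		// Rate limit: block until the shared limiter allows another request.
		select {
		case <-limiter:
		case <-ctx.Done():
			return nil, ctx.Err()
		}
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			return nil, err
		}
		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode == http.StatusTooManyRequests ||
			resp.StatusCode == http.StatusServiceUnavailable {
			wait := backoffDelay(attempt, resp.Header.Get("Retry-After"))
			resp.Body.Close()
			select {
			case <-time.After(wait):
				continue // retry the request in the next loop iteration
			case <-ctx.Done():
				return nil, ctx.Err()
			}
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("unexpected status %s", resp.Status)
		}
		return body, nil
	}
	return nil, fmt.Errorf("gave up after %d retries", maxRetries)
}
```

The important part is that every worker shares one limiter channel, for example `time.NewTicker(time.Second).C`, so adding workers does not raise the total request rate against the server.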
It’s open source, so PRs are welcome.
It is going to be a busy month with some life changes for me, but I will put this on my TODO list. Unfortunately, it will likely not get done until late June or early July.
Accidentally closed when commenting
Thanks for adding to your TODO list, I appreciate it!
Here's an example of making a single query in Athena that's much more efficient than gau: https://positive.security/blog/ransack-data-exfiltration#common-crawl
Thanks for the reference, and sorry about being slow to implement this. Getting hitched!
Congratulations!