[BUG]
dirtybull opened this issue · 1 comments
Description
Hi, thanks for the amazing tool. When I was playing around using 'commoncrawl' as the source, I found it sometimes prints URLs but sometimes doesn't. So I modified the code to print the errors alongside, and found that Common Crawl was returning tons of 503s and timeouts.
Steps To Reproduce
./sigurlfind3r -d tesla.com --include-subs -uS commoncrawl
Additional context
I tested over my home network as well as a VPS located in Los Angeles.
I changed the code a bit to avoid parallel processing and increased the timeout from 10s to 60s. Then it worked as expected.
I haven't found any officially documented rate limit for Common Crawl, but reliability is the top concern in my opinion, especially for automation. Just for your reference, and thanks again for your work :)
Thanks @dirtybull, I will look into it