postmodern/spidr

99% of cpu usage while crawling bigger websites...

nu7hatch opened this issue · 1 comment

Hello, when I'm crawling fairly large websites (with :depth => 5), spidr eats all of my server's resources o_O. I'm running it in many threads (within EventMachine), so I think something may be wrong with thread safety. Any ideas?

Spidr is currently not thread-safe, so that is probably the issue. You could run multiple Spidr::Agents per thread?
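One reading of that suggestion is that each thread builds and uses its own agent, so no crawler state is shared between threads. A minimal sketch of the pattern follows. `StubAgent` and `crawl_sites` are hypothetical stand-ins written for illustration, not part of Spidr; in real use the line that builds the stub is where a Spidr agent would be constructed instead.

```ruby
# Hypothetical stand-in for a crawler object (NOT Spidr's API).
class StubAgent
  attr_reader :visited

  def initialize
    @visited = [] # per-instance state, never shared across threads
  end

  def crawl(url)
    @visited << url
  end
end

# Hypothetical helper: one thread per site, each with its own agent.
def crawl_sites(sites)
  threads = sites.map do |site|
    Thread.new do
      agent = StubAgent.new # built inside the thread, so nothing is shared
      agent.crawl(site)
      agent.visited         # becomes the thread's return value
    end
  end
  threads.map(&:value)      # wait for each thread and collect its result
end
```

Because every agent lives and dies inside a single thread, no locking is needed around its internal state.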

In the future I want to add a thread pool and queue up multiple HTTP requests.
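As a rough illustration of what a thread pool with a request queue could look like, here is a sketch using only Ruby's standard `Queue`, `Thread` and `Mutex`. It is not Spidr's design. `fetch_all` and its `workers:` parameter are hypothetical names, and the block passed in stands in for whatever performs the actual HTTP request.

```ruby
require "thread" # no-op on modern Rubies; Queue#close needs Ruby 2.3+

# Hypothetical helper: run the given block over `urls` using a fixed
# number of worker threads that pull from a shared queue.
def fetch_all(urls, workers: 4, &fetcher)
  queue = Queue.new
  urls.each { |url| queue << url }
  queue.close # no more work is coming; pop returns nil once drained

  results = {}
  lock = Mutex.new

  pool = Array.new(workers) do
    Thread.new do
      while (url = queue.pop)
        body = fetcher.call(url)
        lock.synchronize { results[url] = body } # guard the shared hash
      end
    end
  end

  pool.each(&:join)
  results
end
```

A call such as `fetch_all(urls, workers: 2) { |url| ... }` returns a hash mapping each URL to the block's result. The fixed worker count caps how many requests run at once, which is the main lever for keeping resource usage bounded.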