MikeMeliz/TorCrawl.py

Implement IP Rotation.

the-siegfried opened this issue · 0 comments

Is your feature request related to a problem? Please describe.
Currently the application, which serves as a crawler/extractor, only supports connecting to the clearnet or the Tor network through a SOCKS5 proxy on localhost, which yields a single fixed address for all communication. To improve anonymity and the service the project provides, the ability to rotate IP addresses ought to be supported.

Describe the solution you'd like

  • Refactor the connect_tor() method to support Privoxy and proxy rotation.
  • Implement proxy rotation support for clearnet crawling.
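
To make the request concrete, here is a minimal sketch of what round-robin proxy rotation could look like. It is not taken from the TorCrawl code base: the ProxyRotator class, its rotate_every parameter, and the next_proxies() method are hypothetical names, and the proxy URLs are placeholders (9050 and 8118 are the default ports of a local Tor SOCKS listener and of Privoxy, respectively).

```python
import itertools


class ProxyRotator:
    """Hypothetical helper: cycle through proxy URLs in round-robin order,
    moving to the next proxy after a fixed number of requests."""

    def __init__(self, proxy_urls, rotate_every=1):
        if not proxy_urls:
            raise ValueError("at least one proxy URL is required")
        if rotate_every < 1:
            raise ValueError("rotate_every must be >= 1")
        self._cycle = itertools.cycle(list(proxy_urls))
        self._rotate_every = rotate_every
        self._served = 0
        self._current = next(self._cycle)

    def next_proxies(self):
        """Return a proxies mapping (scheme -> proxy URL) for the next request."""
        # Advance to the next proxy once the current one has served
        # rotate_every requests.
        if self._served and self._served % self._rotate_every == 0:
            self._current = next(self._cycle)
        self._served += 1
        return {"http": self._current, "https": self._current}


# Placeholder endpoints, assumed for illustration only.
rotator = ProxyRotator(
    ["socks5h://127.0.0.1:9050", "http://127.0.0.1:8118"],
    rotate_every=2,
)
```

The returned mapping has the shape accepted by the proxies argument of the requests library, for example requests.get(url, proxies=rotator.next_proxies()); SOCKS schemes there require the optional requests[socks] extra. For Tor specifically, rotating the exit address would additionally mean asking Tor for a new circuit (the NEWNYM signal on the control port), which this sketch does not cover.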

Additional context
Modern web applications also tend to be protected by Web Application Firewalls (WAFs) and other technologies that can detect crawlers and bots and delay or block access to the site. By rotating IPs we consciously evade these detection and mitigation controls so as not to disrupt the application's core service.