A tiny web spider that starts crawling at a given website and keeps going for as long as it finds links on those pages that lead to similar spam pages.
This spider targets the 'Untitled' spam pages that show up in Google search results.
I wrote several articles about those spam pages, in which I discuss the background of this spam network.
I crawled 105,009 Google 'Untitled' spam pages in 7 days, along with 700,504 other linked spam pages.
— David Wolf
david.wolf.gdn
from google_spam_spider import GoogleSpamSpider

spider = GoogleSpamSpider(
    url='http://zone-casino.fr/2hephe/torch-functional-unfold.html',  # The url to start crawling
    direct_spam_logs='direct_spam.log',  # The file to log direct spam
    external_spam_logs='external_spam.log'  # The file to log external spam
)
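
For readers curious about the "crawl as long as there are links" idea described at the top, here is a minimal sketch using only the standard library. It is not the package's actual implementation. The names `LinkParser`, `crawl`, `fetch` and `max_pages` are hypothetical, and splitting pages into "same host as the start URL" versus "other hosts" is only one plausible reading of the direct/external distinction, which the text above does not define.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl that follows links until none are left.

    `fetch` is a hypothetical callable: url -> HTML string, or None on failure.
    Returns two lists of fetched URLs: (same_host, other_host).
    """
    start_host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}          # URLs already queued, to avoid loops
    same_host, other_host = [], []
    fetched = 0

    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        fetched += 1
        if html is None:
            continue

        # Assumption: pages on the start host go in one bucket, the rest in another.
        if urlparse(url).netloc == start_host:
            same_host.append(url)
        else:
            other_host.append(url)

        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links against the page URL
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append(link)

    return same_host, other_host
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client; for a dry run, a dict's `.get` method over a few in-memory pages works as a stand-in.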