This will go around in circles crawling the same pages over and over.
superlowburn opened this issue · 3 comments
superlowburn commented
^^^
lgraubner commented
Please provide an example or describe the problem further. The crawler should ignore already fetched pages.
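To illustrate what "ignoring already fetched pages" usually means in a crawler, here is a minimal sketch (not the actual sitemap-generator code, just the general idea): every discovered URL is normalized and checked against a set of already-seen URLs before it is queued again.

```ts
// Illustrative only -- not the actual sitemap-generator implementation.
// The idea: normalize each URL and keep a Set of everything already
// queued, so a page is never fetched twice.
class CrawlQueue {
  private seen = new Set<string>();
  private queue: string[] = [];

  // Strip fragments and trailing slashes so trivially different
  // spellings of the same page collapse to one entry.
  private normalize(url: string): string {
    const u = new URL(url);
    u.hash = "";
    return u.toString().replace(/\/$/, "");
  }

  // Returns true if the URL was new and got queued.
  enqueue(url: string): boolean {
    const key = this.normalize(url);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    this.queue.push(key);
    return true;
  }

  dequeue(): string | undefined {
    return this.queue.shift();
  }
}
```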
superlowburn commented
Hi,
./sitemap-generator -bq gbiz.org sitemap.xml --verbose
I repeatedly get the same URLs being crawled.
There are too many to cut and paste here.
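Since the full log is too long to paste, here is a rough script for tallying the repeats (a sketch only; it assumes the verbose output prints one URL per line, which may not match the actual log format):

```ts
// Rough helper for summarizing a saved verbose crawl log.
// Adjust the regex if the real log format differs.
import { readFileSync } from "fs";

const log = readFileSync(process.argv[2] ?? "crawl.log", "utf8");
const counts = new Map<string, number>();

for (const line of log.split("\n")) {
  const match = line.match(/https?:\/\/\S+/);
  if (!match) continue;
  counts.set(match[0], (counts.get(match[0]) ?? 0) + 1);
}

// Print only URLs that showed up more than once.
for (const [url, n] of counts) {
  if (n > 1) console.log(`${n}x ${url}`);
}
```

Running it against the saved verbose output prints each URL that appears more than once and how often.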
lgraubner commented
I started a crawl and it looks like it does what it should: each page is only added a single time. Could you paste an example URL that is added more than once?
Also, you don't need the -b flag if your entry point is the home page. It matches all pages anyway.