Ignore “ignore”
elmimmo opened this issue · 4 comments
elmimmo commented
Is there a way to override whatever criteria are being used to ignore URLs and have those URLs included in the sitemap too?
elmimmo commented
Some URLs are ignored if their source code has a line similar to this one:
<meta name="robots" content="noindex , nofollow" />
(which IMHO sitemap-generator-cli should ignore when run with the option --no-respect-robots-txt).
As a workaround, comment out line 102 in /usr/local/lib/node_modules/sitemap-generator-cli/index.js to include those URLs in the sitemap too, like so:
// /(<meta(?=[^>]+noindex).*?>)/.test(page) || // check if robots noindex is present
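For anyone wondering what that line does, here is a minimal sketch, assuming the regex quoted above is tested against each page's HTML. The names `shouldSkipPage` and `respectNoindex` are hypothetical, made up for illustration; they are not part of sitemap-generator-cli.

```javascript
// Regex quoted above: matches a <meta ...> tag whose attributes contain "noindex".
const NOINDEX_META = /(<meta(?=[^>]+noindex).*?>)/;

// Hypothetical helper (not the tool's real API): decides whether a page
// would be left out of the sitemap. Passing respectNoindex: false has the
// same effect as commenting out the check.
function shouldSkipPage(page, { respectNoindex = true } = {}) {
  return respectNoindex && NOINDEX_META.test(page);
}

const blocked = '<meta name="robots" content="noindex , nofollow" />';
const allowed = '<meta name="robots" content="index, follow" />';

console.log(shouldSkipPage(blocked));                            // page is skipped
console.log(shouldSkipPage(blocked, { respectNoindex: false })); // page is kept
console.log(shouldSkipPage(allowed));                            // page is kept
```

An opt-out like this would let the noindex check be disabled without editing the installed file by hand.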
rulatir commented
The workaround no longer works - something else is at line 102 now and "noindex" does not occur anywhere in the file.